Travel Bloggers: How to Stand Out with Powerful Interactive Maps
Regardless of what industry you’re in, a good interactive map is easy to understand, offers an intuitive user experience, and draws the user’s focus to the data on the map. One of the best examples of an interactive map is the RadarScope application, which we covered in detail last month in 6 Powerful Weather Applications for Stunning Landscape Photography. If you’re not familiar with it, RadarScope is an application that plots weather radar data, severe weather warnings, and much more on a map.
So how exactly does RadarScope do it so well? When I look at the maps in the screenshots above, I make a few key observations about what makes it such a powerful interactive map, even without having access to its interactivity.
Thankfully, there are plenty of tools and applications available today to bring a similar mapping experience to your travel blog. Best of all, many are free and open source, so you don’t have to spend your hard-earned cash on expensive licensing fees. However, before we look at solutions, let’s have a look at the problem.
In order to fully understand the problem so many travel bloggers run into, let’s first look at the definition of interactive.
Interactive: allowing a two-way flow of information between a computer and a computer-user; responding to a user’s input.
Oxford Dictionary
Armed with that definition, have a look at what you’ll find on far too many travel blogs. From the home page, you click on a link or button to the interactive map. You see a simple map that looks something like this screenshot. Some of the countries may be shaded to indicate that the blogger has traveled there.
If you hover over a country, it’ll often show the country’s name and maybe a count of the number of blog posts, photos, or videos the travel blogger has created. However, if you click on a country, you’ll just be brought to another page with a list of post titles. In the best case scenario, you’ll also see a featured image and the first 20-40 words of the post, much like our All Posts page.
However, this still leaves me asking one question. If you have a map whose only purpose is to redirect visitors off of said map, why even have the map at all?
First and foremost, if you’re going to add a feature to your blog or website, it should serve more of a purpose than just redirecting visitors off of it. Your visitors should be able to gather all of the information they need without leaving the map. There is one exception, however. If you’re trying to display lengthy content, such as an entire blog post, don’t try to put the entire post in a pop-up window on the map. Nobody in their right mind is going to scroll through all of that.
Instead, you want to include key details and a link to the full blog post to make a fully interactive map. For a blog post, you’ll want to include at least four things.
To best demonstrate a fully interactive map, let’s have a look at the Matt Gove Photo Visual Media Map. Instead of blog posts, the map includes datasets to display all of our photos and videos on a map. Notice how it includes all of these requirements for full interactivity. As a user, you can easily explore our photos and videos on a map without having to leave the map. In addition, you can click on the link in the pop-up window to view the full album or video.
As recently as 10 years ago, getting professional-quality interactive maps meant shelling out hundreds, if not thousands of dollars every year in GIS software licensing fees. Even worse, there were very few online GIS programs to choose from back then.
Thankfully, that has all changed. Today, there are plenty of free online mapping programs available. Many of these programs are both incredibly powerful and easy to set up. While it’s certainly not a requirement, I highly recommend investing in a developer to install, connect, and integrate the maps on your website. They’ll be able to connect your maps directly to your database. As a result, new content will automatically be added to your maps, allowing them to effortlessly scale with your business. It’ll cost more up front, but with a good developer, you’ll save money in the long run.
We’d love to help you get set up with your maps. If you want to discuss your project further or get a free quote to add maps to your travel blog, please get in touch with us today. Or if you’re still just kicking tires, please feel free to browse our catalog of GIS products and services.
If you’re just starting out or are looking for a simple solution, Google or Bing Maps are great options! Both platforms allow you to create high-quality maps and embed them on your website free of charge. And best of all, you don’t need a developer. Instead, you’ll just copy and paste a short block of code into your website or blog.
Unfortunately, the simplicity of both Google and Bing Maps leaves them with some downsides. Most notably, if you have a dynamic dataset or high traffic volumes, you’ll run into issues. Neither platform is built to display large datasets. As a result, you’ll struggle to scale your application and will run into high API fees to generate the basemaps as your organization grows.
Mapbox is a direct competitor of both Google and Bing Maps, but it offers so much more. As a result, I typically recommend that my clients use Mapbox over Google or Bing. Mapbox gives you much finer control over your mapping application, allowing you to scale it both up and down as your business evolves. And best of all, it’s completely free to use, unless you have more than 50,000 map loads per month.
However, my favorite Mapbox feature is its use of vector tiles, which allows you to display huge amounts (read: gigabytes) of data that load and respond extremely fast.
Leaflet is a simple, yet powerful open source JavaScript library that displays two-dimensional maps on your website or blog. Because it’s hosted on your server, you don’t need to worry about API fees, regardless of how much or how little traffic you get. It’s lightweight, fast, powerful, and completely customizable. Furthermore, it has an extensive library of plugins and extensions if you need additional functionality. If you don’t know JavaScript, you’ll need a developer for the initial set up of your maps. You can easily connect Leaflet to nearly any type of database or data repository. As a result, Leaflet maps will easily scale with your business once it’s set up.
Thankfully, Leaflet requires very little maintenance once you get it up and running. In fact, I tell my clients that it’s often more cost-effective to pay for Leaflet maintenance as you need it instead of paying a recurring monthly maintenance fee. Yes, there are obviously exceptions to that rule. However, for the vast majority of people, Leaflet is an extremely cost-effective way to add high-quality maps to your website or blog.
Like Leaflet, Cesium is an open source JavaScript library that creates powerful 3D maps to “unleash the power of 3D data”. Their maps are engineered to maximize performance, precision, and experience. With Cesium, it often feels like there is no limit when it comes to what you can do with a map. In fact, Cesium also includes a timeline, so you could make the argument that its maps are four dimensional instead of three.
Furthermore, they’ve even created their own 3D vector tile format that lets you load enormous datasets in seconds. For example, check out Cesium’s demo that loads 3D models of nearly every building in New York City. It’s fast, fluid, and responsive. For additional demos, have a look at Cesium’s Use Cases. You’ll find examples from many different industries, applications, and regions.
You can get an incredible amount of power and functionality out of Cesium’s free base functionality. For the average travel blog, the free functionality is probably more than you need. However, if you want to harness its full potential, you should at least look into some of the paid add-ons for Cesium. Those paid add-ons will streamline your workflow and optimize your data. As a result, your users will ultimately have a better experience.
If you’re trying to decide between Leaflet or Cesium, why not use both? Originally developed as an open source platform for the Australian Government, Terria lets your users choose whether they want to view a two- or three-dimensional map. And you can probably see where this is going. Leaflet powers Terria’s two-dimensional maps, while Cesium is behind its three-dimensional maps.
The best feature of Terria, however, is its user interface. Easily organize and browse through a huge number of datasets. It uses Cesium’s 3D data optimization to ensure your map remains fast and responsive, even if your datasets are massive. Use Terria’s slider to compare datasets side-by-side. It even includes a feature for you to build stories with your data and share them with your audience.
I use Terria for all of my mapping needs, and also recommend it to most of my clients. Its power and responsiveness across all devices, including mobile phones, coupled with its flexibility and minimal programming required to set it up make it the optimal platform for me. My users and clients have never complained about being confused using Terria, and are often impressed at how easy it is to analyze huge amounts of data. And best of all, I can set it up so it scales up and down as I need with hardly any maintenance.
If you want your travel blog to stand out from the rest, adding fully interactive maps with Terria is one of the easiest and most cost-effective ways to do so. To learn more or get started, please get in touch with us or browse our online resources.
If you have a complex dataset, but would prefer not to hire a developer, ESRI’s ArcGIS Online may be the best solution for you. Yes, it does have licensing fees, but you’ll get much of the functionality of the other applications we’ve discussed without needing a developer to set them up for you. Like the other platforms, ArcGIS Online can easily handle large numbers of complex datasets, and plot those data on maps that need just a copy and paste to embed in your website. Plus, ESRI is widely considered to be the industry standard for anything related to GIS and maps. If anything goes wrong for you, they have excellent documentation and support.
If you’re looking for a real-world example of ArcGIS Online in action, you’ve probably seen them already. Since the COVID-19 pandemic began, most dashboards that display maps of COVID-19 data use ArcGIS Online.
Platform | Free | API Fees | Dimensions | Developer | Dynamic Data |
---|---|---|---|---|---|
Google/Bing Maps | Yes | Optional | 2D Only | Not Required | No |
Mapbox | Yes | > 50K loads/mo | 2D Only | Optional | Yes |
Leaflet | Yes | Not Required | 2D Only | Required | Yes |
Cesium | Yes | Not Required | 3D Only | Required | Yes |
Terria | Yes | Not Required | 2D & 3D | Required | Yes |
ArcGIS Online | No | N/A | 2D Only | Not Required | Yes |
Once you have your new interactive map set up on your website or blog, simply adding a link to it is not enough. In addition, you should strategically embed them on different pages of your website to give your users the most immersive experience. For example, you’ll find the Matt Gove Photo Visual Media map embedded both on the home page of this blog and the main photo gallery page on the Matt Gove Photo website, in addition to being embedded above. I’ll continue to add maps as we go forward, too.
To figure out where to embed your interactive maps, have a look at your website’s analytics. Are there pages that have a lot of page views? Is there a specific page people are navigating to that the map could benefit? Are your visitors getting confused and leaving your website instead of navigating somewhere the map could help? Is there a logical place for your maps in your navigation or sales funnel?
Finally, as a travel blogger, you shouldn’t plot just your blog posts on the interactive map. Geotag photo albums, videos, social media posts, guides, fun activities, scenic drives, and much more. Don’t be afraid to make multiple maps, either. Try to use a platform like Terria or ArcGIS Online that organizes your datasets in a logical manner and makes it easy to both add and remove data from the map. If that’s not an option, don’t overwhelm your users with too much data on a single map. That’s one of the best ways to drive visitors off of your website and directly into the arms of your competitors.
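If you’re comfortable with a little scripting, you can even generate those geotagged layers yourself. Below is a minimal sketch (the posts, URLs, and coordinates are hypothetical placeholders, not content from this blog) that converts a list of geotagged blog posts into a GeoJSON file, a format that Leaflet, Mapbox, Terria, and most other mapping platforms can load directly.
import json

# Hypothetical list of geotagged content; swap in your own posts, photos, or videos.
posts = [
    {"title": "Chapman's Peak Drive", "url": "https://example.com/chapmans-peak", "lat": -34.106, "lon": 18.358},
    {"title": "Death Valley Road Trip", "url": "https://example.com/death-valley", "lat": 36.505, "lon": -117.079},
]

# Build a GeoJSON FeatureCollection with one point feature per post.
features = []
for post in posts:
    features.append({
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [post["lon"], post["lat"]]},
        "properties": {"title": post["title"], "url": post["url"]},
    })

geojson = {"type": "FeatureCollection", "features": features}

# Write the file that your mapping platform will load.
with open("blog-posts.geojson", "w") as ofile:
    json.dump(geojson, ofile)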
Fast, professional-quality interactive maps are one of the best ways travel bloggers can stand out from the crowd. Interactive maps are easy and cost-effective to implement and maintain. They’re also incredibly effective at retaining your visitors’ engagement and keeping them on your website. It boggles my mind why so many travel bloggers haven’t taken full advantage of the incredible potential interactive maps present to both grow their audience and keep their existing followers coming back for more.
Are you ready to take the next steps with interactive maps and bring your website or travel blog to the next level? As avid travelers and data science experts who specialize in online GIS and mapping applications, we’d love to help you take that next step in your journey. I invite you to please browse our catalog of GIS and mapping services. Then, get in touch or book a free info session with us to discuss your specific project. We can’t wait to hear from you.
Top Photo: Chapman’s Peak Drive on the Matt Gove Photo Scenic Drives Map
Cape Town, Western Cape, South Africa
Does Your Website Make These 10 Mistakes with Hero Images?
When used correctly, a hero image is a great way to make a positive first impression that instantly builds credibility and trust for your brand. Given the popularity of hero images, it’s no surprise that many businesses and organizations that use them often feel like they could be getting more from them. Unfortunately, when you’re dealing with graphics and images, all it takes is one minuscule misstep to send your audience running for the exits.
They say a picture tells a thousand words. That’s especially true with hero images. In fact, the less text that accompanies them, the better. However, keep in mind that reducing the amount of text shifts even more of the burden to your hero image. As a result, it puts even more pressure on you to ensure everything is perfect.
Your hero image should tell your story. Without even reading the text, your audience should have a pretty good idea of as many of the following as possible.
You can find some spectacularly terrible examples of web design from just a quick Google Image search. In this screenshot, can you figure out what this company does without reading any of the text?
Put aside the font and color choices for a sec. A grainy image of a couple puffy clouds tells us nothing about the company! At first glance, you’d have no idea the website was about horses unless you read the text. What makes it even worse is that after reading the first two lines of text, you still have no idea what they do. It’s not until you get to the third line that they reveal that they sell horses.
So how do they make it better? First and foremost, the hero image should have an image of one of their horses. Then add a little personality. If they’re selling show horses, put a picture of one of their horses at a show. Selling to a summer camp? How about a picture of a kid on a horse actively engaging with an instructor? Anything is better than the clouds.
I’ll be the first to admit, I have been guilty of this in the past. Without a call to action, your audience has reached the end of the road. And it’s often a dead end road. With no clear indication of where to go, a small fraction of your audience will poke around your navigation menu. A few more will turn around and back up. But the vast majority of visitors will simply hit the red “X” in the corner and leave. The lack of a clear call to action is the leading cause of prospects exiting your sales funnel.
We can actually use one of my own websites to demonstrate the effect of the lack of a call to action. The Matt Gove Photo site uses a large hero image on the home page with links to my most recent adventures. Because the site is focused on travel and outdoor adventure, the heading and subheading reference the specific adventure and the state or country in which it’s located. Underneath, you’ll find the call to action: links where you can view photos, blog posts, videos, and more. Now, how would you react if you landed on the site and those calls to action had suddenly disappeared?
When you look at that hero image, you really want to see the rest of the photos. But without a call to action, you have nowhere to begin. You’d have to go searching through the whole photo gallery to find them. I don’t know about you, but I’m far too lazy for that. I’d have a quick scroll through the home page and then probably leave.
Thankfully, that example is purely hypothetical. If you visit the site, rest assured that the calls to action are all still there.
Amazing how such a small detail can make such a big difference, isn’t it?
So what should you do if you can’t think of a good call to action? Or maybe there actually is no logical path to your next step? When in doubt, give these a try. Your goal here is to keep your audience engaged, not make a sale.
On the flip side, it’s easy to get caught up and include too many calls to action. In an ideal world, the clearest call to action you can make is to only have one. Having two is okay, especially if one is a “Learn More” link. However, three is pushing it in many circumstances, unless there is a clear and logical reason for it. On the Matt Gove Photo home page, that’s the case. The photos are clearly divided into three parts.
Under no circumstance should you have more than three calls to action associated with your hero image. Even with three, you risk overwhelming and confusing your audience with too many choices. Unfortunately, when you have too many choices, the one you make most often is simply to leave.
Let’s back up in time for a sec. We’ll go back to 2013. At the time, I had little experience when it came to web design and web development. Not surprisingly, I tried to cram way too much into the home page. To say it overwhelmed you with choices is an understatement. Not to mention I needed a few lessons in color theory.
Back in the present day, I cringe big time looking at that. You should too. But we must learn from our mistakes and experiences. My how things have changed since then.
As we discussed at the beginning, your hero image should do most of the heavy lifting for getting your message across. Keep your text to a bare minimum. It should consist of no more than:
There is one exception to this rule where it’s okay to write more than a couple of short sentences: coming soon or pre-launch pages. The reason why? You need to be able to describe both what is coming soon as well as the benefits your audience will get once it launches. If you can do it in only one sentence, more power to you. But for most of us, it takes a short paragraph.
If you’ve ever tried to overlay text over any photo of the outdoors, you’ve likely run into this issue. No matter which color you choose for the text, there’s part of the image where you can’t read it. Your first thought may be to make the text multiple colors, but that never looks professional.
Let me let you in on an industry secret. Okay, it’s not really a secret, as just looking at the “Videos Coming Soon” page in the screenshot above gives it away. If you put a semi-transparent overlay on top of the image, the overlay mutes the effect of the high contrast and lets you easily read the text without having to change colors or squint at it from a weird angle. You want to find the perfect balance where the text is easy to read, but you can still clearly see what the image is behind the overlay.
For comparison, here’s the same “Videos Coming Soon” page with the semi-transparent overlay removed. Quite a difference, isn’t it?
There are two ways you can go wrong when it comes to size. First, the resolution of your photo may be smaller than the resolution of your screen. For making a professional, trustworthy first impression, it’s a complete disaster if that happens. Not only does such a mistake make you look like an amateur, it also looks like you just don’t care.
Now, I intentionally shrunk the image in a development environment to generate that screenshot. However, if you are dealing with very high-resolution screens (larger than 4K), you may run into an issue where your large hero image starts to adversely affect your page performance. There are a couple ways around it, both of which I employ on the Matt Gove Photo home page.
One of them is to use the background-size: cover CSS property to ensure that both dimensions of the hero image remain greater than or equal to the dimensions of the screen. Be aware that this can make your hero image grainy if its resolution is not optimized for larger screens.
Second, your hero image may not be taking up enough real estate on the page. In that case, you can easily argue that it’s no longer a hero image, but that’s a discussion for another day. Your hero image should take up the entire screen regardless of its size, orientation, and resolution. If your viewer’s eye is not immediately drawn to it, you’re doing it wrong.
There is one scenario where it’s perfectly okay to shrink your hero image: to tease what’s below the fold (the content you have to scroll down to see). If you have a look the next time you buy something online, you’ll find many businesses and e-commerce sites apply this strategy. I use it on my business’ website, too.
Your hero image should be one you would frame to hang up in your home or office. People should be oohing and aahing over it. Don’t use crappy images. Ever. If your photography skills aren’t up to snuff, you should either license a photo or hire a professional photographer to take and/or process your photos for you.
When your audience logs onto your website or application, they expect it to load. Fast. If your site takes more than a few seconds to load, you can kiss your audience goodbye. They won’t wait around for it to load. And losing your audience isn’t your only worry. Search engines will punish your website if they detect it’s unnecessarily slow.
Unfortunately, we’re barreling right towards a Catch 22. Large, and often bloated, images are the #1 cause of slow websites. So how do you maintain that lightning fast load time while at the same time being able to use beautiful, high-quality hero images?
Hero images were designed to be used one at a time, and one per page. If one hero image is bogging down your website, imagine what several will do. In addition to the performance issues, you risk overwhelming your audience with choices if you use multiple hero images on the same page. And we all know what happens when you do that.
In addition, sliders and carousels stopped being popular in 2010. Don’t use them. Search engines have a brutally difficult time crawling them, which can have a profoundly negative effect on your search engine optimization. Most SEO and conversion experts agree that they have little use 99% of the time. In addition to bogging down your site, the statistics just don’t justify their use anymore.
Slide | Share of Clicks |
---|---|
First Slide | 90% |
Second Slide | 8.8% |
Third Slide and Above | 1.7 to 2.3% |
If you feel you’ve done everything right with your hero image and still aren’t getting conversions, it may mean that hero images aren’t for you. There’s nothing wrong with that. Maybe you have home page content that is constantly being updated. Have a look at any news site out there. None of them use hero images. The same goes for certain e-commerce businesses. Amazon, Walmart, Home Depot, and Best Buy don’t use them, either.
If you don’t feel hero images are right for you, don’t use them. Yes, they’re all the hype right now. And yes, they can be absolutely gorgeous. But they’re not for everyone. You’re the only one who can make the decision as to what’s best for you.
Well, that’s about enough butchering of my own websites as I can take. When used correctly, hero images can convert at an incredible rate and boost your credibility to levels you didn’t think possible. Unfortunately, that’s an incredibly difficult needle to thread. Hero images are astonishingly easy to screw up. I’ve been building websites since 2008 and I still find new ways to make mistakes.
However, we must continue to learn from our mistakes. Use analytics to your advantage. They’ll tell you why your hero image is not converting. Please reach out to us if you need any help. With our expertise in data science, web development, and graphic design, we’ll help you process your analytics and make sure that your hero image becomes a magnet for leads. The worst thing you can do is let it frustrate you.
Top Photo: A Desolate Road to Nowhere
Death Valley National Park, California – February, 2020
COVID-19 Dashboard Upgrades: 3 Phases That Will Help You Make Better Decisions
Ever since the map launched back in April, I have not been fully satisfied with the limitations on the available data, as well as the lack of functionality on the map for the COVID-19 data by state and province. Once it became clear that the fall/winter wave in the United States was going to be really bad, it was time to upgrade the dashboard. The last thing I want is for people to get sick because of a lack of data and functionality on the COVID-19 dashboard and map.
The first order of business is to greatly expand the available data coverage within each country on the map. In addition to the map already displaying data by state/province for the United States and Canada, we have added support for Australia and Mexico as well. We will add even more countries in Phase 3 to bring the total to seventeen.
Next, we expanded the data fields that can be plotted on the map. We started with the old set of parameters.
We added the following parameters.
With the pandemic raging so badly out of control in the United States, I wanted an easy way to assess general risk for normal day-to-day activities in public. These activities could include running errands, exercising, going to restaurants, and much more. Please be aware that I still consider the index to be in a “Beta” phase, and it will likely receive minor tweaks over the next few weeks.
The index is a weighted average comprised of the number of active cases, the odds any one person you cross paths with is infected, the daily new cases, and the 14-day trend in cases. We account for population by evaluating these parameters per capita. As a result, the index can be evaluated at the country, state/province, or county level. You can easily compare countries to provinces, states to counties, and more.
Index Value | Description |
---|---|
Less than 5 | Virus is Under Control |
5 to 10 | Low or Marginal Risk |
11 to 20 | Enhanced Risk |
21 to 30 | Medium or Moderate Risk |
31 to 40 | Elevated Risk |
41 to 50 | High Risk |
51 to 60 | Critical Risk |
61 to 75 | Extreme Risk |
Greater than 75 | Catastrophic Risk |
Any value greater than 40 is considered a Particularly Dangerous Situation, and your interactions with the public should be kept to a minimum.
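The exact weights behind the index aren’t published here, so the sketch below is only meant to illustrate the structure of such an index rather than reproduce it: each ingredient is normalized per capita, then blended with made-up example weights into a single number that can be read against a scale like the table above.
def risk_index(active_cases, new_cases_today, trend_14day_pct, population):
    """Toy weighted risk index; the weights are illustrative, not the real calibration."""
    per_100k = 100000.0 / population

    active_per_100k = active_cases * per_100k
    new_per_100k = new_cases_today * per_100k
    odds_infected = active_cases / population  # odds any one person you cross paths with is infected

    # Made-up example weights; the production index uses its own calibration.
    return (
        0.10 * active_per_100k
        + 0.10 * new_per_100k
        + 0.05 * (odds_infected * 10000)
        + 0.05 * trend_14day_pct
    )

# Example: 1,000,000 people, 3,000 active cases, 250 new cases today, cases up 20% over 14 days
print(round(risk_index(3000, 250, 20, 1000000), 1))  # 35.0 with these toy weights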
Adding the time series charts to all datasets on the map and streamlining the process of selecting which parameter to display have been a top priority since April. The COVID-19 dashboard map now includes these features for all datasets.
In addition to being able to plot data by country or state/province, we also wanted to greatly expand the available datasets. The data catalog now includes the following. Unless otherwise noted, there is currently support for Australia, Canada, Mexico, and the United States. We will be implementing support for additional countries in Phase 3.
Phase 1 launched on Friday, 18 December, 2020.
While many of the new features are on the map, the dashboard is getting a few updates as well. They may not be as significant as the map updates, but they’ll make an impact.
The database rebuild allowed us to expand the countries that have data broken down by state from 3 to 17. The Plot by State tab will be updated to reflect those changes. In addition to states, you’ll also be able to plot territories’ COVID-19 data.
Additionally, the x-axis on the charts will default to showing the calendar date instead of the number of days since the 100th confirmed case. Like the other settings, there will be a menu to select which parameter you want to display on the x-axis.
Our COVID-19 model has gotten several minor updates over the past few weeks. A lot of things have changed since the spring, so the model has been updated to better reflect them. In addition, model output ranges have been refined to much more accurately and realistically show projected outcomes.
Since we added many more states to the database, I will be including additional states in each model run as well. I may include territories at a future date, but I currently have no plans to because case loads in the territories are so low. The states and provinces added to the model in this update include:
Additional states and provinces will be added to the model runs as case counts dictate.
I hope to launch Phase 2 by 31 December, 2020.
Since we’ve expanded our dataset to now include state and provincial data for 17 countries, it would be foolish not to be able to plot those data on the map. The full list of countries spans 5 continents and by the end of Phase 3 will include the following.
Phase 3 will launch in early January, 2021.
As the COVID-19 pandemic continues to rage across the globe, access to complete, easy-to-interpret data and maps is critical to winning the fight against it. I hope these new updates to our COVID-19 dashboard go a long way towards accomplishing that goal. If there’s anything you feel is missing from the dashboard or map, please let me know in the comments below, and I will address them as soon as possible.
Links: Visit the COVID-19 Dashboard or the COVID-19 Map
Top Photo: The New Matt’s Risk Index Evaluated on a Map for All US Counties on 19 December, 2020
A 15-Minute Intro to Supercharging Your GIS Productivity with Python
It’s no secret that the future of big data is here. As your datasets get larger, it’s much more efficient to keep the data in a database instead of embedded in your geography files. Instead of manually copying the data between your GIS files and the database, why not let Python do the heavy lifting for you?
I am in the process of expanding my COVID-19 dashboard to be able to plot data by state for many more countries than just Australia, Canada, and the United States.
I am also expanding the US dataset to break it down as far as the county level. In order to do so, I had to add all 3,200-plus counties to my geodatabase. In the past, manually entering datasets of this scale has taken months. With Python, I completed the data entry of all 3,200-plus counties in less than 20 minutes. Using QGIS, the Python script completed the following steps.
#!/usr/bin/env python3
from qgis.core import QgsProject
import csv

# Step 1: Open the shapefile
qgis_instance = QgsProject.instance()
shp_counties = qgis_instance.mapLayersByName("US Counties")[0]
county_features = shp_counties.getFeatures()

# Initialize a list to store data that will be written to CSV
csv_output = [
    ["CountyID", "County Name", "State Name", "Population"]
]

# Initialize our unique ID that is used in Step 3
county_id = 1

# Step 2: In each row, identify the county, state, and population
for county in county_features:
    fips_code = county.attribute(0)
    county_name = county.attribute(1)
    state_name = county.attribute(2)
    population = county.attribute(3)

    # Define output CSV row
    csv_row = [county_id, county_name, state_name, population]

    # Add the row to the output CSV data
    csv_output.append(csv_row)

    # Increment the county_id unique identifier
    county_id += 1

# Step 4: Write the data to CSV
with open("county-data.csv", "w") as ofile:
    writer = csv.writer(ofile)
    writer.writerows(csv_output)
Have you ever been deep in a geospatial analysis and discovered that you needed to create a graph of the data? Whether you need to plot a time series, a bar chart, or any other kind of graph, Python comes to the rescue again.
Python’s matplotlib library is the gold standard for data analysis and figure creation. With matplotlib, you can generate publication-quality figures right from the Python console in your GIS program. Pretty slick, huh?
My COVID-19 model uses matplotlib to generate time series plots for each geographic entity being modeled. While I run the model out of a Jupyter Notebook, you could easily generate these plots from within a GIS program.
A matplotlib time series chart that my COVID-19 model generates.
In addition to matplotlib, the Python pandas library is another great library for both data science and GIS. Both libraries come with a broad and powerful toolbox for performing a statistical analysis on your dataset.
Since we’re still in the middle of the raging COVID-19 pandemic, let’s do a basic statistical analysis on some recent COVID-19 data: confirmed cases and deaths in the United States. Specifically, let’s have a look at new daily cases by state.
As a simple statistical analysis, let’s identify the following values and the corresponding states.
Even though I store these data in a database, and these values can easily be extracted with database queries, let’s assume that the data are embedded in a shapefile or CSV file that looks like this.
State | New Daily COVID-19 Cases |
---|---|
Pennsylvania | 10,247 |
Arkansas | 2,070 |
Wyoming | 435 |
Georgia | 5,320 |
California | 29,415 |
First, regardless of the order that the data in the table are in, we want to break the columns down into 2 parallel arrays. There are more advanced ways to sort the data, but those are beyond the scope of this tutorial.
import csv

# Initialize parallel arrays
states = []
new_cases = []

# Read in data from the CSV file
with open("covid19-new-cases-20201210.csv", "r") as covid_data:
    reader = csv.reader(covid_data)
    next(reader)  # Skip the header row
    for row in reader:
        # Extract State Name and Case Count
        state = row[0]
        # Strip thousands separators and convert the count to an integer
        new_case_count = int(row[1].replace(",", ""))

        # Add state name and new cases to parallel arrays.
        states.append(state)
        new_cases.append(new_case_count)
Now, let’s jump into the statistical analysis. The primary advantage of using parallel arrays is we can use the location of the case count data in the array to identify which state it comes from. As long as the state name and its corresponding case count are in the same location within the parallel arrays, it does not matter what order the CSV file presents the data.
First up in our statistical analysis is to identify the states with the most and fewest new daily COVID-19 cases. We’ll take advantage of Python’s built-in max() and min() functions. Variable names are consistent with the above block of code.
most_cases = max(new_cases)
fewest_cases = min(new_cases)
When you run this, you’ll find that most_cases = 29,415 and fewest_cases = 101. Now we need to determine which states those values correspond to. This is where the parallel arrays come in. Python’s index() method tells us where in the new_cases array the value is located. We’ll then reference the same location in the states array, which will give us the state name.
most_cases_index = new_cases.index(most_cases)
most_cases_state = states[most_cases_index]
fewest_cases_index = new_cases.index(fewest_cases)
fewest_cases_state = states[fewest_cases_index]
When you run this block of code, you’ll discover that California has the most new daily cases, while Hawaii has the fewest.
While Python does not have a built-in averaging or mean function, it’s an easy calculation to make. Simply add up the values you want to average and divide them by the number of values. Because the data is stored in arrays, we can simply sum the array and divide it by the length of the array. In this instance, the length of the array is 51: the 50 states, plus the District of Columbia. It’s a very tidy one line of code.
mean_new_daily_cases = sum(new_cases) / len(new_cases)
Rounding to the nearest whole number, the United States experienced an average of 4,390 new COVID-19 cases per state on 10 December.
Python does not have a built-in function to generate the median of a list of numbers, but thankfully, like the mean, it’s easy to calculate. First, we need to sort the array of new daily cases from smallest to largest. We’ll save this into a new array because we need to preserve the order of the original array and maintain the integrity of the parallel arrays.
sorted_new_cases = sorted(new_cases)
Once the values are sorted, the median is simply the value of the middle-most point in the array. Because we included the District of Columbia, there are 51 values in our array, so we can select the middle value (the item at index 25). If we only used the 50 states, we would need to average the two middle-most values. In Python, it looks like this. The double slash means that you round the division down to the nearest whole number.
middle_index = len(sorted_new_cases) // 2
median_new_cases = sorted_new_cases[middle_index]
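As a quick aside, here is what the even-length case mentioned above would look like. With an even number of values (say, just the 50 states), you average the two middle-most items instead of picking a single one. The slice below is just a hypothetical way to get an even-length list for the example.
sorted_50 = sorted(new_cases[:50])  # hypothetical even-length subset
upper_middle = len(sorted_50) // 2
median_even = (sorted_50[upper_middle - 1] + sorted_50[upper_middle]) / 2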
The median value returned is 2,431 new cases. Now we need to figure out which state that value belongs to. To do that, just do the same thing we did when calculating the max and min values. Look in the original new_cases array for the value and look in that same location in the states array.
median_cases_index = new_cases.index(median_new_cases)
median_state = states[median_cases_index]
On 10 December, the state with the median new daily cases was Connecticut.
To determine how many states are seeing more than 10K, 5K, and 3K new daily cases, we simply need to count how many values in the new_cases array are greater than those three values. Using 3,000 as an example, it can be coded as follows (the “gt” stands for “greater than”).
values_gt_3000 = []
for n in new_cases:
    if n > 3000:
        values_gt_3000.append(n)
Thankfully, Python gives us a shorthand way to code this so we don’t wind up with lots of nested layers in the code. The block of code above can be compressed into just a single line.
values_gt_3000 = [n for n in new_cases if n > 3000]
To get the number of states with more than 3,000 new daily cases, recall that the len() function tells us how many values are in an array. All you need to do is apply the len() function to the above array and you have your answer.
num_values_gt_3000 = len([n for n in new_cases if n > 3000])
num_values_gt_5000 = len([n for n in new_cases if n > 5000])
num_values_gt_10000 = len([n for n in new_cases if n > 10000])
When you run that code, here are the values for the 10 December dataset. You can determine which states these are by either looking at the map above or using the same techniques we used to extract the state in the max, min, and median calculations; a short sketch of that follows the table.
New Daily Case Cutoff | Number of States |
---|---|
> 10,000 | 4 |
> 5,000 | 13 |
> 3,000 | 24 |
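For completeness, here is the sketch mentioned above: instead of just counting the states over a cutoff, keep the index alongside each value and pull the matching names out of the parallel arrays.
# List the states (not just the count) with more than 3,000 new daily cases.
states_gt_3000 = [states[i] for i, n in enumerate(new_cases) if n > 3000]
print(states_gt_3000)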
As many of you know, I built my own COVID-19 model last spring. The model is written in Python and extensively uses the matplotlib library. The model predicts the number of cases both two weeks and one month out as well as the apex date of the pandemic. While I primarily focus on the 50 US states and 6 Canadian provinces, you can run the model for any country in the world, and every state in 17 countries.
While many models, whether they be weather models, COVID-19 models, or something else, make heavy use of maps, there’s more to modeling than just maps. Remember above when we generated publication-quality graphs and figures right from within your GIS application using matplotlib? You can do the same thing for your non-geospatial model outputs.
Another option is bokeh, a Python library that creates interactive plots for use in a web browser.
Have a shapefile or other geographic file that you need to make bulk edits on? Why not let Python do the heavy lifting for you? Let’s say you have a shapefile of every county in the United States. There are over 3,200 counties in the US, and I don’t know about you, but I have little to no interest in entering that much data manually.
In the attribute table of that shapefile, you have nothing but the Federal Information Processing Standards (FIPS) code for each county. FIPS codes are standardized unique identifiers the US Federal Government assigns to entities such as states and counties. You want to add the county names and the county’s state to the shapefile.
In addition, you also have a spreadsheet, in CSV format of all the FIPS codes, county names, and which state each county is located in. That table looks something like this.
FIPS Code | County Name | State Name |
---|---|---|
04013 | Maricopa | Arizona |
06037 | Los Angeles | California |
26163 | Wayne | Michigan |
12086 | Miami-Dade | Florida |
48201 | Harris | Texas |
13135 | Gwinnett | Georgia |
To add the data from this table into the shapefile, all you need is a few lines of Python code. In this example, the code must be run in the Python console in QGIS.
#!/usr/bin/env python3
from qgis.core import QgsProject
import csv

# Define QGIS Project Instance and Shapefile Layer
instance = QgsProject.instance()
counties_shp = instance.mapLayersByName("US Counties")[0]

# Load features (county polygons) in the shapefile
county_features = counties_shp.getFeatures()

# Define the columns in the shapefile that we will be referencing.
shp_fips_column = counties_shp.dataProvider().fieldNameIndex("FIPS")
shp_state_column = counties_shp.dataProvider().fieldNameIndex("State")
shp_county_column = counties_shp.dataProvider().fieldNameIndex("County")

# Start Editing the Shapefile
counties_shp.startEditing()

# Read in state and county name data from the CSV File
with open("fips-county.csv", "r") as fips_csv:
    csv_data = list(csv.reader(fips_csv))

# Loop through the rows in the shapefile
for county in county_features:
    shp_fips = county.attribute(0)

    # The Feature ID is internal to QGIS and is used to identify which row to add the data to.
    feature_id = county.id()

    # Loop through CSV to Find Match
    for row in csv_data:
        csv_fips = row[0]
        county_name = row[1]
        state_name = row[2]

        # If CSV FIPS matches the FIPS in the Shapefile, assign state and county name to the shapefile row.
        if csv_fips == shp_fips:
            counties_shp.changeAttributeValue(feature_id, shp_county_column, county_name)
            counties_shp.changeAttributeValue(feature_id, shp_state_column, state_name)

# Commit and save your changes to the shapefile. This also ends the edit session.
counties_shp.commitChanges()
We all hate doing monotonous busy work. Who doesn’t, right? Most metadata entry falls into that category. There is a surprising amount of metadata associated with GIS datasets.
In the above section, we edited the data inside of shapefiles and geodatabases with Python. It turns out that data is not the only thing you can edit with Python. Automating metadata entry and updates with Python is one of the easiest and most efficient ways to boost your GIS productivity and focus on the tasks you need to get done.
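As a rough illustration, and assuming you’re working in the QGIS 3 Python console where the QgsLayerMetadata class is available, bulk-updating a layer’s title and abstract might look something like this sketch. The layer name and field values are examples only; adapt them to your own project.
from qgis.core import QgsProject, QgsLayerMetadata

# Grab the layer whose metadata we want to update (layer name is an example).
layer = QgsProject.instance().mapLayersByName("US Counties")[0]

# Build the metadata record programmatically instead of typing it by hand.
metadata = QgsLayerMetadata()
metadata.setTitle("US Counties with Population Estimates")
metadata.setAbstract("County polygons joined to population data; metadata generated automatically with Python.")
metadata.setLanguage("en")

# Attach the metadata to the layer and save the project.
layer.setMetadata(metadata)
QgsProject.instance().write()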
In today’s era of big data, the future of GIS is in vector tiles. You’ve probably heard of vector images, which are comprised of points, lines, and polygons that are based on mathematical equations instead of pixels. Vector images are smaller and much more lightweight than traditional images. In the world of web development, vector images can drastically increase a website’s speed and decrease its load times.
Vector images can also be applied to both basemaps and layer geometries, such as state or country outlines. In the context of GIS, they’re called vector tiles, and are primarily used in online maps. Vector tiles are how Google Maps, Mapbox, and many other household mapping applications load so quickly.
Region mapping files simply use a unique identifier to map a feature in a vector tile to a row in a data table. The data in the table is then displayed on the map. Let’s look at an example using my COVID-19 Dashboard’s map. While I use my own unique identifiers to do the actual mapping, in this example, we’ll use the ISO country codes to map COVID-19 data in Europe. The data table is simply a CSV file that looks like this. For time series data, there would also be a column for the timestamp.
Country Code | Total Cases | Total Deaths | New Cases | New Deaths |
---|---|---|---|---|
de | 1,314,309 | 21,567 | 22,399 | 427 |
es | 1,730,575 | 47,624 | 6,561 | 196 |
fr | 2,405,210 | 57,671 | 11,930 | 402 |
gb | 1,814,395 | 63,603 | 17,085 | 413 |
it | 1,805,873 | 63,387 | 16,705 | 648 |
nl | 604,452 | 10,051 | 7,345 | 50 |
pt | 340,287 | 5,373 | 3,962 | 81 |
The region mapping file is a JSON (JavaScript Object Notation) file that identifies the valid entity ID’s in each vector tiles file. In this example, the entity ID is the Country Code. We can use Python to generate that file for all country codes in the world. I keep all of the entity ID’s in a database. Databases are beyond the scope of this tutorial, but for this example, I queried the country codes from the database and saved them in an array called country_codes.
import json

# Initialize an array to store the country codes
region_mapping_ids = []

# Add all country codes to the region mapping ID's
for country_code in country_codes:
    region_mapping_ids.append(country_code)

# Define the Region Mapping JSON
region_mapping_json = {
    "layer": "WorldCountries",
    "property": "CountryCode",
    "values": region_mapping_ids
}

# Write the Region Mapping JSON to a file.
with open("region_map-WorldCountries.json", "w") as ofile:
    json.dump(region_mapping_json, ofile)
In the Region Mapping JSON, the layer property identifies which vector tiles file to use. The property property identifies which attribute in the vector tiles is the unique identifier (in this case, the country code). Finally, the values property identifies all valid unique ID’s that can be mapped using that set of vector tiles.
When you put it all together, you get a lightweight map that loads very fast, as the map layers are in vector image format, and the data is in CSV format. Region mapping really shines when you have time series data, such as COVID-19. Here is the final result.
Python is an incredibly powerful tool to have in your GIS arsenal. It can boost your productivity, aid your analysis, and much more. Even though this tutorial barely scratches the surface of the incredible potential these two technologies have together, I hope this gives you some ideas to improve your own projects. Stay tuned for more.
Top Photo: Beautiful Sierra Nevada Geology at the Alabama Hills
Lone Pine, California – February, 2020
Digging Deeper: Diagnosing My Raspberry Pi Temperature Sensor’s “Hot Flashes”
Before I begin troubleshooting an issue, I always brainstorm a list of at least 2-3 things that I think may be causing the problem. In no particular order, some of the potential causes I came up with were:
Before diving into troubleshooting, I always see if there are any causes I can reasonably eliminate. In this case, a malfunctioning or defective sensor is highly unlikely to be causing the problem. The sensor is an MPL3115A2 I2C digital barometric, temperature, and altitude sensor. It has been in service on the Raspberry Pi since last May and has taken nearly 1.2 million measurements. Of those 1.2 million measurements, the QA/QC algorithm has flagged 590 of them. That works out to 0.04% of measurements being bad, or a 99.96% measurement accuracy rate.
Of the remaining two possibilities, I always start with whatever is easiest to troubleshoot. In this case, it would be a possible bug in the Python code. On the Raspberry Pi, Python interprets sensor readings and logs them in the database. Intermittent electrical problems are a nightmare to diagnose in the best of situations. The first thing to do is to look at the data and try to identify any patterns. If you’ve ever watched any detective shows on TV, you know what they always say: evidence doesn’t lie.
One pattern should jump out at you immediately. As air temperatures reach and drop below freezing (0°C), the sensor readings on the Raspberry Pi go haywire. It is also important to note that this does not rule out an electrical problem as the cause. To strengthen our case, let’s query the sensor database to see if the sensor recorded any temperatures below freezing. We want to see no instances of the sensor recording temperatures below freezing.
We can now say with 100% certainty that something happens when the temperature hits freezing. However, we still don’t know what. To fully rule out an electrical issue causing the problem, it’s time to get on Google and research how the sensor works.
According to the manufacturer, there are three pieces of information that will help us here.
To finally be able to rule out an electrical problem, let’s focus on item #2: the sensor not being able to handle signed (negative) values in its raw format. With any kind of digital sensor, if it gets a reading that’s off the scale, it loops around to the opposite end of the scale.
For example, if a sensor had a scale of 0 to 400 and got a reading of -1, it would output a value of 399. A reading of -5 would output 395. It works on the other end of the scale, too. A reading of 410 would output a value of 10. During my meteorology studies at the University of Oklahoma, we routinely observed this phenomenon while measuring wind speeds inside of tornadoes with doppler radar.
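That wrap-around behavior is just modular arithmetic, so it’s easy to sanity-check with a couple of lines of Python using the 0-to-400 example scale from the paragraph above.
SCALE = 400  # example scale from the paragraph above

def wrapped_reading(true_value):
    # Return what an unsigned sensor with this scale would report.
    return true_value % SCALE

print(wrapped_reading(-1))   # 399
print(wrapped_reading(-5))   # 395
print(wrapped_reading(410))  # 10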
Item #2 is such a key piece of the puzzle because the problem occurs as soon as temperatures drop below 0°C. With this information, we can firmly establish the bottom end of the temperature sensor’s scale. Now all we have to do is figure out what the top of that scale is. For that, we have to look at the Python code, in which I found the following equation:
raw_temperature = ((raw_temperature_msb * 256) + (raw_temperature_lsb & 0xF0)) / 16
In this line of code, Python converts the raw temperature reading to standard decimal. Python reads in the sensor’s raw hexadecimal readings byte-by-byte before converting them to Celsius and sending them to the Raspberry Pi. The integer part of each reading contains a maximum of two hexadecimal digits. For example, if this formula was used on a standard Base 10 number, such as 27, it would be read as:
temperature = (10 * 2) + (1 * 7) = 20 + 7 = 27
While that information may not seem significant on its own, it gives us enough information to calculate the upper end of the sensor’s temperature scale. Remember that each hexadecimal digit can hold one of 16 values, so calculating the upper bound (in degrees Celsius) is simply:
Upper Bound = 16 * 16 = 256
Now that we have established that the sensor has a raw output range of 0 to 256°C, let’s look back at the query results. Keep in mind what I mentioned above about off-the-scale readings looping around to the other end of the scale.
To calculate the actual temperature, simply subtract 256 from the sensor readings in the 250’s. For example, if you look at the last entry in the above table, the actual temperature would be:
Actual Temp = 254.625 - 256 = -1.375°C
Using the above query as an example, the correct actual temperatures using the loop-around equation would be:
Implementing the fix before the sensor readings even reach the QA/QC algorithm on the Raspberry Pi is simple. We simply need to add an if statement to the Python code that converts the raw hexadecimal sensor readings to degrees Celsius. Here is the original code:
try:
    raw_temperature = ((raw_temperature_msb * 256) + (raw_temperature_lsb & 0xF0)) / 16
    unrounded_temperature = raw_temperature / 16
except:
    raw_temperature = None
    unrounded_temperature = None
With the if statement added to fix the problem, the above block of code simply becomes:
try:
    raw_temperature = ((raw_temperature_msb * 256) + (raw_temperature_lsb & 0xF0)) / 16
    unrounded_temperature = raw_temperature / 16

    if unrounded_temperature > 70:
        unrounded_temperature -= 256
except:
    raw_temperature = None
    unrounded_temperature = None
The final piece of the puzzle is to update the bad data points in the database to ensure data integrity. Amazingly, we can do that with a simple UPDATE command so we don’t lose any data. Full disclosure, this is not the actual database structure in the weather station. It’s just worded that way to make it easier to understand.
UPDATE `sensor_measurement`
SET `measurement_value` = `measurement_value` - 256
WHERE `measurement_type` = 'temperature'
AND `measurement_value` >= 70;
Well, this has certainly been an interesting one. It’s always a relief to confirm that the sensor is functioning exactly as it should and the problem is nothing more than an easy-to-fix bug in the Python code on the Raspberry Pi. Until next time.
Houston, I Think There’s a Bug in My Weather Station’s QA/QC Algorithm
I recently logged onto my weather station to check some high and low temperatures for the month of January. While I was casually scrolling through the data, I caught something out of the corner of my eye. When I looked a little closer, I had to do a double take.
What was even more impressive were the “heat indices”. They make summers on the Persian Gulf, where heat indices can reach ridiculous levels, look like absolute zero. For comparison, the temperature of the surface of the sun is 9,941°F.
My initial reaction, like that of several friends, was to cue the jokes: “Well, it does get hot in Arizona…”. Unfortunately, on a logic and reasoning test, science will beat humor every single time. Time to figure out why this happened. First, we need to run a query for all raw temperature data my sensors recorded that were greater than 60°C (140°F). I chose that cutoff based on the hottest temperature ever recorded in Arizona. On June 29, 1994, Lake Havasu City topped out at a sizzling 53°C (128°F).
Amazingly, the query returned almost 300 hits. Here is a small sample of them.
Sensors getting screwy readings like this is part of the deal when you operate any kind of data logger. I am much more concerned that so many bad data points managed to slip through my QA/QC algorithm on the Raspberry Pi. I’ll admit the QA/QC algorithm was the very basic one I wrote to just “get things going”. It was long overdue for an upgrade, but still, it should have caught these.
Once I queried the dates where these bad data points occurred, the culprit was revealed.
You may recall that this past December, I had to replace the analog temperature and humidity sensor that broke. I formally decommissioned the broken sensor on December 14, 2019. Did you happen to notice the first date of bad data? That’s not a coincidence.
So what caused the QA/QC algorithm to nod off and miss these bad data points? The answer goes back to the broken analog sensor. The broken sensor measured both temperature and humidity. When the relative humidity reading hiccuped, often showing values greater than 3,000% when it did, the corresponding temperature reading would get thrown off by about 10-20°C.
The problem is that 99% of those bad temperature readings were between -5 and 15°C (23 to 59°F). During the winter months, we see actual temperatures inside that range every day here in the Phoenix area, so you can’t simply filter them out. I wrote the original QA/QC algorithm to flag relative humidity values that were greater than 100% or less than 0%. I would deal with the temperature parameter when I updated the algorithm.
The new digital sensor I installed only measures altitude, barometric pressure, and temperature. As a result, the Raspberry Pi reverted to obtaining its humidity data from the National Weather Service. The NWS data is already QA/QC’d. Because my original QA/QC algorithm only flagged humidity and not temperature, it deemed every data point that passed through it “OK”, thus rendering the algorithm redundant.
To confirm that this was the issue, I took advantage of the fact that the database puts the data the QA/QC algorithm flags into a separate table in the sensor database. I use that data for troubleshooting and for improving the algorithm. A simple query reveals the dates on which data were flagged. If swapping the sensors on December 14, 2019 did in fact render the QA/QC algorithm useless, that query should return no dates after the replacement; every flagged entry should predate the swap.
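Here is a rough sketch of that confirmation check, assuming the flagged dates have already been pulled out of the measurement_to_qaqc table with something like SELECT DISTINCT DATE(measurement_time). The function name and the sample dates are mine, not the station's actual code.

from datetime import date

# Sanity check: if the QA/QC algorithm stopped flagging anything once the
# old analog sensor was retired, every date in the flagged-data table
# should fall on or before the swap date.
SENSOR_SWAP_DATE = date(2019, 12, 14)

def qaqc_went_silent(flagged_dates, swap_date=SENSOR_SWAP_DATE):
    """Return True if nothing has been flagged since the sensor swap."""
    return all(d <= swap_date for d in flagged_dates)

# Hypothetical dates pulled from the flagged-data table.
flagged_dates = [date(2019, 11, 2), date(2019, 11, 28), date(2019, 12, 13)]
print(qaqc_went_silent(flagged_dates))  # True, which confirms the hypothesis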
Thankfully, fixing the issue requires nothing more than a few lines of code: an if statement in the algorithm that flags temperatures outside an acceptable range of -20 to 60°C (-4 to 140°F). I chose the upper limit based on the hottest temperature ever recorded in Arizona (53°C/128°F) and the lower bound on the coldest temperature ever recorded in Phoenix (-9°C/16°F). I will tweak that range as needed.
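Here is a sketch of what that if statement might look like inside the QA/QC routine. The function and constant names are mine, not the station's actual code; only the bounds come from the post.

TEMP_MIN_C = -20.0  # comfortably below Phoenix's all-time low of -9°C
TEMP_MAX_C = 60.0   # comfortably above Arizona's all-time high of 53°C

def temperature_passes_qaqc(value_c):
    """Return True if a raw temperature reading is physically plausible."""
    if value_c < TEMP_MIN_C or value_c > TEMP_MAX_C:
        return False  # flag the reading and quarantine it
    return True

print(temperature_passes_qaqc(87.3))  # False: goes to the flagged table
print(temperature_passes_qaqc(21.5))  # True: goes to the main table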
My goal is to continually add small upgrades and fixes to the QA/QC algorithm over the next year. By the time the complete network of sensors is up and running, it should have a level of sophistication that is reasonable for a hobby weather station while staying as close to professional standards as I can manage. Stay tuned for future posts, where we will take a closer look at what happens in the data logger’s electrical system to cause such wacky temperature readings.
The post Houston, I Think There’s a Bug in My Weather Station’s QA/QC Algorithm appeared first on Matthew Gove Blog.
]]>The post Troubleshooting a Raspberry Pi Sensor Gone Awry appeared first on Matthew Gove Blog.
]]>Over the summer, I noticed something odd after a monsoon thunderstorm had gone through. The outdoor relative humidity on my weather station read about 20% higher than the outdoor humidity on my thermostat. I wrote it off as an isolated incident related to the monsoon storm and didn’t think much of it.
As the summer progressed, I noticed the same thing would happen whenever a monsoon thunderstorm would pass over us. The weather station would report the relative humidity about 20% higher than it actually was. By the end of the summer, it was reading 20% high all the time, even in sunny weather.
On several occasions, the idea popped into my head to just go into the weather station’s Python code and subtract 20% off of the relative humidity values the sensor was reading. Trust me, it was very tempting. I had to keep telling myself that doing so would open up a whole new set of problems.
What happens if the sensor starts working properly again and gives a reading below 20%? Subtracting 20 points would produce a negative relative humidity, and I wrote the Python code, so I know negative relative humidity values will crash the system: calculating the dew point would mean taking the natural log of a negative number, which is undefined.
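To make that concrete, here is the standard Magnus dew point approximation, which may or may not be the exact formula in the station's code. Feed it a negative relative humidity and the natural log blows up immediately.

import math

def dew_point_c(temp_c, rh_percent, a=17.62, b=243.12):
    """Dew point (°C) via the Magnus approximation."""
    gamma = math.log(rh_percent / 100.0) + (a * temp_c) / (b + temp_c)
    return (b * gamma) / (a - gamma)

print(round(dew_point_c(30.0, 45.0), 1))  # about 16.8°C, a sane value

try:
    dew_point_c(30.0, -5.0)  # what a "subtract 20" hack could produce
except ValueError as err:
    print(f"Crash: {err}")   # math domain error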
It turned out that my instinct not to tinker with the source code was correct. Over the course of the fall, that 20% gap between the sensor’s readings and the actual humidity grew to 30%, then 40%, then 50%. By early November, the weather station was reporting 80-90% relative humidity when actual values were in the 10-15% range. Chasing a moving offset like that would have made a big, ugly mess of the source code and would not have gotten me any closer to fixing the problem.
By the 1st of December, the sensor would only give humidity readings of 100%, regardless of what the actual humidity was. Pulling the raw sensor data from the main database on the Raspberry Pi confirmed this. While I pondered my next move, I changed the weather station’s Python code to pull relative humidity data from the National Weather Service instead of the sensor.
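The post doesn't show the replacement code, but pulling the latest relative humidity from the National Weather Service API looks roughly like the sketch below. The station ID is a placeholder, and the real code almost certainly handles errors, caching, and unit conversions differently.

import requests

def nws_relative_humidity(station_id="KPHX"):
    """Fetch the latest relative humidity (%) from the NWS API."""
    url = f"https://api.weather.gov/stations/{station_id}/observations/latest"
    resp = requests.get(url, headers={"User-Agent": "diy-weather-station"}, timeout=10)
    resp.raise_for_status()
    # The observation may report None if the field is missing.
    return resp.json()["properties"]["relativeHumidity"]["value"]

print(nws_relative_humidity())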
Looking at that raw data piqued my interest enough to dump all of the relative humidity readings out of the sensor database and take a closer look. The sensor went into service in early May and takes readings once per minute, so the query returned about 300,000 data points. After crashing Microsoft Excel’s plotting tool several times (I have an old, cranky PC), I adjusted the query to return only the relative humidity values at the top of every hour.
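If you are wondering what that adjustment looks like, it is a one-line filter on the timestamp. Here is a standalone SQLite sketch of the idea; the table and column names are illustrative, and in MySQL the filter would be something like WHERE MINUTE(measurement_time) = 0.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE humidity (measurement_time TEXT, rh REAL)")
cur.executemany("INSERT INTO humidity VALUES (?, ?)", [
    ("2019-11-05 14:00:00", 83.0),  # top of the hour: keep
    ("2019-11-05 14:01:00", 84.0),  # mid-hour reading: skip
    ("2019-11-05 15:00:00", 86.0),  # top of the hour: keep
])

# Keep only readings taken at minute 00 of each hour.
cur.execute("""
    SELECT measurement_time, rh
    FROM humidity
    WHERE strftime('%M', measurement_time) = '00'
""")
print(cur.fetchall())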
Now, before I show you the marked up plot explaining everything, consider the following. Think of how it can help show when the humidity sensor went off the rails.
When you put everything together, it should become clear where the sensor goes awry.
Unfortunately, the only remedy at this point is to replace the sensor, preferably with one of much higher quality. The old sensor that broke was a cheap analog one I paid about $5 for. You get what you pay for, right? Thankfully, the pressure sensor, which is a digital I2C sensor, also measures temperature, so with some minor rewiring I can simply swap them out.
The nice thing about I2C sensors is that you can connect multiple sensors to the system using only one data cable, so adding sensors is easy at either end of the wire: in the solar radiation shield or in the waterproof enclosure that houses the Raspberry Pi and the power supply. Additional sensors will connect where the four gray wire nuts are in the above picture. I will definitely be adding an I2C humidity sensor to the Raspberry Pi, and likely more as well. Stay tuned.
The post Troubleshooting a Raspberry Pi Sensor Gone Awry appeared first on Matthew Gove Blog.
]]>The post DIY Weather Station, Part 4: Database Configuration and Programming the Sensor Readings appeared first on Matthew Gove Blog.
]]>The flow of data from the sensors to the weather station’s primary database is as follows. The primary database contains the full set of weather data and logs readings every 5 minutes, and the weather station’s web interface displays those data. Data for which I do not have sensors are obtained from the National Weather Service.
I could easily write an entire post about database design, so I’m not going to go into too much detail about why I designed the database the way I did. Instead, here is my list of requirements for the database:
From that list of requirements, this is the EER diagram I came up with for the sensor database.
Note here that the “good” data are put into the “measurement” table, while the “bad” data flagged by the QA/QC mechanism are put into the “measurement_to_qaqc” table.
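The EER diagram itself doesn't translate well to text, but the heart of the design boils down to something like the DDL below. This is my paraphrase based on the table names mentioned in this series, not the station's actual schema, so treat the exact columns as assumptions.

import sqlite3

# Paraphrased schema: one row per sensor, one table for accepted readings,
# and a quarantine table for readings the QA/QC algorithm flags.
SCHEMA = """
CREATE TABLE sensor (
    id INTEGER PRIMARY KEY,
    make_model TEXT,
    gpio_pin INTEGER
);

CREATE TABLE measurement (
    id INTEGER PRIMARY KEY,
    sensor_id INTEGER REFERENCES sensor(id),
    measurement_time TEXT,
    measurement_type TEXT,
    measurement_value REAL
);

CREATE TABLE measurement_to_qaqc (
    id INTEGER PRIMARY KEY,
    sensor_id INTEGER REFERENCES sensor(id),
    measurement_time TEXT,
    measurement_type TEXT,
    measurement_value REAL,
    flag_reason TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
print("sensor database schema created")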
The Python script is where all the magic happens. Its main purpose is to QA/QC the data it reads from the sensors, essentially acting as a guard so bad data points do not get into the weather station’s primary database.
The first iteration of the QA/QC algorithm is quite simple and, to a degree, a bit crude. Essentially all it does is ensure that the data are within acceptable ranges. For example, the script flags and removes relative humidity readings of 3,285%, as well as temperature readings of -145°C.
I haven’t begun coding version 2 of the QA/QC algorithm, but it will look at trends in the data and eliminate unrealistic spikes. For example, if the sensors show the temperature rising 20°C over the course of a couple of minutes, the algorithm will flag and remove those readings. The current first version would not catch that: you could in theory have temperature readings of 10°C and 30°C within a minute of each other, and because both values fall inside the acceptable range, neither would be flagged.
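A version 2 check along those lines might look something like the sketch below. The 10°C-per-5-minute threshold is purely illustrative; the real cutoff hasn't been chosen yet.

from datetime import datetime

MAX_DELTA_C = 10.0        # largest believable temperature change...
MAX_WINDOW_SECONDS = 300  # ...over this many seconds

def spike_flagged(prev_time, prev_temp_c, curr_time, curr_temp_c):
    """Return True if the jump between consecutive readings is unrealistic."""
    elapsed = (curr_time - prev_time).total_seconds()
    if elapsed <= 0 or elapsed > MAX_WINDOW_SECONDS:
        return False  # readings aren't close enough together to judge
    return abs(curr_temp_c - prev_temp_c) > MAX_DELTA_C

# 20°C in two minutes: both values are "acceptable", but the jump is not.
print(spike_flagged(datetime(2020, 1, 2, 9, 14), 10.0,
                    datetime(2020, 1, 2, 9, 16), 30.0))  # True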
Within the Python code, there is a class for each type (make/model) of sensor. Each object is instantiated with the GPIO pin on the data logger to which the sensor is connected, as well as the Sensor ID, which is the primary key in the “sensor” table in the database described above.
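As a hedged sketch, the structure described above might look like the following. The class name, the read() stub, and the sample values are mine; the real classes talk to the actual hardware through the Pi's GPIO and I2C libraries.

class TemperatureSensor:
    """Illustrative stand-in for one of the per-make/model sensor classes."""

    def __init__(self, gpio_pin, sensor_id):
        self.gpio_pin = gpio_pin    # data logger pin the sensor is wired to
        self.sensor_id = sensor_id  # primary key in the "sensor" table

    def read(self):
        """Return a raw temperature reading in °C (stubbed for illustration)."""
        return 21.5  # real code would query the hardware on self.gpio_pin

# Usage: give the object its wiring and its database identity, then read it.
outdoor_temp = TemperatureSensor(gpio_pin=4, sensor_id=3)
print(outdoor_temp.sensor_id, outdoor_temp.read())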
The sensor classes also perform the following:
The primary script in the data logger Python module performs the following steps to log the sensor data:
Well, that just about wraps up the sensor and data logger portion of the DIY weather station project. I certainly had a blast designing and building it, and I hope you enjoyed reading about it. Until next time.
The post DIY Weather Station, Part 4: Database Configuration and Programming the Sensor Readings appeared first on Matthew Gove Blog.
]]>The post How to Set Up a Website and Database to Support Multiple Languages appeared first on Matthew Gove Blog.
]]>Today, I’m going to be giving you a quick tutorial on how to support multiple languages on your website using a relational database, regardless of whether you’re supporting 2 languages or 200, in a manner that easily scales up and down. There are other ways besides databases to support multiple languages, but databases are the easiest way to keep everything organized.
I don’t really advertise it, but my professional portfolio website is actually bilingual and is available in both English and French. To change languages, click on the link at the very bottom of the footer that says “Français” or “English”. I’m hoping to add Spanish as well once my Spanish language skills improve to that level. I think using online translators is poor form for a professional website, as they often struggle with grammar and verb conjugations, and your business really shouldn’t be offering services in a language without having an employee who can speak the language fluently.
Setting up a database schema that both supports multiple languages and scales easily is the key to this tutorial. If it’s not done properly, you will get one or the other, but not both. In its most basic form, a database table that supported only English would look something like this:
id | phrase |
---|---|
3 | Hello World! |
I know what you’re probably thinking right now. It must be simple to add support for a second language. We could simply add another column to the database table, like this:
id | phrase_en | phrase_fr |
---|---|---|
3 | Hello World! | Bonjour le monde! |
Unfortunately, as simple as this seems, it is not the correct way to do it. I will admit I made this exact mistake when I first started adding French support to my website. As soon as you start running queries, you realize why it doesn’t work: you either have to select every language column just to keep the query simple, or write a more complex query that figures out which column to select.
Then there’s the issue of scalability. This may seem like a perfectly valid solution for supporting 2 or 3 languages, but what would it look like if you were supporting 200? Every time you wanted to add or remove a language, you would need to add or drop a column in every table in the database. Calling that a tedious job is an understatement at best.
The International Organization for Standardization (ISO) maintains 2-letter codes for the world’s common languages under the ISO 639 standard. Examples include “en” for English, “fr” for French, “es” for Spanish, “it” for Italian, and “ar” for Arabic. In the real world, you probably want your own foreign keys and a “language” table, but for this example I’ll use the ISO codes directly to keep things easy to understand. The correct way to set up your database table for multiple languages is:
id | language | phrase |
---|---|---|
3 | en | Hello World! |
4 | fr | Bonjour le monde! |
In addition to being properly normalized, this table makes queries much easier. If you want the English phrase, just add “WHERE language = ‘en'” to your query; if you want the French phrase, add “WHERE language = ‘fr'”. It also scales up and down nicely: adding or removing support for a language is just a matter of adding or deleting records (rows). That means I can add several languages at once with a single INSERT statement (see the sketch after the table below), producing something like:
id | language | phrase |
---|---|---|
3 | en | Hello World! |
4 | fr | Bonjour le monde! |
5 | es | ¡Hola, mundo! |
6 | it | Ciao mondo! |
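Here is that idea end to end as a runnable sketch. I'm using SQLite so it is self-contained; the same layout works in MySQL or PostgreSQL, and in production you would add the language table and foreign keys mentioned above.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE phrase (
        id INTEGER PRIMARY KEY,
        language TEXT NOT NULL,  -- ISO 639-1 code for this example
        phrase TEXT NOT NULL
    )
""")

# Adding a language is just adding rows: one INSERT, no schema changes.
cur.executemany("INSERT INTO phrase (language, phrase) VALUES (?, ?)", [
    ("en", "Hello World!"),
    ("fr", "Bonjour le monde!"),
    ("es", "¡Hola, mundo!"),
    ("it", "Ciao mondo!"),
])

# Fetching the visitor's language is a simple WHERE clause.
cur.execute("SELECT phrase FROM phrase WHERE language = ?", ("fr",))
print(cur.fetchone()[0])  # Bonjour le monde!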
Finally, when creating a database table that supports multiple languages, you must encode the table as UTF-8, or else accented characters will not display correctly. To ensure everything displays properly, I usually encode the string into UTF-8 a second time with the programming language I’m using to extract the data from the database. I know that both PHP and Python have that functionality. UTF-8 encoding will also work for languages that do not use the Latin alphabet, such as Greek, Russian, and Chinese.
id | language | phrase |
---|---|---|
3 | en | Hello World! |
4 | fr | Bonjour le monde! |
5 | es | ¡Hola, mundo! |
6 | it | Ciao mondo! |
7 | el | Γειά σου Κόσμε! |
8 | ru | Привет мир! |
9 | ar | !مرحبا بالعالم |
10 | zh | 你好,世界 |
11 | th | สวัสดีชาวโลก |
See how nicely that scales when you have more than just 2 or 3 languages? If you write your back-end code correctly, you shouldn’t need to change it at all when you add or remove support for a language. Hopefully, adding support for multiple languages to your website will allow you to expand your business into places you never could have imagined before.
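Before wrapping up, here is a tiny sanity check on the UTF-8 point above: non-Latin phrases survive an encode/decode round trip in Python without any trouble. How you set the character set on the table and connection depends on your database; utf8mb4 is the usual choice in MySQL.

phrases = {
    "el": "Γειά σου Κόσμε!",
    "ru": "Привет мир!",
    "zh": "你好，世界",
    "th": "สวัสดีชาวโลก",
}

for code, phrase in phrases.items():
    raw = phrase.encode("utf-8")          # bytes as stored or transmitted
    assert raw.decode("utf-8") == phrase  # round-trips unchanged
    print(code, len(raw), phrase)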
The post How to Set Up a Website and Database to Support Multiple Languages appeared first on Matthew Gove Blog.
]]>The post Rebrandings and Fresh Starts appeared first on Matthew Gove Blog.
]]>This past year has been a strange one for my career. While I accomplished many of the goals I had set for myself, I also suffered some significant setbacks. Those setbacks were largely out of my control, but this rebranding gives me much better control over what happens next, will ultimately let me accomplish more goals, and leaves me much better prepared for future setbacks and bumps in the road.
This blog is the primary addition to my new brand. While I have had a blog for over 10 years now, it has for the most part been just an afterthought, and unfortunately, has been rather neglected for much of the past several years. However, that all changes now. My intention is for the blog to serve as the link that unites both websites, and much of the rest of my online presence. Older projects that relied solely on photos and videos will now be a much more significant part of the blog. My goal is to be able to balance the knowledge I share with you between my personal and professional adventures, including web development, GIS, data solutions, photography, and travel, but we’ll see just how well I can pull that off. I am also hoping to be able to link certain aspects of my personal and professional life together in ways I was unable to before, such as integrating GIS technologies with my photography and travel adventures.
In addition, both of my websites have been updated to more accurately reflect my personality, my skillsets, and my goals. Between this blog, my two websites, my new bio/about me page, and my various social media accounts, each piece of my online presence now serves a specific role in my overall brand. Together, they better represent who I am and what my passions and skillsets are, and they allow me to pass my knowledge on to you much more effectively.
To new beginnings and fresh starts. Happy adventuring.
All Rebranding links:
Matthew Gove Web Development (Web Development/GIS Portfolio)
Matt Gove Photo (Photography and Travel Adventures)
New Blog (this site)
About Me/Bio Page
The post Rebrandings and Fresh Starts appeared first on Matthew Gove Blog.
]]>