Travel Bloggers: How to Stand Out with Powerful Interactive Maps
Regardless of what industry you’re in, a good interactive map is easy to understand, offers an intuitive user experience, and draws the user’s focus to the data on the map. One of the best examples of an interactive map is the RadarScope application, which we covered in detail last month in 6 Powerful Weather Applications for Stunning Landscape Photography. If you’re not familiar with it, RadarScope is an application that plots weather radar data, severe weather warnings, and much more on a map.
So how exactly does RadarScope do it so well? When I look at the maps in the screenshots above, I make a few key observations about what makes it such a powerful interactive map, even without having access to its interactivity.
Thankfully, there are plenty of tools and applications available today to bring a similar mapping experience to your travel blog. Best of all, many are free and open source, so you don’t have to spend your hard-earned cash on expensive licensing fees. However, before we look at solutions, let’s have a look at the problem.
In order to fully understand the problem so many travel bloggers run into, let’s first look at the definition of interactive.
Interactive: allowing a two-way flow of information between a computer and a computer-user; responding to a user’s input.
Oxford Dictionary
Armed with that definition, have a look at what you’ll find on far too many travel blogs. From the home page, you click on a link or button to the interactive map. You see a simple map that looks something like this screenshot. Some of the countries may be shaded to indicate that the blogger has traveled there.
If you hover over a country, it’ll often show the country’s name and maybe a count of the number of blog posts, photos, or videos the travel blogger has created. However, if you click on a country, you’ll just be brought to another page with a list of post titles. In the best case scenario, you’ll also see a featured image and the first 20-40 words of the post, much like our All Posts page.
However, this still leaves me asking one question. If you have a map whose only purpose is to redirect visitors off of said map, why even have the map at all?
First and foremost, if you’re going to add a feature to your blog or website, it should serve more of a purpose than just redirecting visitors off of it. Your visitors should be able to gather all of the information they need without leaving the map. There is one exception, however. If you’re trying to display lengthy content, such as an entire blog post, don’t try to put the entire post in a pop-up window on the map. Nobody in their right mind is going to scroll through all of that.
Instead, you want to include key details and a link to the full blog post to make a fully interactive map. For a blog post, you’ll want to include at least four things.
To best demonstrate a fully interactive map, let’s have a look at the Matt Gove Photo Visual Media Map. Instead of blog posts, the map includes datasets to display all of our photos and videos on a map. Notice how it includes all of these requirements for full interactivity. As a user, you can easily explore our photos and videos on a map without having to leave the map. In addition, you can click on the link in the pop-up window to view the full album or video.
As recently as 10 years ago, getting professional-quality interactive maps meant shelling out hundreds, if not thousands of dollars every year in GIS software licensing fees. Even worse, there were very few online GIS programs to choose from back then.
Thankfully, that has all changed. Today, there are plenty of free online mapping programs available. Many of these programs are both incredibly powerful and easy to set up. While it’s certainly not a requirement, I highly recommend investing in a developer to install, connect, and integrate the maps on your website. They’ll be able to connect your maps directly to your database. As a result, new content will automatically be added to your maps, allowing them to effortlessly scale with your business. It’ll cost more up front, but with a good developer, you’ll save money in the long run.
We’d love to help you get set up with your maps. If you want to discuss your project further or get a free quote to add maps to your travel blog, please get in touch with us today. Or if you’re still just kicking tires, please feel free to browse our catalog of GIS products and services.
If you’re just starting out or are looking for a simple solution, Google or Bing Maps are great options! Both platforms allow you to create high-quality maps and embed them on your website free of charge. And best of all, you don’t need a developer. Instead, you’ll just copy and paste a short block of code into your website or blog.
Unfortunately, the simplicity of both Google and Bing Maps leaves them with some downsides. Most notably, if you have a dynamic dataset or high traffic volumes, you’ll run into issues. Neither platform is built to display large datasets. As a result, you’ll struggle to scale your application and will run into high API fees to generate the basemaps as your organization grows.
Mapbox is a direct competitor of both Google and Bing Maps, but it offers so much more. As a result, I typically recommend that my clients use Mapbox over Google or Bing. Mapbox gives you much finer control over your mapping application, allowing you to scale it both up and down as your business evolves. And best of all, it’s completely free to use, unless you have more than 50,000 map loads per month.
However, my favorite Mapbox feature is its use of vector tiles, which allows you to display huge amounts (read: gigabytes) of data that load and respond extremely fast.
Leaflet is a simple, yet powerful open source JavaScript library that displays two-dimensional maps on your website or blog. Because it’s hosted on your server, you don’t need to worry about API fees, regardless of how much or how little traffic you get. It’s lightweight, fast, powerful, and completely customizable. Furthermore, it has an extensive library of plugins and extensions if you need additional functionality. If you don’t know JavaScript, you’ll need a developer for the initial set up of your maps. You can easily connect Leaflet to nearly any type of database or data repository. As a result, Leaflet maps will easily scale with your business once it’s set up.
Thankfully, Leaflet requires very little maintenance once you get it up and running. In fact, I tell my clients that it’s often more cost-effective to pay for Leaflet maintenance as you need it instead of paying a recurring monthly maintenance fee. Yes, there are obviously exceptions to that rule. However, for the vast majority of people, Leaflet is an extremely cost-effective way to add high-quality maps to your website or blog.
Like Leaflet, Cesium is an open source JavaScript library that creates powerful 3D maps to “unleash the power of 3D data”. Their maps are engineered to maximize performance, precision, and experience. With Cesium, it often feels like there is no limit when it comes to what you can do with a map. In fact, Cesium also includes a timeline, so you could make the argument that its maps are four dimensional instead of three.
Furthermore, they’ve even created their own 3D vector tile format that lets you load enormous datasets in seconds. For example, check out Cesium’s demo that loads 3D models of nearly every building in New York City. It’s fast, fluid, and responsive. For additional demos, have a look at Cesium’s Use Cases. You’ll find examples from many different industries, applications, and regions.
You can get an incredible amount of power and functionality out of Cesium’s free base functionality. For the average travel blog, the free functionality is probably more than you need. However, if you want to harness its full potential, you should at least look into some of the paid add-ons for Cesium. Those paid add-ons will streamline your workflow and optimize your data. As a result, your users will ultimately have a better experience.
If you’re trying to decide between Leaflet or Cesium, why not use both? Originally developed as an open source platform for the Australian Government, Terria lets your users choose whether they want to view a two- or three-dimensional map. And you can probably see where this is going. Leaflet powers Terria’s two-dimensional maps, while Cesium is behind its three-dimensional maps.
The best feature of Terria, however, is its user interface. Easily organize and browse through a huge number of datasets. It uses Cesium’s 3D data optimization to ensure your map remains fast and responsive, even if your datasets are massive. Use Terria’s slider to compare datasets side-by-side. It even includes a feature for you to build stories with your data and share them with your audience.
I use Terria for all of my mapping needs, and also recommend it to most of my clients. Its power and responsiveness across all devices, including mobile phones, coupled with its flexibility and minimal programming required to set it up make it the optimal platform for me. My users and clients have never complained about being confused using Terria, and are often impressed at how easy it is to analyze huge amounts of data. And best of all, I can set it up so it scales up and down as I need with hardly any maintenance.
If you want your travel blog to stand out from the rest, adding fully interactive maps with Terria is one of the easiest and most cost-effective ways to do so. To learn more or get started, please get in touch with us or browse our online resources.
If you have a complex dataset, but would prefer not to hire a developer, ESRI’s ArcGIS Online may be the best solution for you. Yes, it does have licensing fees, but you’ll get much of the functionality of the other applications we’ve discussed without needing a developer to set them up for you. Like the other platforms, ArcGIS Online can easily handle large numbers of complex datasets, and plot those data on maps that need just a copy and paste to embed in your website. Plus, ESRI is widely considered to be the industry standard for anything related to GIS and maps. If anything goes wrong for you, they have excellent documentation and support.
If you’re looking for a real-world example of ArcGIS Online in action, you’ve probably seen them already. Since the COVID-19 pandemic began, most dashboards that display maps of COVID-19 data use ArcGIS Online.
Platform | Free | API Fees | Dimensions | Developer | Dynamic Data |
---|---|---|---|---|---|
Google/Bing Maps | Yes | Optional | 2D Only | Not Required | No |
Mapbox | Yes | > 50K loads/mo | 2D Only | Optional | Yes |
Leaflet | Yes | Not Required | 2D Only | Required | Yes |
Cesium | Yes | Not Required | 3D Only | Required | Yes |
Terria | Yes | Not Required | 2D & 3D | Required | Yes |
ArcGIS Online | No | N/A | 2D Only | Not Required | Yes |
Once you have your new interactive map set up on your website or blog, simply adding a link to it is not enough. In addition, you should strategically embed them on different pages of your website to give your users the most immersive experience. For example, you’ll find the Matt Gove Photo Visual Media map embedded both on the home page of this blog and the main photo gallery page on the Matt Gove Photo website, in addition to being embedded above. I’ll continue to add maps as we go forward, too.
To figure out where to embed your interactive maps, have a look at your website’s analytics. Are there pages that have a lot of page views? Is there a specific page people are navigating to that the map could benefit? Are your visitors getting confused and leaving your website instead of navigating somewhere the map could help? Is there a logical place for your maps in your navigation or sales funnel?
Finally, as a travel blogger, you shouldn’t plot just your blog posts on the interactive map. Geotag photo albums, videos, social media posts, guides, fun activities, scenic drives, and much more. Don’t be afraid to make multiple maps, either. Try to use a platform like Terria or ArcGIS Online that organizes your datasets in a logical manner and makes it easy to both add and remove data from the map. If that’s not an option, don’t overwhelm your users with too much data on a single map. That’s one of the best ways to drive visitors off of your website and directly into the arms of your competitors.
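If you’re comfortable with a little scripting, you can even generate those geotagged layers yourself. Below is a minimal sketch (the posts, URLs, and coordinates are hypothetical placeholders, not content from this blog) that converts a list of geotagged blog posts into a GeoJSON file, a format that Leaflet, Mapbox, Terria, and most other mapping platforms can load directly.
import json

# Hypothetical list of geotagged content; swap in your own posts, photos, or videos.
posts = [
    {"title": "Chapman's Peak Drive", "url": "https://example.com/chapmans-peak", "lat": -34.106, "lon": 18.358},
    {"title": "Death Valley Road Trip", "url": "https://example.com/death-valley", "lat": 36.505, "lon": -117.079},
]

# Build a GeoJSON FeatureCollection with one point feature per post.
features = []
for post in posts:
    features.append({
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [post["lon"], post["lat"]]},
        "properties": {"title": post["title"], "url": post["url"]},
    })

geojson = {"type": "FeatureCollection", "features": features}

# Write the file that your mapping platform will load.
with open("blog-posts.geojson", "w") as ofile:
    json.dump(geojson, ofile)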
Fast, professional-quality interactive maps are one of the best ways travel bloggers can stand out from the crowd. Interactive maps are easy and cost-effective to implement and maintain. They’re also incredibly effective at retaining your visitors’ engagement and keeping them on your website. It boggles my mind why so many travel bloggers haven’t taken full advantage of the incredible potential interactive maps present to both grow their audience and keep their existing followers coming back for more.
Are you ready to take the next steps with interactive maps and bring your website or travel blog to the next level? As avid travelers and data science experts who specialize in online GIS and mapping applications, we’d love to help you take that next step in your journey. I invite you to please browse our catalog of GIS and mapping services. Then, get in touch or book a free info session with us to discuss your specific project. We can’t wait to hear from you.
Top Photo: Chapman’s Peak Drive on the Matt Gove Photo Scenic Drives Map
Cape Town, Western Cape, South Africa
Does Your Website Make These 10 Mistakes with Hero Images?
When used correctly, a hero image is a great way to make a positive first impression that instantly builds credibility and trust for your brand. Given the popularity of hero images, it’s no surprise that many businesses and organizations that use them often feel like they could be getting more from them. Unfortunately, when you’re dealing with graphics and images, all it takes is one minuscule misstep to send your audience running for the exits.
They say a picture tells a thousand words. That’s especially true with hero images. In fact, the less text that accompanies them, the better. However, keep in mind that reducing the amount of text shifts even more of the burden to your hero image. As a result, it puts even more pressure on you to ensure everything is perfect.
Your hero image should tell your story. Without even reading the text, your audience should have a pretty good idea of as many of the following as possible.
You can find some spectacularly terrible examples of web design from just a quick Google Image search. In this screenshot, can you figure out what this company does without reading any of the text?
Put aside the font and color choices for a sec. A grainy image of a couple puffy clouds tells us nothing about the company! At first glance, you’d have no idea the website was about horses unless you read the text. What makes it even worse is that after reading the first two lines of text, you still have no idea what they do. It’s not until you get to the third line that they reveal that they sell horses.
So how do they make it better? First and foremost, the hero image should have an image of one of their horses. Then add a little personality. If they’re selling show horses, put a picture of one of their horses at a show. Selling to a summer camp? How about a picture of a kid on a horse actively engaging with an instructor? Anything is better than the clouds.
I’ll be the first to admit, I have been guilty of this in the past. Without a call to action, your audience has reached the end of the road. And it’s often a dead end road. With no clear indication of where to go, a small fraction of your audience will poke around your navigation menu. A few more will turn around and back up. But the vast majority of visitors will simply hit the red “X” in the corner and leave. The lack of a clear call to action is the leading cause of prospects exiting your sales funnel.
We can actually use one of my own websites to demonstrate the effect of the lack of a call to action. The Matt Gove Photo site uses a large hero image on the home page with links to my most recent adventures. Because the site is focused on travel and outdoor adventure, the heading and subheading reference the specific adventure and the state or country in which it’s located. Underneath, you’ll find the call to action: links where you can view photos, blog posts, videos, and more. Now, how would you react if you landed on the site and those calls to action had suddenly disappeared?
When you look at that hero image, you really want to see the rest of the photos. But without a call to action, you have nowhere to begin. You’d have to go searching through the whole photo gallery to find them. I don’t know about you, but I’m far too lazy for that. I’d have a quick scroll through the home page and then probably leave.
Thankfully, that example is purely hypothetical. If you visit the site, rest assured that the calls to action are all still there.
Amazing how such a small detail can make such a big difference, isn’t it?
So what should you do if you can’t think of a good call to action? Or maybe there actually is no logical path to your next step? When in doubt, give these a try. Your goal here is to keep your audience engaged, not make a sale.
On the flip side, it’s easy to get caught up and include too many calls to action. In an ideal world, the clearest call to action you can make is to only have one. Having two is okay, especially if one is a “Learn More” link. However, three is pushing it in many circumstances, unless there is a clear and logical reason for it. On the Matt Gove Photo home page, that’s the case. The photos are clearly divided into three parts.
Under no circumstance should you have more than three calls to action associated with your hero image. Even with three, you risk overwhelming and confusing your audience with too many choices. Unfortunately, when you have too many choices, the one you make most often is simply to leave.
Let’s back up in time for a sec. We’ll go back to 2013. At the time, I had little experience when it came to web design and web development. Not surprisingly, I tried to cram way too much into the home page. To say it overwhelmed you with choices is an understatement. Not to mention I needed a few lessons in color theory.
Back in the present day, I cringe big time looking at that. You should too. But we must learn from our mistakes and experiences. My how things have changed since then.
As we discussed at the beginning, your hero image should do most of the heavy lifting for getting your message across. Keep your text to a bare minimum. It should consist of no more than:
There is one exception to this rule where it’s okay to write more than a couple of short sentences: coming soon or pre-launch pages. The reason why? You need to be able to describe both what is coming soon as well as the benefits your audience will get once it launches. If you can do it in only one sentence, more power to you. But for most of us, it takes a short paragraph.
If you’ve ever tried to overlay text over any photo of the outdoors, you’ve likely run into this issue. No matter which color you choose for the text, there’s part of the image where you can’t read it. Your first thought may be to make the text multiple colors, but that never looks professional.
Let me let you in on an industry secret. Okay, it’s not really a secret, as just looking at the “Videos Coming Soon” page in the screenshot above gives it away. If you put a semi-transparent overlay on top of the image, the overlay mutes the effect of the high contrast and lets you easily read the text without having to change colors or squint at it from a weird angle. You want to find the perfect balance where the text is easy to read, but you can still clearly see what the image is behind the overlay.
For comparison, here’s the same “Videos Coming Soon” page with the semi-transparent overlay removed. Quite a difference, isn’t it?
There are two ways you can go wrong when it comes to size. First, the resolution of your photo may be smaller than the resolution of your screen. For making a professional, trustworthy first impression, it’s a complete disaster if that happens. Not only does such a mistake make you look like an amateur, it also looks like you just don’t care.
Now, I intentionally shrunk the image in a development environment to generate that screenshot. However, if you are dealing with very high-resolution screens (larger than 4K), you may run into an issue where your large hero image starts to adversely affect your page performance. There are a couple ways around it, both of which I employ on the Matt Gove Photo home page.
One of them is to use the background-size: cover CSS property to ensure that both dimensions of the hero image remain greater than or equal to the dimensions of the screen. Be aware that this can make your hero image grainy if its resolution is not optimized for larger screens.
Second, your hero image may not be taking up enough real estate on the page. In that case, you can easily argue that it’s no longer a hero image, but that’s a discussion for another day. Your hero image should take up the entire screen regardless of its size, orientation, and resolution. If your viewer’s eye is not immediately drawn to it, you’re doing it wrong.
There is one scenario where it’s perfectly okay to shrink your hero image: to tease what’s below the fold (the content you have to scroll down to see). If you have a look the next time you buy something online, you’ll find many businesses and e-commerce sites apply this strategy. I use it on my business’ website, too.
Your hero image should be one you would frame to hang up in your home or office. People should be oohing and aahing over it. Don’t use crappy images. Ever. If your photography skills aren’t up to snuff, you should either license a photo or hire a professional photographer to take and/or process your photos for you.
When your audience logs onto your website or application, they expect it to load. Fast. If your site takes more than a few seconds to load, you can kiss your audience goodbye. They won’t wait around for it to load. And losing your audience isn’t your only worry. Search engines will punish your website if they detect it’s unnecessarily slow.
Unfortunately, we’re barreling right towards a Catch 22. Large, and often bloated, images are the #1 cause of slow websites. So how do you maintain that lightning fast load time while at the same time being able to use beautiful, high-quality hero images?
Hero images were designed to be used one at a time, and one per page. If one hero image is bogging down your website, imagine what several will do. In addition to the performance issues, you risk overwhelming your audience with choices if you use multiple hero images on the same page. And we all know what happens when you do that.
In addition, sliders and carousels stopped being popular in 2010. Don’t use them. Search engines have a brutally difficult time crawling them, which can have a profoundly negative effect on your search engine optimization. Most SEO and conversion experts agree that they have little use 99% of the time. In addition to bogging down your site, the statistics just don’t justify their use anymore.
Slide | Share of Clicks |
---|---|
First Slide | 90% |
Second Slide | 8.8% |
Third Slide and Above | 1.7 to 2.3% |
If you feel you’ve done everything right with your hero image and still aren’t getting conversions, it may mean that hero images aren’t for you. There’s nothing wrong with that. Maybe you have home page content that is constantly being updated. Have a look at any news site out there. None of them use hero images. The same goes for certain e-commerce businesses. Amazon, Walmart, Home Depot, and Best Buy don’t use them, either.
If you don’t feel hero images are right for you, don’t use them. Yes, they’re all the hype right now. And yes, they can be absolutely gorgeous. But they’re not for everyone. You’re the only one who can make the decision as to what’s best for you.
Well, that’s about enough butchering of my own websites as I can take. When used correctly, hero images can convert at an incredible rate and boost your credibility to levels you didn’t think possible. Unfortunately, that’s an incredibly difficult needle to thread. Hero images are astonishingly easy to screw up. I’ve been building websites since 2008 and I still find new ways to make mistakes.
However, we must continue to learn from our mistakes. Use analytics to your advantage. They’ll tell you why your hero image is not converting. Please reach out to us if you need any help. With our expertise in data science, web development, and graphic design, we’ll help you process your analytics and make sure that your hero image becomes a magnet for leads. The worst thing you can do is let it frustrate you.
Top Photo: A Desolate Road to Nowhere
Death Valley National Park, California – February, 2020
COVID-19 Dashboard Upgrades: 3 Phases That Will Help You Make Better Decisions
Ever since the map launched back in April, I have not been fully satisfied with the limitations on the available data, as well as the lack of functionality on the map for the COVID-19 data by state and province. Once it became clear that the fall/winter wave in the United States was going to be really bad, it was time to upgrade the dashboard. The last thing I want is for people to get sick because of a lack of data and functionality on the COVID-19 dashboard and map.
The first order of business is to greatly expand the available data coverage within each country on the map. In addition to the map already displaying data by state/province for the United States and Canada, we have added support for Australia and Mexico as well. We will add even more countries in Phase 3 to bring the total to seventeen.
Next, we expanded the data fields that can be plotted on the map. We started with the old set of parameters.
We added the following parameters.
With the pandemic raging so badly out of control in the United States, I wanted an easy way to assess general risk for normal day-to-day activities in public. These activities could include running errands, exercising, going to restaurants, and much more. Please be aware that I still consider the index to be in a “Beta” phase, and it will likely receive minor tweaks over the next few weeks.
The index is a weighted average comprised of the number of active cases, the odds any one person you cross paths with is infected, the daily new cases, and the 14-day trend in cases. We account for population by evaluating these parameters per capita. As a result, the index can be evaluated at the country, state/province, or county level. You can easily compare countries to provinces, states to counties, and more.
Index Value | Description |
---|---|
Less than 5 | Virus is Under Control |
5 to 10 | Low or Marginal Risk |
11 to 20 | Enhanced Risk |
21 to 30 | Medium or Moderate Risk |
31 to 40 | Elevated Risk |
41 to 50 | High Risk |
51 to 60 | Critical Risk |
61 to 75 | Extreme Risk |
Greater than 75 | Catastrophic Risk |
Any value greater than 40 is considered a Particularly Dangerous Situation, and your interactions with the public should be kept to a minimum.
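The exact weights behind the index aren’t published here, so the sketch below is only meant to illustrate the structure of such an index rather than reproduce it: each ingredient is normalized per capita, then blended with made-up example weights into a single number that can be read against a scale like the table above.
def risk_index(active_cases, new_cases_today, trend_14day_pct, population):
    """Toy weighted risk index; the weights are illustrative, not the real calibration."""
    per_100k = 100000.0 / population

    active_per_100k = active_cases * per_100k
    new_per_100k = new_cases_today * per_100k
    odds_infected = active_cases / population  # odds any one person you cross paths with is infected

    # Made-up example weights; the production index uses its own calibration.
    return (
        0.10 * active_per_100k
        + 0.10 * new_per_100k
        + 0.05 * (odds_infected * 10000)
        + 0.05 * trend_14day_pct
    )

# Example: 1,000,000 people, 3,000 active cases, 250 new cases today, cases up 20% over 14 days
print(round(risk_index(3000, 250, 20, 1000000), 1))  # 35.0 with these toy weights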
Adding the time series charts to all datasets on the map and streamlining the process of selecting which parameter to display have been a top priority since April. The COVID-19 dashboard map now includes these features for all datasets.
In addition to being able to plot data by country or state/province, we also wanted to greatly expand the available datasets. The data catalog now includes the following. Unless otherwise noted, there is currently support for Australia, Canada, Mexico, and the United States. We will be implementing support for additional countries in Phase 3.
Phase 1 launched on Friday, 18 December, 2020.
While many of the new features are on the map, the dashboard is getting a few updates as well. They may not be as significant as the map updates, but they’ll make an impact.
The database rebuild allowed us to expand the countries that have data broken down by state from 3 to 17. The Plot by State tab will be updated to reflect those changes. In addition to states, you’ll also be able to plot territories’ COVID-19 data.
Additionally, the x-axis on the charts will default to showing the calendar date instead of the number of days since the 100th confirmed case. Like the other settings, there will be a menu to select which parameter you want to display on the x-axis.
Our COVID-19 model has gotten several minor updates over the past few weeks. A lot of things have changed since the spring, so the model has been updated to better reflect them. In addition, model output ranges have been refined to much more accurately and realistically show projected outcomes.
Since we added many more states to the database, I will be including additional states in each model run as well. I may include territories at a future date, but I currently have no plans to because case loads in the territories are so low. The states and provinces added to the model in this update include:
Additional states and provinces will be added to the model runs as case counts dictate.
I hope to launch Phase 2 by 31 December, 2020.
Since we’ve expanded our dataset to now include state and provincial data for 17 countries, it would be foolish not to be able to plot those data on the map. The full list of countries spans 5 continents and by the end of Phase 3 will include the following.
Phase 3 will launch in early January, 2021.
As the COVID-19 pandemic continues to rage across the globe, access to complete, easy-to-interpret data and maps is critical to winning the fight against it. I hope these new updates to our COVID-19 dashboard go a long way towards accomplishing that goal. If there’s anything you feel is missing from the dashboard or map, please let me know in the comments below, and I will address them as soon as possible.
Links: Visit the COVID-19 Dashboard or the COVID-19 Map
Top Photo: The New Matt’s Risk Index Evaluated on a Map for All US Counties on 19 December, 2020
A 15-Minute Intro to Supercharging Your GIS Productivity with Python
It’s no secret that the future of big data is here. As your datasets get larger, it’s much more efficient to keep the data in a database instead of embedded in your geography files. Instead of manually copying the data between your GIS files and the database, why not let Python do the heavy lifting for you?
I am in the process of expanding my COVID-19 dashboard to be able to plot data by state for many more countries than just Australia, Canada, and the United States.
I am also expanding the US dataset to break it down as far as the county level. In order to do so, I had to add all 3,200-plus counties to my geodatabase. In the past, manually entering datasets of this scale has taken months. With Python, I completed the data entry of all 3,200-plus counties in less than 20 minutes. Using QGIS, the Python script completed the following steps.
#!/usr/bin/env python3
from qgis.core import QgsProject
import csv

# Step 1: Open the shapefile
qgis_instance = QgsProject.instance()
shp_counties = qgis_instance.mapLayersByName("US Counties")[0]
county_features = shp_counties.getFeatures()

# Initialize a list to store data that will be written to CSV
csv_output = [
    ["CountyID", "County Name", "State Name", "Population"]
]

# Initialize our unique ID that is used in Step 3
county_id = 1

# Step 2: In each row, identify the county, state, and population
for county in county_features:
    fips_code = county.attribute(0)
    county_name = county.attribute(1)
    state_name = county.attribute(2)
    population = county.attribute(3)

    # Define output CSV row
    csv_row = [county_id, county_name, state_name, population]

    # Add the row to the output CSV data
    csv_output.append(csv_row)

    # Increment the county_id unique identifier
    county_id += 1

# Step 4: Write the data to CSV
with open("county-data.csv", "w") as ofile:
    writer = csv.writer(ofile)
    writer.writerows(csv_output)
Have you ever been deep in a geospatial analysis and discovered that you needed to create a graph of the data? Whether you need to plot a time series, a bar chart, or any other kind of graph, Python comes to the rescue again.
Python’s matplotlib library is the gold standard for data analysis and figure creation. With matplotlib, you can generate publication-quality figures right from the Python console in your GIS program. Pretty slick, huh?
My COVID-19 model uses matplotlib to generate time series plots for each geographic entity being modeled. While I run the model out of a Jupyter Notebook, you could easily generate these plots from within a GIS program.
A matplotlib time series chart that my COVID-19 model generates.
In addition to matplotlib, the Python pandas library is another great library for both data science and GIS. Both libraries come with a broad and powerful toolbox for performing a statistical analysis on your dataset.
Since we’re still in the middle of the raging COVID-19 pandemic, let’s do a basic statistical analysis on some recent COVID-19 data: confirmed cases and deaths in the United States. Specifically, let’s have a look at new daily cases by state.
As a simple statistical analysis, let’s identify the following values and the corresponding states.
Even though I store these data in a database, and these values can easily be extracted with database queries, let’s assume that the data are embedded in a shapefile or CSV file that looks like this.
State | New Daily COVID-19 Cases |
---|---|
Pennsylvania | 10,247 |
Arkansas | 2,070 |
Wyoming | 435 |
Georgia | 5,320 |
California | 29,415 |
First, regardless of the order that the data in the table are in, we want to break the columns down into 2 parallel arrays. There are more advanced ways to sort the data, but those are beyond the scope of this tutorial.
import csv

# Initialize parallel arrays
states = []
new_cases = []

# Read in data from the CSV file
with open("covid19-new-cases-20201210.csv", "r") as covid_data:
    reader = csv.reader(covid_data)
    next(reader)  # Skip the header row
    for row in reader:
        # Extract State Name and Case Count
        state = row[0]
        # Strip thousands separators and convert the count to an integer
        new_case_count = int(row[1].replace(",", ""))

        # Add state name and new cases to parallel arrays.
        states.append(state)
        new_cases.append(new_case_count)
Now, let’s jump into the statistical analysis. The primary advantage of using parallel arrays is we can use the location of the case count data in the array to identify which state it comes from. As long as the state name and its corresponding case count are in the same location within the parallel arrays, it does not matter what order the CSV file presents the data.
First up in our statistical analysis is to identify the states with the most and fewest new daily COVID-19 cases. We’ll take advantage of Python’s built-in max() and min() functions. Variable names are consistent with the above block of code.
most_cases = max(new_cases)
fewest_cases = min(new_cases)
When you run this, you’ll find that most_cases = 29,415 and fewest_cases = 101. Now we need to determine which states those values correspond to. This is where the parallel arrays come in. Python’s index() method tells us where in the new_cases array the value is located. We’ll then reference the same location in the states array, which will give us the state name.
most_cases_index = new_cases.index(most_cases)
most_cases_state = states[most_cases_index]
fewest_cases_index = new_cases.index(fewest_cases)
fewest_cases_state = states[fewest_cases_index]
When you run this block of code, you’ll discover that California has the most new daily cases, while Hawaii has the fewest.
While Python does not have a built-in averaging or mean function, it’s an easy calculation to make. Simply add up the values you want to average and divide them by the number of values. Because the data is stored in arrays, we can simply sum the array and divide it by the length of the array. In this instance, the length of the array is 51: the 50 states, plus the District of Columbia. It’s a very tidy one line of code.
mean_new_daily_cases = sum(new_cases) / len(new_cases)
Rounding to the nearest whole number, the United States experienced an average of 4,390 new COVID-19 cases per state on 10 December.
Python does not have a built-in function to generate the median of a list of numbers, but thankfully, like the mean, it’s easy to calculate. First, we need to sort the array of new daily cases from smallest to largest. We’ll save this into a new array because we need to preserve the order of the original array and maintain the integrity of the parallel arrays.
sorted_new_cases = sorted(new_cases)
Once the values are sorted, the median is simply the value of the middle-most point in the array. Because we included the District of Columbia, there are 51 values in our array, so we can select the middle value (the item at index 25). If we only used the 50 states, we would need to average the two middle-most values. In Python, it looks like this. The double slash means that you round the division down to the nearest whole number.
middle_index = len(sorted_new_cases) // 2
median_new_cases = sorted_new_cases[middle_index]
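As a quick aside, here is what the even-length case mentioned above would look like. With an even number of values (say, just the 50 states), you average the two middle-most items instead of picking a single one. The slice below is just a hypothetical way to get an even-length list for the example.
sorted_50 = sorted(new_cases[:50])  # hypothetical even-length subset
upper_middle = len(sorted_50) // 2
median_even = (sorted_50[upper_middle - 1] + sorted_50[upper_middle]) / 2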
The median value returned is 2,431 new cases. Now we need to figure out which state that value belongs to. To do that, just do the same thing we did when calculating the max and min values. Look in the original new_cases array for the value and look in that same location in the states array.
median_cases_index = new_cases.index(median_new_cases)
median_state = states[median_cases_index]
On 10 December, the state with the median new daily cases was Connecticut.
To determine how many states are seeing more than 10K, 5K, and 3K new daily cases, we simply need to count how many values in the new_cases array are greater than those three values. Using 3,000 as an example, it can be coded as follows (the “gt” stands for “greater than”).
values_gt_3000 = []
for n in new_cases:
    if n > 3000:
        values_gt_3000.append(n)
Thankfully, Python gives us a shorthand way to code this so we don’t wind up with lots of nested layers in the code. The block of code above can be compressed into just a single line.
values_gt_3000 = [n for n in new_cases if n > 3000]
To get the number of states with more than 3,000 new daily cases, recall that the len() function tells us how many values are in an array. All you need to do is apply the len() function to the above array and you have your answer.
num_values_gt_3000 = len([n for n in new_cases if n > 3000])
num_values_gt_5000 = len([n for n in new_cases if n > 5000])
num_values_gt_10000 = len([n for n in new_cases if n > 10000])
When you run that code, here are the values for the 10 December dataset. You can determine which states these are by either looking at the map above or using the same techniques we used to extract the state in the max, min, and median calculations; a short sketch of that follows the table.
New Daily Case Cutoff | Number of States |
---|---|
> 10,000 | 4 |
> 5,000 | 13 |
> 3,000 | 24 |
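For completeness, here is the sketch mentioned above: instead of just counting the states over a cutoff, keep the index alongside each value and pull the matching names out of the parallel arrays.
# List the states (not just the count) with more than 3,000 new daily cases.
states_gt_3000 = [states[i] for i, n in enumerate(new_cases) if n > 3000]
print(states_gt_3000)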
As many of you know, I built my own COVID-19 model last spring. The model is written in Python and extensively uses the matplotlib library. The model predicts the number of cases both two weeks and one month out as well as the apex date of the pandemic. While I primarily focus on the 50 US states and 6 Canadian provinces, you can run the model for any country in the world, and every state in 17 countries.
While many models, whether they be weather models, COVID-19 models, or something else, make heavy use of maps, there’s more to modeling than just maps. Remember above when we generated publication-quality graphs and figures right from within your GIS application using matplotlib? You can do the same thing for your non-geospatial model outputs.
Another option is bokeh, a Python library that creates interactive plots for use in a web browser.
Have a shapefile or other geographic file that you need to make bulk edits on? Why not let Python do the heavy lifting for you? Let’s say you have a shapefile of every county in the United States. There are over 3,200 counties in the US, and I don’t know about you, but I have little to no interest in entering that much data manually.
In the attribute table of that shapefile, you have nothing but the Federal Information Processing Standards (FIPS) code for each county. FIPS codes are standardized unique identifiers the US Federal Government assigns to entities such as states and counties. You want to add the county names and the county’s state to the shapefile.
In addition, you also have a spreadsheet, in CSV format of all the FIPS codes, county names, and which state each county is located in. That table looks something like this.
FIPS Code | County Name | State Name |
---|---|---|
04013 | Maricopa | Arizona |
06037 | Los Angeles | California |
26163 | Wayne | Michigan |
12086 | Miami-Dade | Florida |
48201 | Harris | Texas |
13135 | Gwinnett | Georgia |
To add the data from this table into the shapefile, all you need is a few lines of Python code. In this example, the code must be run in the Python console in QGIS.
#!/usr/bin/env python3
from qgis.core import QgsProject
import csv

# Define QGIS Project Instance and Shapefile Layer
instance = QgsProject.instance()
counties_shp = instance.mapLayersByName("US Counties")[0]

# Load features (county polygons) in the shapefile
county_features = counties_shp.getFeatures()

# Define the columns in the shapefile that we will be referencing.
shp_fips_column = counties_shp.dataProvider().fieldNameIndex("FIPS")
shp_state_column = counties_shp.dataProvider().fieldNameIndex("State")
shp_county_column = counties_shp.dataProvider().fieldNameIndex("County")

# Start Editing the Shapefile
counties_shp.startEditing()

# Read in state and county name data from the CSV File
with open("fips-county.csv", "r") as fips_csv:
    csv_data = list(csv.reader(fips_csv))

# Loop through the rows in the shapefile
for county in county_features:
    shp_fips = county.attribute(0)

    # The Feature ID is internal to QGIS and is used to identify which row to add the data to.
    feature_id = county.id()

    # Loop through CSV to Find Match
    for row in csv_data:
        csv_fips = row[0]
        county_name = row[1]
        state_name = row[2]

        # If CSV FIPS matches the FIPS in the Shapefile, assign state and county name to the shapefile row.
        if csv_fips == shp_fips:
            counties_shp.changeAttributeValue(feature_id, shp_county_column, county_name)
            counties_shp.changeAttributeValue(feature_id, shp_state_column, state_name)

# Commit and save your changes to the shapefile. This also ends the edit session.
counties_shp.commitChanges()
We all hate doing monotonous busy work. Who doesn’t, right? Most metadata entry falls into that category. There is a surprising amount of metadata associated with GIS datasets.
In the above section, we edited the data inside of shapefiles and geodatabases with Python. It turns out that data is not the only thing you can edit with Python. Automating metadata entry and updates with Python is one of the easiest and most efficient ways to boost your GIS productivity and focus on the tasks you need to get done.
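As a rough illustration, and assuming you’re working in the QGIS 3 Python console where the QgsLayerMetadata class is available, bulk-updating a layer’s title and abstract might look something like this sketch. The layer name and field values are examples only; adapt them to your own project.
from qgis.core import QgsProject, QgsLayerMetadata

# Grab the layer whose metadata we want to update (layer name is an example).
layer = QgsProject.instance().mapLayersByName("US Counties")[0]

# Build the metadata record programmatically instead of typing it by hand.
metadata = QgsLayerMetadata()
metadata.setTitle("US Counties with Population Estimates")
metadata.setAbstract("County polygons joined to population data; metadata generated automatically with Python.")
metadata.setLanguage("en")

# Attach the metadata to the layer and save the project.
layer.setMetadata(metadata)
QgsProject.instance().write()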
In today’s era of big data, the future of GIS is in vector tiles. You’ve probably heard of vector images, which are comprised of points, lines, and polygons that are based on mathematical equations instead of pixels. Vector images are smaller and much more lightweight than traditional images. In the world of web development, vector images can drastically increase a website’s speed and decrease its load times.
Vector images can also be applied to both basemaps and layer geometries, such as state or country outlines. In the context of GIS, they’re called vector tiles, and are primarily used in online maps. Vector tiles are how Google Maps, Mapbox, and many other household mapping applications load so quickly.
Region mapping files simply use a unique identifier to map a feature in a vector tile to a row in a data table. The data in the table is then displayed on the map. Let’s look at an example using my COVID-19 Dashboard’s map. While I use my own unique identifiers to do the actual mapping, in this example, we’ll use the ISO country codes to map COVID-19 data in Europe. The data table is simply a CSV file that looks like this. For time series data, there would also be a column for the timestamp.
Country Code | Total Cases | Total Deaths | New Cases | New Deaths |
---|---|---|---|---|
de | 1,314,309 | 21,567 | 22,399 | 427 |
es | 1,730,575 | 47,624 | 6,561 | 196 |
fr | 2,405,210 | 57,671 | 11,930 | 402 |
gb | 1,814,395 | 63,603 | 17,085 | 413 |
it | 1,805,873 | 63,387 | 16,705 | 648 |
nl | 604,452 | 10,051 | 7,345 | 50 |
pt | 340,287 | 5,373 | 3,962 | 81 |
The region mapping file is a JSON (JavaScript Object Notation) file that identifies the valid entity ID’s in each vector tiles file. In this example, the entity ID is the Country Code. We can use Python to generate that file for all country codes in the world. I keep all of the entity ID’s in a database. Databases are beyond the scope of this tutorial, but for this example, I queried the country codes from the database and saved them in an array called country_codes.
import json

# Initialize an array to store the country codes
region_mapping_ids = []

# Add all country codes to the region mapping ID's
for country_code in country_codes:
    region_mapping_ids.append(country_code)

# Define the Region Mapping JSON
region_mapping_json = {
    "layer": "WorldCountries",
    "property": "CountryCode",
    "values": region_mapping_ids
}

# Write the Region Mapping JSON to a file.
with open("region_map-WorldCountries.json", "w") as ofile:
    json.dump(region_mapping_json, ofile)
In the Region Mapping JSON, the layer property identifies which vector tiles file to use. The property property identifies which attribute in the vector tiles is the unique identifier (in this case, the country code). Finally, the values property identifies all valid unique ID’s that can be mapped using that set of vector tiles.
When you put it all together, you get a lightweight map that loads very fast, as the map layers are in vector image format, and the data is in CSV format. Region mapping really shines when you have time series data, such as COVID-19. Here is the final result.
Python is an incredibly powerful tool to have in your GIS arsenal. It can boost your productivity, aid your analysis, and much more. Even though this tutorial barely scratches the surface of the incredible potential these two technologies have together, I hope this gives you some ideas to improve your own projects. Stay tuned for more.
Top Photo: Beautiful Sierra Nevada Geology at the Alabama Hills
Lone Pine, California – February, 2020
Digging Deeper: Diagnosing My Raspberry Pi Temperature Sensor’s “Hot Flashes”
Before I begin troubleshooting an issue, I always brainstorm a list of at least 2-3 things that I think may be causing the problem. In no particular order, some of the potential causes I came up with were:
Before diving into troubleshooting, I always see if there are any causes I can reasonably eliminate. In this case, a malfunctioning or defective sensor is highly unlikely to be causing the problem. The sensor is an MPL3115A2 I2C digital barometric, temperature, and altitude sensor. It has been in service on the Raspberry Pi since last May and has taken nearly 1.2 million measurements. Of those 1.2 million measurements, the QA/QC algorithm has flagged 590 of them. That works out to 0.04% of measurements being bad, or a 99.96% measurement accuracy rate.
Of the remaining two possibilities, I always start with whatever is easiest to troubleshoot. In this case, it would be a possible bug in the Python code. On the Raspberry Pi, Python interprets sensor readings and logs them in the database. Intermittent electrical problems are a nightmare to diagnose in the best of situations. The first thing to do is to look at the data and try to identify any patterns. If you’ve ever watched any detective shows on TV, you know what they always say: evidence doesn’t lie.
One pattern should jump out at you immediately. As air temperatures reach and drop below freezing (0°C), the sensor readings on the Raspberry Pi go haywire. It is also important to note that this does not rule out an electrical problem as the cause. To strengthen our case, let’s query the sensor database to see if the sensor recorded any temperatures below freezing. We want to see no instances of the sensor recording temperatures below freezing.
We can now say with 100% certainty that something happens when the temperature hits freezing. However, we still don’t know what. To fully rule out an electrical issue causing the problem, it’s time to get on Google and research how the sensor works.
According to the manufacturer, there are three pieces of information that will help us here.
To finally be able to rule out an electrical problem, let’s focus on item #2: the sensor not being able to handle signed (negative) values in its raw format. With any kind of digital sensor, if it gets a reading that’s off the scale, it loops around to the opposite end of the scale.
For example, if a sensor had a scale of 0 to 400 and got a reading of -1, it would output a value of 399. A reading of -5 would output 395. It works on the other end of the scale, too. A reading of 410 would output a value of 10. During my meteorology studies at the University of Oklahoma, we routinely observed this phenomenon while measuring wind speeds inside of tornadoes with doppler radar.
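That wrap-around behavior is just modular arithmetic, so it’s easy to sanity-check with a couple of lines of Python using the 0-to-400 example scale from the paragraph above.
SCALE = 400  # example scale from the paragraph above

def wrapped_reading(true_value):
    # Return what an unsigned sensor with this scale would report.
    return true_value % SCALE

print(wrapped_reading(-1))   # 399
print(wrapped_reading(-5))   # 395
print(wrapped_reading(410))  # 10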
Item #2 is such a key piece of the puzzle because the problem occurs as soon as temperatures drop below 0°C. With this information, we can firmly establish the bottom end of the temperature sensor’s scale. Now all we have to do is figure out what the top of that scale is. For that, we have to look at the Python code, in which I found the following equation:
raw_temperature = ((raw_temperature_msb * 256) + (raw_temperature_lsb & 0xF0)) / 16
In this line of code, Python converts the raw temperature reading to standard decimal. Python reads in the sensor’s raw hexadecimal readings byte-by-byte before converting them to Celsius and sending them to the Raspberry Pi. The integer part of each reading contains a maximum of two hexadecimal digits. For example, if this formula was used on a standard Base 10 number, such as 27, it would be read as:
temperature = (10 * 2) + (1 * 7) = 20 + 7 = 27
While that information may not seem significant on its own, it gives us enough information to calculate the upper end of the sensor’s temperature scale. Remember that each hexadecimal digit can hold one of 16 values, so calculating the upper bound (in degrees Celsius) is simply:
Upper Bound = 16 * 16 = 256
Now that we have established that the sensor has a raw output range of 0 to 256°C, let’s look back at the query results. Keep in mind what I mentioned above about off-the-scale readings looping around to the other end of the scale.
To calculate the actual temperature, simply subtract 256 from the sensor readings in the 250’s. For example, if you look at the last entry in the above table, the actual temperature would be:
Actual Temp = 254.625 - 256 = -1.375°C
Using the above query as an example, the correct actual temperatures using the loop-around equation would be:
Implementing the fix before the sensor readings even reach the QA/QC algorithm on the Raspberry Pi is simple. We simply need to add an if statement to the Python code that converts the raw hexadecimal sensor readings to degrees Celsius. Here is the original code:
try:
    raw_temperature = ((raw_temperature_msb * 256) + (raw_temperature_lsb & 0xF0)) / 16
    unrounded_temperature = raw_temperature / 16
except:
    raw_temperature = None
    unrounded_temperature = None
With the if statement added to fix the problem, the above block of code simply becomes:
try:
    raw_temperature = ((raw_temperature_msb * 256) + (raw_temperature_lsb & 0xF0)) / 16
    unrounded_temperature = raw_temperature / 16

    if unrounded_temperature > 70:
        unrounded_temperature -= 256
except:
    raw_temperature = None
    unrounded_temperature = None
The final piece of the puzzle is to update the bad data points in the database to ensure data integrity. Amazingly, we can do that with a simple UPDATE command so we don’t lose any data. Full disclosure, this is not the actual database structure in the weather station. It’s just worded that way to make it easier to understand.
UPDATE `sensor_measurement`
SET `measurement_value` = `measurement_value` - 256
WHERE `measurement_type` = 'temperature'
AND `measurement_value` >= 70;
Well, this has certainly been an interesting one. It’s always a relief to confirm that the sensor is functioning exactly as it should and the problem is nothing more than an easy-to-fix bug in the Python code on the Raspberry Pi. Until next time.
Houston, I Think There’s a Bug in My Weather Station’s QA/QC Algorithm
I recently logged onto my weather station to check some high and low temperatures for the month of January. While I was casually scrolling through the data, I caught something out of the corner of my eye. When I looked a little closer, I had to do a double take.
What was even more impressive were the “heat indices”. They make summers on the Persian Gulf, where heat indices can reach ridiculous levels, look like absolute zero. For comparison, the temperature of the surface of the sun is 9,941°F.
My initial reaction, like that of several friends, was to cue the jokes: “Well, it does get hot in Arizona…”. Unfortunately, on a logic and reasoning test, science will beat humor every single time. Time to figure out why this happened. First, we need to run a query for all raw temperature data my sensors recorded that were greater than 60°C (140°F). I chose that cutoff based on the hottest temperature ever recorded in Arizona. On June 29, 1994, Lake Havasu City topped out at a sizzling 53°C (128°F).
Amazingly, the query returned almost 300 hits. Here is a small sample of them.
Sensors getting screwy readings like this is part of the deal when you operate any kind of data logger. I am much more concerned that so many bad data points managed to slip through my QA/QC algorithm on the Raspberry Pi. I’ll admit the QA/QC algorithm was the very basic one I wrote to just “get things going”. It was long overdue for an upgrade, but still, it should have caught these.
Once I queried the dates where these bad data points occurred, the culprit was revealed.
You may recall that this past December, I had to replace the analog temperature and humidity sensor that broke. I formally decommissioned the broken sensor on December 14, 2019. Did you happen to notice the first date of bad data? That’s not a coincidence.
So what caused the QA/QC algorithm to nod off and miss these bad data points? The answer goes back to the broken analog sensor. The broken sensor measured both temperature and humidity. When the relative humidity reading hiccuped, often showing values greater than 3,000% when it did, the corresponding temperature reading would get thrown off by about 10-20°C.
The problem is that 99% of those bad temperature readings were between -5 and 15°C (23 to 59°F). During the winter months, we see actual temperatures inside that range every day here in the Phoenix area, so you can’t simply filter them out. I wrote the original QA/QC algorithm to flag relative humidity values that were greater than 100% or less than 0%. I would deal with the temperature parameter when I updated the algorithm.
The new digital sensor I installed only measures altitude, barometric pressure, and temperature. As a result, the Raspberry Pi reverted to obtaining its humidity data from the National Weather Service. The NWS data is already QA/QC’d. Because my original QA/QC algorithm only flagged humidity and not temperature, it deemed every data point that passed through it “OK”, thus rendering the algorithm redundant.
To confirm that this was the issue, I took advantage of the fact that the database puts the data the QA/QC algorithm flags into a separate table in the sensor database. I use that data for troubleshooting and for improving the algorithm. A simple query reveals the dates on which data were flagged. If swapping the sensors on December 14, 2019 did in fact render the QA/QC algorithm useless, that query should return no dates after the replacement; every flagged entry should predate the swap.
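Here is a rough sketch of that confirmation check, assuming the flagged dates have already been pulled out of the measurement_to_qaqc table with something like SELECT DISTINCT DATE(measurement_time). The function name and the sample dates are mine, not the station's actual code.

from datetime import date

# Sanity check: if the QA/QC algorithm stopped flagging anything once the
# old analog sensor was retired, every date in the flagged-data table
# should fall on or before the swap date.
SENSOR_SWAP_DATE = date(2019, 12, 14)

def qaqc_went_silent(flagged_dates, swap_date=SENSOR_SWAP_DATE):
    """Return True if nothing has been flagged since the sensor swap."""
    return all(d <= swap_date for d in flagged_dates)

# Hypothetical dates pulled from the flagged-data table.
flagged_dates = [date(2019, 11, 2), date(2019, 11, 28), date(2019, 12, 13)]
print(qaqc_went_silent(flagged_dates))  # True, which confirms the hypothesis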
Thankfully, fixing the issue requires nothing more than a few lines of code: an if statement in the algorithm that flags temperatures outside an acceptable range of -20 to 60°C (-4 to 140°F). I chose the upper limit based on the hottest temperature ever recorded in Arizona (53°C/128°F) and the lower bound on the coldest temperature ever recorded in Phoenix (-9°C/16°F). I will tweak that range as needed.
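Here is a sketch of what that if statement might look like inside the QA/QC routine. The function and constant names are mine, not the station's actual code; only the bounds come from the post.

TEMP_MIN_C = -20.0  # comfortably below Phoenix's all-time low of -9°C
TEMP_MAX_C = 60.0   # comfortably above Arizona's all-time high of 53°C

def temperature_passes_qaqc(value_c):
    """Return True if a raw temperature reading is physically plausible."""
    if value_c < TEMP_MIN_C or value_c > TEMP_MAX_C:
        return False  # flag the reading and quarantine it
    return True

print(temperature_passes_qaqc(87.3))  # False: goes to the flagged table
print(temperature_passes_qaqc(21.5))  # True: goes to the main table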
My goal is to continually add small upgrades and fixes to the QA/QC algorithm over the next year. By the time the complete network of sensors is up and running, it should have a level of sophistication that is reasonable for a hobby weather station while staying as close to professional standards as I can manage. Stay tuned for future posts, where we will take a closer look at what happens in the data logger’s electrical system to cause such wacky temperature readings.
The post Houston, I Think There’s a Bug in My Weather Station’s QA/QC Algorithm appeared first on Matthew Gove Blog.
]]>The post Troubleshooting a Raspberry Pi Sensor Gone Awry appeared first on Matthew Gove Blog.
]]>Over the summer, I noticed something odd after a monsoon thunderstorm had gone through. The outdoor relative humidity on my weather station read about 20% higher than the outdoor humidity on my thermostat. I wrote it off as an isolated incident related to the monsoon storm and didn’t think much of it.
As the summer progressed, I noticed the same thing would happen whenever a monsoon thunderstorm would pass over us. The weather station would report the relative humidity about 20% higher than it actually was. By the end of the summer, it was reading 20% high all the time, even in sunny weather.
On several occasions, the idea popped into my head to just go into the weather station’s Python code and subtract 20% off of the relative humidity values the sensor was reading. Trust me, it was very tempting. I had to keep telling myself that doing so would open up a whole new set of problems.
What happens if the sensor starts working properly again and gives a reading below 20%? Subtracting 20 points would produce a negative relative humidity, and I wrote the Python code, so I know negative relative humidity values will crash the system: calculating the dew point would mean taking the natural log of a negative number, which is undefined.
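To make that concrete, here is the standard Magnus dew point approximation, which may or may not be the exact formula in the station's code. Feed it a negative relative humidity and the natural log blows up immediately.

import math

def dew_point_c(temp_c, rh_percent, a=17.62, b=243.12):
    """Dew point (°C) via the Magnus approximation."""
    gamma = math.log(rh_percent / 100.0) + (a * temp_c) / (b + temp_c)
    return (b * gamma) / (a - gamma)

print(round(dew_point_c(30.0, 45.0), 1))  # about 16.8°C, a sane value

try:
    dew_point_c(30.0, -5.0)  # what a "subtract 20" hack could produce
except ValueError as err:
    print(f"Crash: {err}")   # math domain error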
It turned out that my instinct not to tinker with the source code was correct. Over the course of the fall, that 20% gap between the sensor’s readings and the actual humidity grew to 30%, then 40%, then 50%. By early November, the weather station was reporting 80-90% relative humidity when actual values were in the 10-15% range. Chasing a moving offset like that would have made a big, ugly mess of the source code and would not have gotten me any closer to fixing the problem.
By the 1st of December, the sensor would only give humidity readings of 100%, regardless of what the actual humidity was. Pulling the raw sensor data from the main database on the Raspberry Pi confirmed this. While I pondered my next move, I changed the weather station’s Python code to pull relative humidity data from the National Weather Service instead of the sensor.
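The post doesn't show the replacement code, but pulling the latest relative humidity from the National Weather Service API looks roughly like the sketch below. The station ID is a placeholder, and the real code almost certainly handles errors, caching, and unit conversions differently.

import requests

def nws_relative_humidity(station_id="KPHX"):
    """Fetch the latest relative humidity (%) from the NWS API."""
    url = f"https://api.weather.gov/stations/{station_id}/observations/latest"
    resp = requests.get(url, headers={"User-Agent": "diy-weather-station"}, timeout=10)
    resp.raise_for_status()
    # The observation may report None if the field is missing.
    return resp.json()["properties"]["relativeHumidity"]["value"]

print(nws_relative_humidity())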
Looking at that raw data piqued my interest enough to dump all of the relative humidity readings out of the sensor database and take a closer look. The sensor went into service in early May and takes readings once per minute, so the query returned about 300,000 data points. After crashing Microsoft Excel’s plotting tool several times (I have an old, cranky PC), I adjusted the query to return only the relative humidity values at the top of every hour.
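If you are wondering what that adjustment looks like, it is a one-line filter on the timestamp. Here is a standalone SQLite sketch of the idea; the table and column names are illustrative, and in MySQL the filter would be something like WHERE MINUTE(measurement_time) = 0.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE humidity (measurement_time TEXT, rh REAL)")
cur.executemany("INSERT INTO humidity VALUES (?, ?)", [
    ("2019-11-05 14:00:00", 83.0),  # top of the hour: keep
    ("2019-11-05 14:01:00", 84.0),  # mid-hour reading: skip
    ("2019-11-05 15:00:00", 86.0),  # top of the hour: keep
])

# Keep only readings taken at minute 00 of each hour.
cur.execute("""
    SELECT measurement_time, rh
    FROM humidity
    WHERE strftime('%M', measurement_time) = '00'
""")
print(cur.fetchall())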
Now, before I show you the marked up plot explaining everything, consider the following. Think of how it can help show when the humidity sensor went off the rails.
When you put everything together, it should become clear where the sensor goes awry.
Unfortunately, the only remedy at this point is to replace the sensor, preferably with one of much higher quality. The old sensor that broke was a cheap analog one I paid about $5 for. You get what you pay for, right? Thankfully, the pressure sensor, which is a digital I2C sensor, also measures temperature, so with some minor rewiring I can simply swap them out.
The nice thing about I2C sensors is that you can connect multiple sensors to the system using only one data cable, so adding sensors is easy at either end of the wire: in the solar radiation shield or in the waterproof enclosure that houses the Raspberry Pi and the power supply. Additional sensors will connect where the four gray wire nuts are in the above picture. I will definitely be adding an I2C humidity sensor to the Raspberry Pi, and likely more as well. Stay tuned.
The post Troubleshooting a Raspberry Pi Sensor Gone Awry appeared first on Matthew Gove Blog.
]]>The post DIY Weather Station, Part 4: Database Configuration and Programming the Sensor Readings appeared first on Matthew Gove Blog.
]]>The flow of data from the sensors to the weather station’s primary database is as follows. The primary database contains the full set of weather data and logs readings every 5 minutes, and the weather station’s web interface displays those data. Data for which I do not have sensors are obtained from the National Weather Service.
I could easily write an entire post about database design, so I’m not going to go into too much detail about why I designed the database the way I did. Instead, here is my list of requirements for the database:
From that list of requirements, this is the EER diagram I came up with for the sensor database.
Note here that the “good” data are put into the “measurement” table, while the “bad” data flagged by the QA/QC mechanism are put into the “measurement_to_qaqc” table.
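The EER diagram itself doesn't translate well to text, but the heart of the design boils down to something like the DDL below. This is my paraphrase based on the table names mentioned in this series, not the station's actual schema, so treat the exact columns as assumptions.

import sqlite3

# Paraphrased schema: one row per sensor, one table for accepted readings,
# and a quarantine table for readings the QA/QC algorithm flags.
SCHEMA = """
CREATE TABLE sensor (
    id INTEGER PRIMARY KEY,
    make_model TEXT,
    gpio_pin INTEGER
);

CREATE TABLE measurement (
    id INTEGER PRIMARY KEY,
    sensor_id INTEGER REFERENCES sensor(id),
    measurement_time TEXT,
    measurement_type TEXT,
    measurement_value REAL
);

CREATE TABLE measurement_to_qaqc (
    id INTEGER PRIMARY KEY,
    sensor_id INTEGER REFERENCES sensor(id),
    measurement_time TEXT,
    measurement_type TEXT,
    measurement_value REAL,
    flag_reason TEXT
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
print("sensor database schema created")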
The Python script is where all the magic happens. Its main purpose is to QA/QC the data it reads from the sensors, essentially acting as a guard so bad data points do not get into the weather station’s primary database.
The first iteration of the QA/QC algorithm is quite simple and, to a degree, a bit crude. Essentially all it does is ensure that the data are within acceptable ranges. For example, the script flags and removes relative humidity readings of 3,285%, as well as temperature readings of -145°C.
I haven’t begun coding version 2 of the QA/QC algorithm, but it will look at trends in the data and eliminate unrealistic spikes. For example, if the sensors show the temperature rising 20°C over the course of a couple of minutes, the algorithm will flag and remove those readings. The current first version would not catch that: you could in theory have temperature readings of 10°C and 30°C within a minute of each other, and because both values fall inside the acceptable range, neither would be flagged.
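A version 2 check along those lines might look something like the sketch below. The 10°C-per-5-minute threshold is purely illustrative; the real cutoff hasn't been chosen yet.

from datetime import datetime

MAX_DELTA_C = 10.0        # largest believable temperature change...
MAX_WINDOW_SECONDS = 300  # ...over this many seconds

def spike_flagged(prev_time, prev_temp_c, curr_time, curr_temp_c):
    """Return True if the jump between consecutive readings is unrealistic."""
    elapsed = (curr_time - prev_time).total_seconds()
    if elapsed <= 0 or elapsed > MAX_WINDOW_SECONDS:
        return False  # readings aren't close enough together to judge
    return abs(curr_temp_c - prev_temp_c) > MAX_DELTA_C

# 20°C in two minutes: both values are "acceptable", but the jump is not.
print(spike_flagged(datetime(2020, 1, 2, 9, 14), 10.0,
                    datetime(2020, 1, 2, 9, 16), 30.0))  # True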
Within the Python code, there is a class for each type (make/model) of sensor. Each object is instantiated with the GPIO pin on the data logger to which the sensor is connected, as well as the Sensor ID, which is the primary key in the “sensor” table in the database described above.
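As a hedged sketch, the structure described above might look like the following. The class name, the read() stub, and the sample values are mine; the real classes talk to the actual hardware through the Pi's GPIO and I2C libraries.

class TemperatureSensor:
    """Illustrative stand-in for one of the per-make/model sensor classes."""

    def __init__(self, gpio_pin, sensor_id):
        self.gpio_pin = gpio_pin    # data logger pin the sensor is wired to
        self.sensor_id = sensor_id  # primary key in the "sensor" table

    def read(self):
        """Return a raw temperature reading in °C (stubbed for illustration)."""
        return 21.5  # real code would query the hardware on self.gpio_pin

# Usage: give the object its wiring and its database identity, then read it.
outdoor_temp = TemperatureSensor(gpio_pin=4, sensor_id=3)
print(outdoor_temp.sensor_id, outdoor_temp.read())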
The sensor classes also perform the following:
The primary script in the data logger Python module performs the following steps to log the sensor data:
Well, that just about wraps up the sensor and data logger portion of the DIY weather station project. I certainly had a blast designing and building it, and I hope you enjoyed reading about it. Until next time.
The post DIY Weather Station, Part 4: Database Configuration and Programming the Sensor Readings appeared first on Matthew Gove Blog.
]]>The post How to Set Up a Website and Database to Support Multiple Languages appeared first on Matthew Gove Blog.
]]>Today, I’m going to be giving you a quick tutorial on how to support multiple languages on your website using a relational database, regardless of whether you’re supporting 2 languages or 200, in a manner that easily scales up and down. There are other ways besides databases to support multiple languages, but databases are the easiest way to keep everything organized.
I don’t really advertise it, but my professional portfolio website is actually bilingual and is available in both English and French. To change languages, click on the link at the very bottom of the footer that says “Français” or “English”. I’m hoping to add Spanish as well once my Spanish language skills improve to that level. I think using online translators is poor form for a professional website, as they often struggle with grammar and verb conjugations, and your business really shouldn’t be offering services in a language without having an employee who can speak the language fluently.
Setting up a database schema that both supports multiple languages and scales easily is the key to this tutorial. If it’s not done properly, you will get one or the other, but not both. In its most basic form, a database table that supported only English would look something like this:
id | phrase |
---|---|
3 | Hello World! |
I know what you’re probably thinking right now. It must be simple to add support for a second language. We could simply add another column to the database table, like this:
id | phrase_en | phrase_fr |
---|---|---|
3 | Hello World! | Bonjour le monde! |
Unfortunately, as simple as this seems, it is not the correct way to do it. I will admit I made this exact mistake when I first started adding French support to my website. As soon as you start running queries, you realize why it doesn’t work: you either have to select every language column just to keep the query simple, or write a more complex query that figures out which column to select.
Then there’s the issue of scalability. This may seem like a perfectly valid solution for supporting 2 or 3 languages, but what would it look like if you were supporting 200? Every time you wanted to add or remove a language, you would need to add or drop a column in every table in the database. Calling that a tedious job is an understatement at best.
The International Organization for Standardization (ISO) maintains 2-letter codes for the world’s common languages under the ISO 639 standard. Examples include “en” for English, “fr” for French, “es” for Spanish, “it” for Italian, and “ar” for Arabic. In the real world, you probably want your own foreign keys and a “language” table, but for this example I’ll use the ISO codes directly to keep things easy to understand. The correct way to set up your database table for multiple languages is:
id | language | phrase |
---|---|---|
3 | en | Hello World! |
4 | fr | Bonjour le monde! |
In addition to being properly normalized, this table makes queries much easier. If you want the English phrase, just add “WHERE language = ‘en'” to your query; if you want the French phrase, add “WHERE language = ‘fr'”. It also scales up and down nicely: adding or removing support for a language is just a matter of adding or deleting records (rows). That means I can add several languages at once with a single INSERT statement (see the sketch after the table below), producing something like:
id | language | phrase |
---|---|---|
3 | en | Hello World! |
4 | fr | Bonjour le monde! |
5 | es | ¡Hola, mundo! |
6 | it | Ciao mondo! |
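Here is that idea end to end as a runnable sketch. I'm using SQLite so it is self-contained; the same layout works in MySQL or PostgreSQL, and in production you would add the language table and foreign keys mentioned above.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE phrase (
        id INTEGER PRIMARY KEY,
        language TEXT NOT NULL,  -- ISO 639-1 code for this example
        phrase TEXT NOT NULL
    )
""")

# Adding a language is just adding rows: one INSERT, no schema changes.
cur.executemany("INSERT INTO phrase (language, phrase) VALUES (?, ?)", [
    ("en", "Hello World!"),
    ("fr", "Bonjour le monde!"),
    ("es", "¡Hola, mundo!"),
    ("it", "Ciao mondo!"),
])

# Fetching the visitor's language is a simple WHERE clause.
cur.execute("SELECT phrase FROM phrase WHERE language = ?", ("fr",))
print(cur.fetchone()[0])  # Bonjour le monde!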
Finally, when creating a database table that supports multiple languages, you must encode the table as UTF-8, or else accented characters will not display correctly. To ensure everything displays properly, I usually encode the string into UTF-8 a second time with the programming language I’m using to extract the data from the database. I know that both PHP and Python have that functionality. UTF-8 encoding will also work for languages that do not use the Latin alphabet, such as Greek, Russian, and Chinese.
id | language | phrase |
---|---|---|
3 | en | Hello World! |
4 | fr | Bonjour le monde! |
5 | es | ¡Hola, mundo! |
6 | it | Ciao mondo! |
7 | el | Γειά σου Κόσμε! |
8 | ru | Привет мир! |
9 | ar | !مرحبا بالعالم |
10 | zh | 你好,世界 |
11 | th | สวัสดีชาวโลก |
See how nicely that scales when you have more than just 2 or 3 languages? If you write your back-end code correctly, you shouldn’t need to change it at all when you add or remove support for a language. Hopefully, adding support for multiple languages to your website will allow you to expand your business into places you never could have imagined before.
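Before wrapping up, here is a tiny sanity check on the UTF-8 point above: non-Latin phrases survive an encode/decode round trip in Python without any trouble. How you set the character set on the table and connection depends on your database; utf8mb4 is the usual choice in MySQL.

phrases = {
    "el": "Γειά σου Κόσμε!",
    "ru": "Привет мир!",
    "zh": "你好，世界",
    "th": "สวัสดีชาวโลก",
}

for code, phrase in phrases.items():
    raw = phrase.encode("utf-8")          # bytes as stored or transmitted
    assert raw.decode("utf-8") == phrase  # round-trips unchanged
    print(code, len(raw), phrase)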
The post How to Set Up a Website and Database to Support Multiple Languages appeared first on Matthew Gove Blog.
]]>The post Rebrandings and Fresh Starts appeared first on Matthew Gove Blog.
]]>This past year has been a strange one for my career. While I accomplished many of the goals I had set for myself, I also suffered some significant setbacks. Those setbacks were largely out of my control, but this rebranding gives me much better control over what happens next, will ultimately let me accomplish more goals, and leaves me much better prepared for future setbacks and bumps in the road.
This blog is the primary addition to my new brand. While I have had a blog for over 10 years now, it has for the most part been just an afterthought, and unfortunately, has been rather neglected for much of the past several years. However, that all changes now. My intention is for the blog to serve as the link that unites both websites, and much of the rest of my online presence. Older projects that relied solely on photos and videos will now be a much more significant part of the blog. My goal is to be able to balance the knowledge I share with you between my personal and professional adventures, including web development, GIS, data solutions, photography, and travel, but we’ll see just how well I can pull that off. I am also hoping to be able to link certain aspects of my personal and professional life together in ways I was unable to before, such as integrating GIS technologies with my photography and travel adventures.
In addition, both of my websites have been updated to more accurately reflect my personality, my skillsets, and my goals. Between this blog, my two websites, my new bio/about me page, and my various social media accounts, each piece of my online presence now serves a specific role in my overall brand. Together, they better represent who I am and what my passions and skillsets are, and they allow me to pass my knowledge on to you much more effectively.
To new beginnings and fresh starts. Happy adventuring.
All Rebranding links:
Matthew Gove Web Development (Web Development/GIS Portfolio)
Matt Gove Photo (Photography and Travel Adventures)
New Blog (this site)
About Me/Bio Page
The post Rebrandings and Fresh Starts appeared first on Matthew Gove Blog.
]]>