Matthew Gove Blog: Travel the World through Maps, Data, and Photography
Region Mapping Archives: https://blog.matthewgove.com/tag/region-mapping/

How to Automate Region Mapping in TerriaJS with 49 Lines of Python
https://blog.matthewgove.com/2021/10/15/how-to-automate-region-mapping-in-terriajs-with-49-lines-of-python/
Fri, 15 Oct 2021 16:00:00 +0000

The post How to Automate Region Mapping in TerriaJS with 49 Lines of Python appeared first on Matthew Gove Blog.

A little over a month ago, we examined the benefits of using region mapping in your TerriaJS applications. Region mapping allows you to reduce your GIS application’s data usage by over 99%, permitting you to display massive datasets on two- and three-dimensional maps that load quickly and are highly responsive.

Example region mapping in TerriaJS
Our COVID-19 Dashboard Map Makes Extensive Use of Region Mapping in TerriaJS

Despite the power and benefits region mapping offers, setting it up can be tedious, time-consuming, and rife with typos if you attempt to do it manually. This can be particularly frustrating if you have a large number of regions you want to map. Thankfully, there’s a much easier way. With just 49 lines of Python code, you can generate region maps of any size in just seconds, freeing up valuable time for you to focus on more important tasks.

What You’ll Need

You’ll need a few items to get started with your TerriaJS region mapping automation.

  • The ESRI Shapefile or GeoJSON file that you used to generate your Mapbox Vector Tiles (MVTs) for the region mapping
  • The Python script you’ll write in this tutorial
  • A Terminal or Command Prompt

What is a Region Mapping File?

In TerriaJS, a region map consists of two JSON files.

  1. The actual region map
  2. The region mapping configuration file

The Python script we’re writing in this tutorial generates the actual region map. The region map tells TerriaJS the order that each polygon appears in the vector tiles. The configuration file instructs TerriaJS which vector tile parameters contain the region’s unique identifier, name, and more. You can easily write a short Python script to generate the configuration. However, unless you have a lot of region maps you’re generating, I find that the configuration is so short, it’s easier to do manually.

Convert all Shapefiles to GeoJSON Before Automating Region Mapping

If you have ESRI Shapefiles, convert them to GeoJSON before automating your region mapping. We do this for two reasons.

  1. Python can read and parse GeoJSON files natively. To parse shapefiles, you’ll need a third-party library such as GeoPandas.
  2. Mapbox’s Tippecanoe program, which generates the Mapbox Vector Tiles we use for region mapping, requires files to be input in GeoJSON format.

The GeoJSON should contain at least three properties for each feature. We covered these properties in detail in our previous article about region mapping, so I won’t repeat them here. You’re welcome to add as many properties as you want, but at the bare minimum, each feature should contain the following three properties.

  • A Feature ID, or FID.
  • A unique identifier you’ll use to identify the feature in the CSV data files you load into Terria.
  • The name of the feature
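
As a quick sanity check before generating tiles, a short sketch like the one below can flag features that are missing any of those three properties. The property names FID, alpha3code, and name follow this tutorial’s world countries example, and the find_incomplete_features helper is hypothetical, so swap in your own names.

```python
# Hypothetical helper: the property names below are assumptions based on
# this tutorial's world countries example. Adjust them to match your file.
REQUIRED_PROPERTIES = ("FID", "alpha3code", "name")


def find_incomplete_features(geojson, required=REQUIRED_PROPERTIES):
    """Return (feature index, missing property names) for incomplete features."""
    problems = []
    for index, feature in enumerate(geojson.get("features", [])):
        properties = feature.get("properties") or {}
        missing = [name for name in required if name not in properties]
        if missing:
            problems.append((index, missing))
    return problems


# Tiny in-memory example; a real file would be read with json.load().
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None,
         "properties": {"FID": 0, "alpha3code": "AFG", "name": "Afghanistan"}},
        {"type": "Feature", "geometry": None,
         "properties": {"FID": 1, "name": "Albania"}},
    ],
}

for index, missing in find_incomplete_features(sample):
    print("Feature {} is missing: {}".format(index, ", ".join(missing)))
```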

Most of our map tiles actually contain several unique identifiers that can be used. For example, countries actually have two sets of ISO (International Organization for Standardization) standards, plus a United Nations code that can be used to uniquely identify them. We can use any of these three, plus our own unique identifier to map them to our Mapbox Vector Tiles.

Country          ISO Alpha 2 Code    ISO Alpha 3 Code    UN Numeric Code
Australia        AU                  AUS                 036
Brazil           BR                  BRA                 076
Canada           CA                  CAN                 124
France           FR                  FRA                 250
Mexico           MX                  MEX                 484
New Zealand      NZ                  NZL                 554
United States    US                  USA                 840
Unique Country ID Examples

All right, let’s dive into the Python code.

First, Import Python’s Built-In json Library

The real magic of region mapping in TerriaJS is that all of the files are just JSON files, which are compatible with every popular programming language today. As a result, all we need to do the automation is Python’s json library. The json library comes standard with every installation of Python. We just need to import it. We will also be using Python’s os library to ensure the correct paths exist to output our region mapping files.

import json
import os

Next, Write a Python Function to Extract the Feature ID from each Feature in the GeoJSON

The Feature ID sits a couple of layers down inside each GeoJSON feature, and we need to look it up repeatedly. Rather than repeating the same string literals throughout the script, we’ll use a function to extract the Feature ID from any given feature in the GeoJSON file.

def fid(f):
    return f["properties"]["FID"]

Define Your Input Parameters

We define the GeoJSON file name, the layer name in the TerriaJS region mapping configuration (regionMapping.json), the property we’ll use as the unique identifier in our region mapping, and the output folder. Even if your MVT tiles have multiple unique identifiers, you may only choose one for each region map. You’ll need to generate an additional region map for each extra identifier you want to use.

GEOJSON = "world-countries.geojson"
LAYER = "WORLD_COUNTRIES"
MAPPING_ID_PROPERTY = "alpha3code"
OUTPUT_FOLDER = "regionIds"

If you prefer not to have to manually edit the Python script every time you want to change files, you can easily update the Python script to extract this information from the filename or receive it through command line arguments.
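
As one possible way to do that, here is a minimal sketch using Python’s built-in argparse module. The argument names, defaults, and the parse_args helper are assumptions made for illustration and are not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse the four input parameters from the command line."""
    parser = argparse.ArgumentParser(
        description="Generate a TerriaJS region map from a GeoJSON file."
    )
    parser.add_argument("geojson", help="Path to the input GeoJSON file")
    parser.add_argument("layer", help="Layer key in regionMapping.json")
    parser.add_argument(
        "--property",
        dest="mapping_id_property",
        default="alpha3code",
        help="Feature property to use as the unique identifier",
    )
    parser.add_argument(
        "--output-folder",
        default="regionIds",
        help="Folder to write the region map into",
    )
    return parser.parse_args(argv)


# Passing a list makes the example runnable without a real command line.
args = parse_args(["world-countries.geojson", "WORLD_COUNTRIES"])
GEOJSON = args.geojson
LAYER = args.layer
MAPPING_ID_PROPERTY = args.mapping_id_property
OUTPUT_FOLDER = args.output_folder
```

In a real script, you would call parse_args() with no arguments so that argparse reads the values typed into your Terminal or Command Prompt.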

Define the Output Path for Your Region Map File

I like to do this right away because it uses the input parameters. This both gets it out of the way so we don’t need to deal with it later and eliminates the need to scroll up and down looking for variable names if we were to define these at the end. We’ll structure our output paths so they can just be copied and pasted into Terria when we’re done. No renaming or reshuffling of files required.

geojson_filename = GEOJSON.replace(".geojson", "")
output_json_filename = "region_map-{}.json".format(geojson_filename)
output_fpath = "{}/{}".format(OUTPUT_FOLDER, output_json_filename)

In this example, the output filepath of our region mapping file will be regionIds/region_map-world-countries.json.

Read the GeoJSON into Python

Because GeoJSONs are just a specific type of JSON file, we can use Python’s built-in json module to read and parse the GeoJSON file.

with open(GEOJSON, "r") as infile:
    raw_json = json.load(infile)

Sort the GeoJSON features by Feature ID

When you convert the GeoJSON to MVT tiles, Tippecanoe sorts the GeoJSON by Feature ID. For the region mapping to work correctly, the features in the region map must appear in the exact same order as they appear in the Mapbox Vector Tiles. If they’re not in the exact same order, your data will be mapped to the wrong features in TerriaJS. What’s particularly insidious about this issue is that it will appear as if your data is mapped to the correct country, even though it’s not.

When I first tried to set up region mapping for the World Countries vector tiles, I did not have the features in the right order. Data for the United States was mapped to Ukraine and labeled as Ukraine’s data. France’s data showed up in Fiji, and New Zealand’s data appeared to be Nepal’s. And that’s just to name a few. Except for a few countries at the top of the alphabet that start with “A”, this issue plagued every country. Once I sorted the GeoJSON features by Feature ID, the problem magically went away.

raw_features = raw_json["features"]
features = sorted(raw_features, key=fid)

Note here that we use the fid function we defined above to identify the Feature ID in the sorted() function.
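
One thing worth checking, as an assumption about your data rather than something covered above: if your FID values were exported as strings, sorted() compares them alphabetically, so "10" lands before "2". The sketch below uses made-up features and a hypothetical fid_as_int helper to show the difference.

```python
def fid(f):
    return f["properties"]["FID"]


def fid_as_int(f):
    # Hypothetical variant: coerce the FID to an integer before comparing.
    return int(f["properties"]["FID"])


# Made-up features whose FIDs are stored as strings.
raw_features = [
    {"properties": {"FID": "10", "alpha3code": "JJJ"}},
    {"properties": {"FID": "2", "alpha3code": "BBB"}},
    {"properties": {"FID": "1", "alpha3code": "AAA"}},
]

text_order = [fid(f) for f in sorted(raw_features, key=fid)]
numeric_order = [fid(f) for f in sorted(raw_features, key=fid_as_int)]

print(text_order)     # ['1', '10', '2']
print(numeric_order)  # ['1', '2', '10']
```

Which of the two orders matches your vector tiles depends on how the FIDs were written, so compare the result against the tiles before trusting either one.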

TerriaJS Region Maps are just Arrays of Unique ID’s Sorted by Feature ID

To generate the region map for TerriaJS, all we need to do is loop through the sorted features and create an array of the unique ID for each feature. Remember that the unique ID is different from the Feature ID. For world countries, TerriaJS comes with the Alpha 2 Codes built into it. For this tutorial, we’ll use the ISO Alpha 3 code as our unique identifier, which we defined in the MAPPING_ID_PROPERTY variable in the user input. If you’ve forgotten, the Alpha 3 code is just a 3-letter code that identifies each country. For example, use “USA” for the United States, “CAN” for Canada, “JPN” for Japan, and so forth.

We’ll generate the region map with a simple for loop.

region_mapping_ids = []

for f in features:
    properties = f["properties"]
    region_map_id = properties[MAPPING_ID_PROPERTY]
    region_mapping_ids.append(region_map_id)
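
If you prefer a more compact form, the same loop can be written as a list comprehension. The sketch below uses made-up features and also adds an optional duplicate check that is not part of the original script. Duplicate ID’s are not necessarily an error, but they are worth a look.

```python
from collections import Counter

MAPPING_ID_PROPERTY = "alpha3code"

# Made-up features, already sorted by Feature ID.
features = [
    {"properties": {"FID": 0, "alpha3code": "AFG"}},
    {"properties": {"FID": 1, "alpha3code": "ALB"}},
    {"properties": {"FID": 2, "alpha3code": "AFG"}},  # deliberate duplicate
]

# Same result as the for loop, written as a list comprehension.
region_mapping_ids = [f["properties"][MAPPING_ID_PROPERTY] for f in features]

# Optional check: collect any ID that appears more than once.
counts = Counter(region_mapping_ids)
duplicates = [rid for rid, count in counts.items() if count > 1]

print(region_mapping_ids)  # ['AFG', 'ALB', 'AFG']
print(duplicates)          # ['AFG']
```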

Assemble the full TerriaJS Region Mapping JSON in a Python Dictionary

TerriaJS region mapping files are breathtakingly simple. They require three parameters.

  • layer: The name of the layer in the TerriaJS Region Mapping configuration (regionMapping.json). We defined it above with the LAYER variable in the user input.
  • property: The name of the unique identifier in your vector tiles that you’ll use in the CSV files you load into Terria. For the world countries, we’re using the alpha3code identifier. We defined this in the user input using the MAPPING_ID_PROPERTY variable.
  • values: This is the array of unique ID’s, sorted by Feature ID, that we created in the previous section.

It’s important to note that Python has no native JSON type. Instead, when you read a JSON file into Python, the json module converts it into Python objects: JSON objects become dictionaries and JSON arrays become lists. When we export the dictionary to JSON format, the json module simply does the conversion in the opposite direction.
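
A short round trip makes that conversion visible. This sketch uses a shortened, made-up version of the region map as its input text.

```python
import json

text = '{"layer": "WORLD_COUNTRIES", "values": ["AFG", "ALB"]}'

# JSON text -> Python objects (object becomes dict, array becomes list).
parsed = json.loads(text)
print(type(parsed).__name__)            # dict
print(type(parsed["values"]).__name__)  # list

# Python objects -> JSON text again.
round_trip = json.dumps(parsed)
print(json.loads(round_trip) == parsed)  # True
```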

Anyway, the TerriaJS region mapping JSON should look like this as a Python dictionary.

output_json = {
    "layer": LAYER,
    "property": MAPPING_ID_PROPERTY,
    "values": region_mapping_ids,
}

Also, don’t forget to create the output directory if it doesn’t exist. Your Python script will crash without it.

if not os.path.isdir(OUTPUT_FOLDER):
    os.makedirs(OUTPUT_FOLDER)
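
On Python 3.2 and later, the same check can be folded into the makedirs call itself with the exist_ok argument. A minimal sketch, using a temporary directory so it doesn’t touch your project:

```python
import os
import tempfile

# A temporary directory keeps this example from touching your project.
base = tempfile.mkdtemp()
OUTPUT_FOLDER = os.path.join(base, "regionIds")

# exist_ok=True makes the call safe to repeat: no error if the folder exists.
os.makedirs(OUTPUT_FOLDER, exist_ok=True)
os.makedirs(OUTPUT_FOLDER, exist_ok=True)

print(os.path.isdir(OUTPUT_FOLDER))  # True
```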

Finally, Write the TerriaJS Region Mapping JSON to a .json File

Thanks to Python’s json module, we can output the region mapping file with just a couple lines of code.

with open(output_fpath, "w") as ofile:
    json.dump(output_json, ofile, indent=4)

If you did everything correctly, the region mapping JSON you just created should look like this.

{
    "layer": "WORLD_COUNTRIES",
    "property": "alpha3code",
    "values": [
        "AFG",
        "ALB",
        "DZA",
        "AND",
        "AGO",
        "ATG",
        ...
    ]
}

But we’re not quite done, yet!

Add Your New TerriaJS Region Mapping to the Configuration File

The final step is to add your new region map to the TerriaJS Region Mapping configuration file. By default, the configuration file is located at data/regionMapping.json. However, if you can’t find it, it’s defined in the config.json file at the root of the Terria application. You are more than welcome to automate this, too, but I find that it’s so simple, it’s often easier to just do manually.

The configuration file instructs TerriaJS how to interpret each region map. You’ll need to include several parameters.

Parameter              Description
layerName              The name of the layer in the Mapbox Vector Tiles. If you can’t remember it, check your vector tiles’ metadata.
server                 The URL from where your Mapbox Vector Tiles are served
serverType             The type of files the server is serving. For this tutorial, use MVT, which stands for “Mapbox Vector Tiles”.
serverMinZoom          The minimum zoom level of the vector tiles
serverMaxNativeZoom    The maximum zoom level the vector tiles support natively
serverMaxZoom          The maximum zoom level the vector tiles support
regionIdsFile          The path to the TerriaJS Region Mapping JSON we created in this tutorial
regionProp             The name of the property in the vector tiles that contains the unique identifier we’re using in the region map.
aliases                Any column headers in the CSV data file that TerriaJS should interpret as the unique identifier of your region map
description            A description of the region map
bbox                   The bounding box of your vector tiles, in the format [west, south, east, north]
nameProp               The name of the property in the vector tiles that contains the name of each feature

Using the layer we defined in the LAYER variable as the JSON key, your regionMapping.json configuration file should look like the following.

{
    "regionWmsMap": {
        ...
        "WORLD_COUNTRIES": {
            "layerName": "world-countries",
            "server": "https://yourvectortileserver.com/worldcountries/{z}/{x}/{y}.pbf",
            "serverType": "MVT",
            "serverMinZoom": 0,
            "serverMaxNativeZoom": 3,
            "serverMaxZoom": 8,
            "regionIdsFile": "data/regionIds/region_map-world-countries.json",
            "regionProp": "alpha3code",
            "aliases": ["alpha3code", "country_id", "COUNTRY_ID"],
            "description": "World Countries",
            "bbox": [
                -179.99999999999999999999999,
                -85,
                179.999999999999999999999999,
                85
            ],
            "nameProp": "name"
        },
        ...
    }
}
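
If you do decide to automate the configuration, a minimal sketch might look like the following. The add_region_map helper is hypothetical, the server URL is the same placeholder used above, and existing_config stands in for the parsed contents of your own regionMapping.json.

```python
import json


def add_region_map(config, layer, entry):
    """Return a copy of the config with one region map entry added or replaced."""
    updated = dict(config)
    region_maps = dict(updated.get("regionWmsMap", {}))
    region_maps[layer] = entry
    updated["regionWmsMap"] = region_maps
    return updated


# Values copied from the example above, with the bbox rounded to whole degrees.
entry = {
    "layerName": "world-countries",
    "server": "https://yourvectortileserver.com/worldcountries/{z}/{x}/{y}.pbf",
    "serverType": "MVT",
    "serverMinZoom": 0,
    "serverMaxNativeZoom": 3,
    "serverMaxZoom": 8,
    "regionIdsFile": "data/regionIds/region_map-world-countries.json",
    "regionProp": "alpha3code",
    "aliases": ["alpha3code", "country_id", "COUNTRY_ID"],
    "description": "World Countries",
    "bbox": [-180, -85, 180, 85],
    "nameProp": "name",
}

# Stand-in for the parsed contents of data/regionMapping.json.
existing_config = {"regionWmsMap": {"SOME_OTHER_LAYER": {"layerName": "other"}}}

new_config = add_region_map(existing_config, "WORLD_COUNTRIES", entry)
print(json.dumps(sorted(new_config["regionWmsMap"])))
```

To update a real file, you would read it with json.load(), pass the result to the helper, and write the returned dictionary back with json.dump().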

Create a Dummy CSV File to Test it Out

The funnest part of any project is seeing it all come to life once you’re done. To do this, create a dummy CSV file to test that your region mapping actually works. Pick a bunch of countries at random, assign them some random values as a dummy parameter, and see if they show up on the map.

alpha3code    Value
USA           7
CAN           4
DEU           15
THA           12
AUS           9
BRA           3

To load the file into TerriaJS, just click on the upload button at the top of the workbench, next to the “Browse Data” or “Explore Map Data” button. If the region mapping is working properly, you should see your data appear on the map.
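
If you’d rather generate the dummy CSV with Python, a sketch along these lines would work. It writes to an in-memory buffer, and the region_map_values list is a made-up stand-in for the values array in your region map, used here to catch codes that would not match any region.

```python
import csv
import io

# Rows from the dummy table above.
rows = [("USA", 7), ("CAN", 4), ("DEU", 15), ("THA", 12), ("AUS", 9), ("BRA", 3)]

# Write to an in-memory buffer; swap in open("dummy.csv", "w", newline="")
# to produce a real file (the file name is hypothetical).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["alpha3code", "Value"])
writer.writerows(rows)
print(buffer.getvalue())

# Stand-in for the "values" array of the region map generated earlier.
region_map_values = {"AFG", "AUS", "BRA", "CAN", "DEU", "THA", "USA"}

# Any ID listed here would not match a region on the map.
unmatched = [code for code, _ in rows if code not in region_map_values]
print(unmatched)  # []
```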

Our dummy CSV data displayed on a choropleth world map
The Dummy CSV Data on a World Map

Conclusion

Region mapping is one of the most powerful and efficient ways to display enormous amounts of data in TerriaJS. Automating the region mapping process saves you valuable time and makes your application even more powerful.

While manually region mapping feature sets such as the 50 US states or the roughly 200 world countries may initially seem manageable, it rapidly becomes a nightmare once you try to scale it up. Sticking just within the United States, what if instead of the 50 states, you were mapping America’s more than 3,000 counties? Our COVID-19 Dashboard map does just that. Or even worse, the United States has over 41,000 postal codes and more than 73,000 census tracts. Can you imagine having to assemble those region maps manually, or the opportunity for typos when manually entering tens of thousands of data points?

Instead, save yourself the time, money, and hassle. We’ve made the Python script available for free on Bitbucket so you can configure region maps of any size in TerriaJS in just seconds. And if you ever run into issues, we’re always here to help you with any questions you may have. Happy mapping!

Top Photo: Snow-Capped Sierra Nevada Provide a Spectacular Backdrop for Lake Tahoe’s Brilliant Turquoise Waters
Meeks Bay, California – February, 2020

Python Tutorial: How to Create a Choropleth Map Using Region Mapping
https://blog.matthewgove.com/2021/07/23/python-tutorial-how-to-create-a-choropleth-map-using-region-mapping/
Fri, 23 Jul 2021 16:00:00 +0000

The post Python Tutorial: How to Create a Choropleth Map Using Region Mapping appeared first on Matthew Gove Blog.

Several weeks ago, you learned how to create stunning maps without a GIS program. You created a map of a hurricane’s cone of uncertainty using Python’s GeoPandas library and an ESRI Shapefile. Then you created a map of major tornadoes to strike various parts of the United States during the 2011 tornado season. You also generated two bar charts directly from the shapefile to analyze the number of tornadoes that occurred in each state that year. However, we did not cover one popular type of map: the choropleth map.

2011 tornado paths across the southeastern United States, created with Python GeoPandas.
2011 Tornado Tracks Across Dixie Alley

Today, we’re going to take our analysis to the next level. You’ll be given a table of COVID-19 data for each US State in CSV format for a single day during the COVID-19 pandemic. The CSV file has the state abbreviations, but does not include any geometry. Instead, you’ll be given a GeoJSON file that contains the state boundaries. You’ll link the data to the state boundaries through a process called region mapping and create a choropleth map of the data.

Why Do We Use Region Mapping to Create Choropleth Maps?

The main reason we use region mapping is for performance. When you use region mapping, you only need to load your geometry once, regardless of how many data points use that geometry. Each data point uses a unique identifier to “map” it to the geometry. You can use the ISO state or country codes, or you can make your own ID’s. Without region mapping, you need to load the geometry for each data point that uses it.

To show you the performance gains, let’s use COVID-19 data as an example. In our COVID-19 Dashboard’s Map, you can plot data by state for several countries. For Canada, the GeoJSON file that contains the provincial boundaries is 150 MB. We’re roughly 500 days into the COVID-19 pandemic. A quick back-of-the-envelope calculation shows just how much data you’d need to load without region mapping.

data_load_size = size_of_geojson * number_of_days
data_load_size = (150 MB) * (500 days)
data_load_size = 75,000 MB = 75 GB 

Keep in mind that the 75 GB is just for the provincial boundaries. It does not include any of the COVID-19 data. And it only grows bigger and bigger every day.

Region Mapping helps us efficiently load data into our COVID-19 dashboard.
Region Mapping and Vector Tiles Allow Us to Load Canada’s Provincial Boundaries into our COVID-19 Map using Less Than 2 MB of Data.

Using region mapping, you only need to load the provincial boundaries once. With the GeoJSON file, that’s only 150 MB. In our COVID-19 map, we actually take it a step further. Instead of GeoJSON format, we use Mapbox Vector Tiles (MVT), which is much more efficient for online maps. The MVT geometry for the Canadian provincial boundaries is only 2 MB. Compared to possibly 75 GB of geometry data, 2 MB wins hands down.
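
For anyone who wants to check the arithmetic, the comparison can be worked through in a few lines of Python. The sizes are the rounded figures quoted above.

```python
GEOJSON_SIZE_MB = 150   # Canadian provincial boundaries as GeoJSON
MVT_SIZE_MB = 2         # the same boundaries as Mapbox Vector Tiles
NUMBER_OF_DAYS = 500    # roughly 500 days of data

# Without region mapping, the geometry is loaded once per day of data.
without_region_mapping_mb = GEOJSON_SIZE_MB * NUMBER_OF_DAYS

# With region mapping and vector tiles, the geometry is loaded once.
reduction = 1 - MVT_SIZE_MB / without_region_mapping_mb

print("{:,} MB without region mapping".format(without_region_mapping_mb))
print("{:.3%} reduction".format(reduction))
```

With these inputs, the reduction works out to roughly 99.997%.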

What is a Choropleth Map?

A choropleth map displays statistical data on a map using shading patterns on predetermined geographical areas. Those geographic areas are almost always political boundaries, such as country, state, or county borders. They work great for representing variability of a given measurement across a region.

Choropleth Map of Worldwide COVID-19 data
A Sample Choropleth Map Showing New Daily Worldwide COVID-19 Cases on 14 July, 2021

An Overview of Creating a Choropleth Map in Python GeoPandas

The process we’ll be programming in our Python script is breathtakingly simple using GeoPandas.

  1. Read in the US State Boundaries from the GeoJSON file.
  2. Import the COVID-19 data from the CSV file.
  3. Link the data to the state boundaries using the ISO 3166-2 code (state abbreviations)
  4. Plot the data on a choropleth map.

Required Python Dependencies

Before we get started, you’ll need to install four Python modules. You can easily install them using either anaconda or pip. If you have already installed them, you can skip this step.

  • geopandas
  • pandas
  • matplotlib
  • contextily

The first item in our Python script is to import those four dependencies.

import geopandas
import pandas
import matplotlib.pyplot as plt
import contextily as ctx

Define A Few Constants That We’ll Use Throughout Our Python Script

There are a few values we’ll use throughout the script. Let’s define a few constants so we can easily reference them.

GEOJSON_FILE = "USStates.geojson"
CSV_FILE = "usa-covid-20210102.csv"

# 3857 - Mercator Projection
XBOUNDS = (-1.42e7, -0.72e7)
YBOUNDS = (0.26e7, 0.66e7)

The XBOUNDS and YBOUNDS constants define the bounding box for the map, in the x and y coordinates of the Mercator projection, which we’ll be using in this tutorial. They are not in latitude and longitude. We’ve set them so the left edge of the map is just off the west coast (~127°W) and the right edge is just off the east coast (~65°W). Likewise, the top of the map is just above the US-Canada border (~51°N), and the bottom edge is far enough south (~23°N) to include the Florida peninsula and the Keys.
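
To see how those numbers relate to latitude and longitude, here is a minimal sketch of the standard spherical Mercator formulas used by EPSG:3857. The lonlat_to_web_mercator helper is hypothetical, and the degree values are approximations back-calculated from the bounds above.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to EPSG:3857 x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Approximate corners of the map described above.
west, south = lonlat_to_web_mercator(-127.56, 22.74)
east, north = lonlat_to_web_mercator(-64.68, 50.88)

print("XBOUNDS ~ ({:.2e}, {:.2e})".format(west, east))
print("YBOUNDS ~ ({:.2e}, {:.2e})".format(south, north))
```

If you want to frame a different region, plug in your own corner coordinates and use the results as your bounds.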

Read in the US State Boundaries Using GeoPandas

GeoPandas is smart enough to be able to automatically figure out the file format of most geometry files, including ESRI Shapefiles and GeoJSON files. As a result, we can load the GeoJSON the exact same way as we loaded the ESRI Shapefiles in previous tutorials.

geojson = geopandas.read_file(GEOJSON_FILE)

Read in Data From the CSV File

You may have noticed that we did not import Python’s built-in csv module. That was done intentionally. Instead, we’ll use Pandas to read the CSV.

On the surface, it may look like the main benefit is that you only need a single line of code to read in the CSV data with Pandas. After all, it takes a block of code to do the same with Python’s standard csv library. However, you’ll really reap the benefits in the next step when we go to map the data to the state boundaries.

data = pandas.read_csv(CSV_FILE)

Map the CSV Data to the State Boundaries in the GeoJSON File

When you read the GeoJSON file in with the geopandas.read_file() function, GeoPandas stores it as a GeoDataFrame object, which is a subclass of the Pandas DataFrame. If you were to read in the CSV data using Python’s built-in csv library, Python would store the data as a csv.reader object.

Here’s where the magic happens. By reading in the CSV data with Pandas instead of the built-in csv library, Python also stores the CSV data as a Pandas DataFrame object. If we had used Python’s built-in csv library, mapping the CSV data to the state boundaries would be like trying to combine two recipes, where one was in imperial units, and the other was in metric units.

The Pandas developers built the DataFrame objects to be easily split, merged, and manipulated, which means that once again, we can do it with just a single line of code.

full_dataset = geojson.merge(data, left_on="STATE_ID", right_on="iso3166_2")

Let’s go over what that line of code means.

  • geojson.merge(data, ... ): Merge the CSV data stored in the data variable into the US State boundaries stored in the geojson variable.
  • left_on="STATE_ID": The property that contains the common unique identifier in the GeoJSON file is called STATE_ID.
  • right_on="iso3166_2": The property (column) that contains the corresponding unique identifier in the CSV data is called iso3166_2.
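
To make the region mapping step concrete, here is a plain-Python sketch of what joining on those two keys does conceptually. It is not how Pandas implements merge(), and the rows, geometries, and identifier values are made up for illustration.

```python
# Made-up stand-ins for the two tables; real geometries are omitted.
boundaries = [
    {"STATE_ID": "US-AZ", "geometry": "<polygon>"},
    {"STATE_ID": "US-CA", "geometry": "<polygon>"},
    {"STATE_ID": "US-TX", "geometry": "<polygon>"},
]
covid_rows = [
    {"iso3166_2": "US-CA", "new_cases": 100},
    {"iso3166_2": "US-AZ", "new_cases": 50},
]

# Index the data rows by their identifier for quick lookup.
data_by_id = {row["iso3166_2"]: row for row in covid_rows}

# Keep only boundaries that have a matching data row, like an inner join.
merged = [
    {**boundary, **data_by_id[boundary["STATE_ID"]]}
    for boundary in boundaries
    if boundary["STATE_ID"] in data_by_id
]

# Boundaries with no data row are left out of the merged result.
unmatched = [b["STATE_ID"] for b in boundaries if b["STATE_ID"] not in data_by_id]

print(len(merged))  # 2
print(unmatched)    # ['US-TX']
```

Pandas’ merge() defaults to an inner join, so a state with no matching row in the CSV drops out of the result rather than raising an error.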

The ISO 3166-2 Code: What’s in the Mapping Identifier?

In this tutorial, we’re using each state’s unique ISO 3166-2 code to map the CSV data to the state boundaries in the GeoJSON. So what exactly is an ISO 3166-2 code? It’s a unique code that contains the country code and a unique ID for each state. The International Organization for Standardization, or ISO, maintains a standardized set of codes that every country in the world uses.

In many countries, including the United States and Canada, the ISO 3166-2 codes use the same state and province abbreviations that their respective postal services use. As you’ll see in the table, though, not all countries do.

ISO 3166-2 Code    State/Province          Country
US-CA              California              United States
US-FL              Florida                 United States
US-NY              New York                United States
US-TX              Texas                   United States
CA-BC              British Columbia        Canada
CA-ON              Ontario                 Canada
AU-NSW             New South Wales         Australia
AU-WA              Western Australia       Australia
ZA-MP              Mpumalanga              South Africa
IT-BO              Bologna                 Italy
RU-CHE             Chelyabinskaya Oblast   Russia
IN-MH              Maharashtra             India
TH-50              Chiang Mai              Thailand
JP-34              Hiroshima               Japan
FR-13              Bouches-du-Rhône        France
AR-X               Córdoba                 Argentina
KG-C               Chuy                    Kyrgyzstan
Sample ISO 3166-2 Codes from Various Countries
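
Because every code follows the same country-hyphen-subdivision pattern, it’s easy to pull the two parts apart in Python. The split_iso3166_2 helper below is a hypothetical sketch and only checks the basic shape of the code, not whether the code actually exists.

```python
def split_iso3166_2(code):
    """Split a code such as 'US-CA' into (country, subdivision)."""
    country, separator, subdivision = code.partition("-")
    if not separator or not country or not subdivision:
        raise ValueError("Not an ISO 3166-2 style code: {!r}".format(code))
    return country, subdivision


# A few codes from the table above.
for code in ("US-CA", "AU-NSW", "TH-50", "AR-X"):
    print(code, "->", split_iso3166_2(code))
```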

Write a Function to Generate a Choropleth Map

Once the CSV data has been successfully linked to the state boundaries in the GeoJSON, everything is stored in a single Pandas DataFrame object. As a result, the code to plot the data will be nearly identical to the maps we created in previous GeoPandas tutorials.

Like the tornado track tutorial, you’ll be creating several different maps. To avoid running afoul of the DRY (Don’t Repeat Yourself) principle, let’s put the plotting code into a function that we can call.

First, let’s define the function. We’ll pass it three parameters: the merged dataset, the name of the column to plot, and the plot type to display in the title.

def choropleth_map(mapped_dataset, column, plot_type):

Initialize the Figure

Inside that function, let’s first initialize the figure that will hold our choropleth map.

ax = mapped_dataset.plot(figsize=(12,6), column=column, alpha=0.75, legend=True, cmap="YlGnBu", edgecolor="k")

There’s a lot in this step, so let’s unpack it.

  • figsize=(12,6): Plot should be 12 inches wide by 6 inches tall
  • column=column: Plot the column name that was passed to the choropleth_map() function.
  • alpha=0.75: Make the map 75% opaque (25% transparent) so you can see through it slightly.
  • legend=True: Include the color bar legend on the figure
  • cmap="YlGnBu": Use a Yellow-Green-Blue color map
  • edgecolor="k": Color the state outlines/borders black

Remove Axis Ticks and Labels From Your Choropleth Map

If we use the standard WGS-84 (EPSG:4326) projection to plot the continental US, the map comes out short and wide. For a better aspect ratio, we’ll convert the data into the Mercator Projection (EPSG:3857), for example by calling GeoPandas’ to_crs(epsg=3857) method on the merged dataset before plotting. Unfortunately, that means the x and y axes will no longer be in latitude and longitude, and will instead be in the coordinates of the Mercator Projection. To avoid any confusion, let’s just hide the labels on the x and y axes.

ax.set_xticks([])
ax.set_yticks([])

Set the Title of Your Choropleth Map

Next, we’ll set the title, exactly like we’ve done in previous tutorials.

title = "COVID-19: {} in the United States\n2 January, 2021".format(plot_type)
ax.set_title(title)

Because we’re only working with a specific date, we’ve hard-coded the date into the function. However, if you’re working with multiple dates, you can easily update the code so that the correct dates display on the maps.

Zoom the Map to Show the Continental United States

Now, let’s set the bounding box to show only the Lower 48.

ax.set_xlim(*XBOUNDS)
ax.set_ylim(*YBOUNDS)

Add the Basemap For Your Choropleth Map

Penultimately, add the basemap for the choropleth map. We’ll use the same Stamen TonerLite basemap that we used in both the Hurricane Dorian Cone of Uncertainty and the maps of the 2011 tornado tracks. We’ll get the projection from the dataset so we don’t have to worry about the basemap and the data being in different projections.

ctx.add_basemap(ax, crs=mapped_dataset.crs.to_string(), source=ctx.providers.Stamen.TonerLite, zoom=4)

Save Your Choropleth Map to a png File

Finally, save the plots to a png image file.

# Use the column name so each map gets its own file
output_path = "covid19_{}_usa.png".format(column)
plt.savefig(output_path)

Let’s Generate 4 Choropleth Maps

Now that we have our function to generate the choropleth maps, let’s make 4 maps of COVID-19 data on 2 January, 2021, which was the peak of the winter wave in the United States.

  • New Daily Cases
  • Total Cumulative Cases
  • New Daily Deaths
  • Total Cumulative Deaths

columns_to_plot = [
    "new_cases",
    "confirmed",
    "new_deaths",
    "dead"
]

plot_types = [
    "New Daily Cases",
    "Total Cumulative Cases",
    "New Daily Deaths",
    "Total Cumulative Deaths",
]

for column, plot_type in zip(columns_to_plot, plot_types):
    choropleth_map(full_dataset, column, plot_type)
    print("Successfully Generated Choropleth Map for {}...".format(plot_type))
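
If you’d like to see how zip() pairs each column with its plot type before running the full plotting code, this sketch prints the title and file name each map would get. Naming the files after the column is an assumption made for illustration, and no maps are actually drawn.

```python
columns_to_plot = ["new_cases", "confirmed", "new_deaths", "dead"]
plot_types = [
    "New Daily Cases",
    "Total Cumulative Cases",
    "New Daily Deaths",
    "Total Cumulative Deaths",
]

output_paths = []
titles = []

# zip() walks both lists in step, so each column gets its matching label.
for column, plot_type in zip(columns_to_plot, plot_types):
    title = "COVID-19: {} in the United States\n2 January, 2021".format(plot_type)
    # Assumption: one file per column, named after the column.
    output_path = "covid19_{}_usa.png".format(column)
    titles.append(title)
    output_paths.append(output_path)
    print("{} | {}".format(output_path, title.splitlines()[0]))
```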

Let There Be Maps

After running the script, you’ll find 4 choropleth maps in the script directory.

Download the Script and Run It Yourself

We encourage you to download the script from our Bitbucket Repository and run it yourself. Play around with it and see what other kinds of choropleth maps you can come up with.

Conclusion

Region mapping is an incredibly powerful way to efficiently display massive amounts of data on a map. For example, when we load the Canadian provincial data in our COVID-19 map, the combination of region mapping plus the Mapbox Vector Tiles has resulted in a 99.997% reduction in the size of the provincial boundary data being loaded. These savings are critical to the success of online GIS projects. Nobody in their right mind is going to sit around and wait for 75 GB of state boundaries to download every time the map loads.

Many people think that high-level tasks such as region mapping are confined to tools like ESRI ArcGIS. While Python GeoPandas is certainly not a replacement for a tool like ArcGIS, it’s a perfect solution for organizations that don’t have the budget for expensive software licenses or don’t do enough GIS work to require those licenses. If you’re ready, we can help you get started building maps with GeoPandas today.

If you’re ready to try a few exercises yourself, we’ve got a couple challenges for you.

Next Steps, Challenge 1:

Revisit our tutorial plotting 2011 tornado data. Revise that script so that instead of generating a map of the tornado tracks, you create a choropleth map of the number of tornadoes to strike each state in 2011. I’ll give you a hint to get started. You don’t need to use region mapping for this because the data is already embedded in the shapefile.

Next Steps, Challenge 2:

In the Bitbucket Repository, you’ll find a CSV File of COVID-19 data for each country for 2 January, 2021. Go online and find a GeoJSON or ESRI Shapefile of world country borders. Then use region mapping to create the same 4 choropleth maps we generated in this tutorial, except you should output a map of the world countries, not a map of US States. I’ve included all of the ISO Country Codes in the CSV file so you can use the Alpha 2, Alpha 3, or Numeric codes.

Top Photo: Beautiful Geology in Red Rock Country
Sedona, Arizona – August, 2016
