Data Science Archives - Matthew Gove Blog
https://blog.matthewgove.com/portfolio/data-science/
Travel the World through Maps, Data, and Photography

USGS Data Series 741
https://blog.matthewgove.com/project/usgs-data-series-741/
Tue, 26 Apr 2022 16:19:29 +0000

The post USGS Data Series 741 appeared first on Matthew Gove Blog.

The U.S. Geological Survey (USGS) is a scientific agency that is part of the U.S. Government’s Department of the Interior. Its scientists study geology, the earth sciences, and natural hazards. While contracted to the USGS, I was one of the authors who published a report on carbon data in the Arctic Ocean.

What They Needed

  • A simple way to analyze and display the complex three-dimensional data both on the website and on a poster
  • A version of the report that could be distributed both on the web and on CD

The Solution

We had several large datasets containing up to 354,000 three-dimensional ocean chemistry measurements. We needed to be able to display this data on a map. Unfortunately, Excel cannot plot data on geographic maps, nor could ArcGIS plot three-dimensional maps at the time.

To solve the problem, I wrote a Mathematica script that consolidated over 50 two-dimensional figures into 4 three-dimensional figures. We plotted those figures on a three-dimensional bathymetric grid of the ocean floor, allowing the entire dataset to be displayed on a single poster. Coupled with the animation of the 3D dataset that I created with Mathematica, our poster was one of the most talked-about posters at the 2010 American Geophysical Union Fall Meeting.

The Publication

Using a template that the USGS provided, I also built them a simple HTML website that hosted everything from the project.

  • Raw Cruise Data
  • XML Metadata
  • GIS Shapefiles
  • Graphs, Figures, and Plots
  • Our published paper

We published the final report in 2013.

Meteotsunami Research
https://blog.matthewgove.com/project/meteotsunami-research/
Fri, 22 Apr 2022 15:54:21 +0000

The post Meteotsunami Research appeared first on Matthew Gove Blog.

Whenever we mention meteotsunamis to someone, we usually get one of two responses. Some are very intrigued by this new term and want to learn more. The others simply look at you like you have three heads.

What is a Meteotsunami?

A meteotsunami is an atmospherically induced tsunami caused by rapid changes in barometric pressure. Because meteotsunamis are independent of tectonic plate activity, they can strike anywhere. However, they’re much smaller than earthquake- or landslide-induced tsunamis. They’ll never cause the destruction you saw with the 2004 Indian Ocean Tsunami that devastated Indonesia and Thailand. Nor will they come anywhere close to reaching the intensity of the tsunami that caused the 2011 Fukushima nuclear meltdown in Japan.

However, meteotsunamis can cause extensive damage when they come into harbors where people moor their boats. This is exactly what happened when a meteotsunami struck Falmouth, Massachusetts on 13 June, 2013. While it did not affect anything on land, it did a surprising amount of damage to boats in Falmouth Harbor.

Meteotsunamis can cause extensive damage to docked boats, such as these in Woods Hole, Massachusetts

About Our Research

In September, 2013, the U.S. Geological Survey in Woods Hole, Massachusetts reached out to us to help with their research on meteotsunamis. Our extensive knowledge in meteorology plus 20 years of boating experience in Woods Hole proved to be the perfect skillset.

Our tasks for this project were simple.

  1. Compile a database of all known cases of meteotsunamis that struck the east coast of the United States between 2000 and 2013.
  2. Compile a database of any weather phenomena that may have caused those meteotsunamis. Such phenomena include fronts, squall lines, thunderstorms, hurricanes, tornadoes, and more.
  3. In the database of weather phenomena, record the following.
    1. Speed and compass direction in which each weather phenomenon moved (estimated from radar data)
    2. Observed wind speed and barometric pressure changes at the nearest NOAA weather observation station
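The speed and compass direction in item 3.1 can be estimated from two positions of the same feature at two different times. The post does not say how the radar estimates were actually made, so the following is only a minimal sketch under that assumption. The function name `speed_and_bearing` and its parameters are hypothetical, and the bearing it returns is the direction of travel, measured clockwise from north.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def speed_and_bearing(lat1, lon1, time1, lat2, lon2, time2):
    """Return (speed in km/h, direction of travel in degrees) between two fixes.

    Hypothetical helper: the two fixes stand in for positions of a squall
    line or storm cell read off radar imagery at two different times.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)

    # Haversine formula for the great-circle distance between the fixes
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    distance_km = 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

    # Initial bearing from the first fix to the second, clockwise from north
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    bearing = (math.degrees(math.atan2(y, x)) + 360) % 360

    hours = (time2 - time1).total_seconds() / 3600
    return distance_km / hours, bearing


# Illustrative values only, not data from the study
speed, bearing = speed_and_bearing(
    41.0, -70.67, datetime(2013, 6, 13, 12, 0),
    42.0, -70.67, datetime(2013, 6, 13, 14, 0),
)
```

Note that wind direction is conventionally reported as the direction the wind blows from, so a bearing computed this way should not be mixed with station wind directions without converting one of them.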

The End Result

We compiled a database of 223 meteotsunamis that struck the US east coast between 2000 and 2013. The link between meteotsunamis and weather phenomena that caused sudden pressure changes was undeniable. For more information, please read our publication. Despite the lengthy US Federal Government shutdown in October, 2013, we published our research paper three months ahead of schedule in June, 2014.

California Mercury Modeling
https://blog.matthewgove.com/project/california-mercury-modeling/
Fri, 22 Apr 2022 14:58:03 +0000

The post California Mercury Modeling appeared first on Matthew Gove Blog.

Access to a clean water supply is a key issue facing many people around the world today. In the United States, the State of California’s water supply is especially critical to maintain because so much of the United States’ domestic produce and agriculture is grown in California. Furthermore, California also needs to ensure that its 39 million residents have access to water.

What They Needed

To ensure a sustainable, clean water supply, California’s government routinely conducts studies of water quality throughout the state. The state government hired us to help model the presence and transportation of mercury through the Yolo Bypass and the San Francisco Bay Delta. This area is so critical because the watershed supplies the heart of the state’s agriculture belt in the Central Valley. It also provides clean water to both the Sacramento and San Francisco Bay Areas.

Scenery in California’s Central Valley, as seen from Interstate 5 – January, 2022

The Solution

The project consisted of several components:

  • 3 Mathematical Models
    • Mercury quantity and transport through the Yolo Bypass
    • Mercury quantity and transport through the San Francisco Bay Delta
    • Water flow rates through the Yolo Bypass
  • 11 Microsoft Access databases containing up to 24 million data points each
  • A GIS Application
  • A Model-Independent Parameter Estimation and Uncertainty Analysis Application

The End Result

We improved the connection between the four components listed above. That improvement boosted the efficiency and accuracy of model development, scenario development, and the model runs. Because we used the output from one model as input for the other, ensuring that output’s accuracy was very important. Indeed, incorrect output from one model wreaked havoc on the results of the other. To fix this, we used Python to debug the issue and develop a breakthrough solution.

By the time the project wrapped up, we had written 7 custom Python packages and over 30 scripts. The Python code improved process efficiencies by as much as 8,000%. As a result, the project finished by its scheduled completion date in August, 2020.

COVID-19 Dashboard and Map
https://blog.matthewgove.com/project/covid-19-dashboard-and-map/
Thu, 21 Apr 2022 22:46:25 +0000

The post COVID-19 Dashboard and Map appeared first on Matthew Gove Blog.

When COVID-19 first broke out in China in December, 2019, numerous institutions, including universities, governments, and private companies, built applications to track the disease as it rapidly spread around the world. Most of these applications featured a map that used circles or dots to track the disease from outbreak to epidemic to pandemic. Many also overloaded you with information to the point you didn’t know where to start.

We Identified a Common Problem with Other COVID-19 Dashboards

We quickly grew frustrated while using these applications. As the outbreak rapidly spread through Europe in late February, 2020, the circles on the maps got so big you couldn’t tell which circle was from which country. Once the pandemic started to accelerate out of control in the United States, many of our family, friends, and loved ones wanted to compare the outbreak in the United States to the outbreaks in Iran, Italy, and Spain. You couldn’t do that easily with tools that existed at the time. Nor could you compare time-series data on the state and provincial level on either a graph or map. We soon started getting the same requests from our friends, family, and colleagues in Canada. We knew we had to do something. So we built our own COVID-19 dashboard, model, and map.

In addition to the COVID-19 data at the country level, our dashboard also features state and province data for over 20 countries. The detailed and easy-to-read map also contains a timeline so you can view data on a map for any day since the pandemic started.

Our COVID-19 Model

As a mathematical modeler, I found that the start of the outbreak in the United States and Canada, combined with how little was known about the disease at the time, really piqued my curiosity. Could we accurately model the pandemic using the little bit we knew about COVID-19? We started playing around with the Susceptible – Infected – Removed, or SIR, model. Each tweak we made to the model only increased our curiosity. Before we knew it, we had built our own COVID-19 model. It forecasts both new and cumulative cases and deaths for any country in the world. Additionally, it works for states and provinces in 20-plus countries, as well as for territories, continents, and more.
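For readers unfamiliar with the SIR model, here is a minimal sketch of the textbook version, stepped forward one day at a time. It is not our tweaked model. The function name `simulate_sir` is hypothetical, and the transmission rate `beta`, removal rate `gamma`, and population figures in the example are illustrative assumptions rather than fitted values.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Step a textbook SIR model forward one day at a time.

    beta  : assumed transmission rate (new infections per infected person per day)
    gamma : assumed removal rate (1 / average infectious period in days)
    Returns a list of (susceptible, infected, removed) tuples, one per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]

    for _ in range(days):
        # People move from susceptible to infected, then from infected to removed
        new_infections = beta * s * i / population
        new_removals = gamma * i

        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))

    return history


# Illustrative run: 1 million people, 10 initial infections, 60 days
history = simulate_sir(1_000_000, 10, beta=0.3, gamma=0.1, days=60)
cumulative_cases = [infected + removed for _, infected, removed in history]
```

In this basic form, the cumulative case count on any day is the infected plus the removed compartments. Splitting the removed compartment into recoveries and deaths is one common way such a model is extended to forecast deaths as well.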

We set the following goals for the model’s total case predictions.

  • 2-Week Projections: 65% correct, with 80% of incorrect projections missing by 5,000 cases or less
  • 1-Month Projections: 50% correct, with 50% of incorrect projections missing by 5,000 cases or less
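Goals like these can be checked with a few lines of Python. The sketch below assumes that each projection is a low-to-high range and that "correct" means the actual case count fell inside that range. That definition, the 5,000-case tolerance argument, and the function name `score_projections` are assumptions made for illustration.

```python
def score_projections(projections, tolerance=5000):
    """Score a batch of projections against the goals above.

    projections : list of (low, high, actual) tuples, where low and high
                  bound the projected total cases (assumed format)
    tolerance   : how far outside the range a miss may fall and still
                  count as a near miss
    """
    correct = 0
    misses = []

    for low, high, actual in projections:
        if low <= actual <= high:
            correct += 1
        else:
            # Distance from the actual value to the nearest edge of the range
            misses.append(min(abs(actual - low), abs(actual - high)))

    near_misses = sum(1 for miss in misses if miss <= tolerance)

    return {
        "correct_rate": correct / len(projections) if projections else None,
        # None when there were no misses, so the rate is undefined
        "near_miss_rate": near_misses / len(misses) if misses else None,
    }
```

Under these assumptions, a two-week run would meet its goal when `correct_rate` is at least 0.65 and `near_miss_rate` is at least 0.80.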

The Model Has Performed Disturbingly Well

Between May, 2020 and February, 2022, the model outperformed all expectations. On average, each run got 75 to 90% of its two-week projections correct, while one-month projections were correct 60 to 70% of the time. On its hottest streak, it went 6 weeks without getting a single two-week prediction wrong, despite being run twice per week.

Matt’s Risk Index Combines All COVID-19 Risk Parameters into a Single Index…That You Can Model

If creating our own model wasn’t enough innovation, we took it a step further. Back in my storm chasing days, the strategy was to look for where the severe weather parameters best came together. You’d then target that area for the day’s chase. With COVID-19, it was the exact opposite. You once again wanted to look for where the COVID-19 numbers were all the highest, but avoid those areas instead.

Because there were so many different COVID-19 parameters to consider, I asked myself whether it was possible to combine them all into a single index. That index could immediately identify COVID-19 hotspots when you looked at them on a map. As a result, Matt’s Risk Index was born.

Matt’s Risk Index is essentially a weighted average of all those COVID-19 parameters. It makes hot spots stand out on a map like a sore thumb. Because it’s normalized for population, it will work on any geographic scale. And best of all, you can model it with our COVID-19 model.
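To illustrate the idea of a population-normalized weighted average, here is a minimal sketch. It is not the actual formula behind Matt’s Risk Index. The parameter names, the weights, and the per-100,000 scaling are all hypothetical.

```python
def risk_index(metrics, population, weights, per=100_000):
    """Combine several raw counts into one population-normalized index.

    metrics    : raw counts for one region, keyed by parameter name
    population : number of residents in the region
    weights    : relative importance of each parameter (hypothetical values)
    per        : normalization base, here a rate per 100,000 residents
    """
    total_weight = sum(weights.values())

    # Convert each count to a per-capita rate, then take the weighted average
    weighted_sum = sum(
        weights[name] * metrics[name] / population * per
        for name in weights
    )
    return weighted_sum / total_weight


# Hypothetical parameters and weights, for illustration only
weights = {"new_cases": 2.0, "active_cases": 1.0}
county = risk_index({"new_cases": 500, "active_cases": 2_000}, 1_000_000, weights)
state = risk_index({"new_cases": 5_000, "active_cases": 20_000}, 10_000_000, weights)
```

Because every count is converted to a rate before averaging, the `county` and `state` values above come out equal even though the raw counts differ tenfold. That property is what lets an index like this compare regions of any size.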

Matt’s Risk Index for the United States in September, 2021

Matt’s Risk Index Keeps You Safe

Having Matt’s Risk Index proved invaluable when I drove across the United States at the peak of the winter wave in February, 2021, before vaccines were available to the general public. In fact, I developed the risk index specifically for that trip. Being able to evaluate the risk at the county level and adjust your route to avoid hotspots was incredibly powerful.

Fast forward a year. The risk index kept me safe during my three-month trip through the western United States during the winter of 2021 – 2022. Despite the omicron variant exploding to more than 800,000 cases per day and my 10,000-mile trip taking me through 20 states, I did not catch Covid (or anything else). In fact, nobody that I know who has used Matt’s Risk Index on our COVID-19 Dashboard has come down with Covid.
