How to Create Your Own AI Weather Forecast

Author:Murphy | View: 22184 | Time: 2025-03-23 12:31:54

Data-driven weather forecasts with pre-trained models are inexpensive to create and can provide forecasts with comparable accuracy to established numerical weather models. Several companies and research labs have developed AI weather models, including:

PanguWeather [Huawei]
FourCastNet [NVIDIA]
GraphCast [Google DeepMind]

The European Centre for Medium-Range Weather Forecasts (ECMWF) provides routines for generating weather forecasts with these models [1]. The model inference can be performed even on a laptop, although the use of a GPU is recommended.

In this post, I am going to

Show you how to create your own AI weather forecast
Compare the inference times for the models on GPU and on CPU
Visualize the weather forecast for PanguWeather and FourCastNet, including temperature, water vapor, and the jet stream

Background information

Traditionally, weather forecasting relies on numerical weather models that are solved on a global grid. This requires large computing resources, and only a few weather services in the world are equipped to generate global weather forecasts.

AI weather models are trained on the weather of the past, called reanalysis data. With the trained model and the current weather state, weather forecasts can be generated using only a fraction of the computational resources required by a numerical weather forecast. The schematic shows how AI weather forecasting predicts future weather using current weather as input:

AI weather forecast schematic. Left: Wind field at 00:00 UTC, 01/01/2021. Right: Wind field at 23:00 UTC, 01/01/2021. Data: ERA5 via Copernicus Climate Data Store [3]. Image: Author.

All three AI weather models covered in this article provide weather forecasts with a spatial resolution of 0.25° (roughly 25 kilometers) and a temporal resolution of 6 hours. For more background on the impact of AI in weather forecasting, see my previous article:

The AI revolution in weather prediction
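
To make the 0.25° resolution concrete, here is a minimal sketch that computes the size of a regular global latitude-longitude grid and the east-west width of one grid cell at a few latitudes. The Earth radius and the helper names are my own assumptions for illustration; they are not part of the ai-models package.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)
RESOLUTION_DEG = 0.25     # grid spacing of the models discussed here

def grid_shape(resolution_deg):
    """Number of (latitude, longitude) points on a regular global grid."""
    n_lat = int(round(180 / resolution_deg)) + 1  # includes both poles
    n_lon = int(round(360 / resolution_deg))      # 0° and 360° coincide
    return n_lat, n_lon

def cell_width_km(resolution_deg, latitude_deg):
    """East-west extent of one grid cell at the given latitude."""
    km_per_deg = 2 * math.pi * EARTH_RADIUS_KM / 360
    return km_per_deg * resolution_deg * math.cos(math.radians(latitude_deg))

print(grid_shape(RESOLUTION_DEG))
for lat in (0, 45, 60):
    print(lat, round(cell_width_km(RESOLUTION_DEG, lat), 1))
```

The cell width shrinks towards the poles, so the figure of about 25 kilometers is best read as a typical value rather than a constant.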

Getting started

Code and environment

Get the code to run the pre-trained weather models from the ECMWF GitHub page [1] and follow the installation instructions. I used v0.2.5 for this tutorial. Set up a conda environment for the required Python packages:

conda create -n ai-models python=3.10
conda activate ai-models
conda install cudatoolkit
pip install ai-models

To install the available AI models, use

pip install ai-models-panguweather
pip install ai-models-fourcastnet

GraphCast requires a special installation routine that is covered below.

Pre-trained model weights

The pre-trained model weights for PanguWeather, FourCastNet, and GraphCast need to be downloaded:

ai-models --download-assets --assets assets-panguweather panguweather
ai-models --download-assets --assets assets-fourcastnet  fourcastnet
ai-models --download-assets --assets assets-graphcast    graphcast

Initialization data

The AI weather models need to be initialized with current weather data in order to make predictions. There are two ways to get the data: from the ECMWF service MARS, or from the Copernicus Climate Data Store (CDS). Both services provide the initialization data for free for non-commercial use.

I am using reanalysis data from the CDS for the purpose of this tutorial. Follow the instructions on their website to obtain an API key to access the data [2]. Note that reanalysis data is made available with a delay of about five days.
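
Because of this delay, the most recent usable initialization date lies a few days in the past. The following sketch computes such a date and formats it as YYYYMMDD, the form that the ai-models commands in this article pass to --date. The five-day delay is the approximate value quoted above, and the helper names are hypothetical.

```python
from datetime import date, timedelta

REANALYSIS_DELAY_DAYS = 5  # approximate delay quoted above; an assumption, not a guarantee

def latest_init_date(today, delay_days=REANALYSIS_DELAY_DAYS):
    """Most recent date for which reanalysis data is likely to be available."""
    return today - timedelta(days=delay_days)

def as_cli_date(d):
    """Format a date as YYYYMMDD, the form used by the --date option."""
    return d.strftime("%Y%m%d")

init = latest_init_date(date(2023, 9, 26))
print(as_cli_date(init))
```

In practice it is worth choosing a date slightly older than this estimate, since the actual availability can vary.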

Creating the AI weather forecast

We are now ready to generate our very own AI weather forecast. All that remains is to select a time and date for the initialization, and start model inference with the pre-trained model.

PanguWeather

To request a PanguWeather forecast starting on 09/20/2023, 00:00 UTC, based on CDS initialization data, use:

ai-models --input cds --date 20230920 --time 0000 --assets assets-panguweather panguweather

The logs are fairly comprehensive. The pre-trained model pangu_weather_6.onnx produces a weather forecast 6 hours ahead, and here it is applied iteratively 40 times to reach a total of 240 hours (10 days).

2023-09-26 12:06:45,724 INFO Loading assets-panguweather/pangu_weather_24.onnx: 26 seconds.
2023-09-26 12:07:10,289 INFO Loading assets-panguweather/pangu_weather_6.onnx: 24 seconds.
2023-09-26 12:07:10,289 INFO Model initialisation: 52 seconds
2023-09-26 12:07:10,290 INFO Starting inference for 40 steps (240h).
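
The iterative application described above can be sketched as a simple loop in which each output becomes the next input. This is only an illustration of the idea: dummy_step is a hypothetical stand-in for the neural network, and the real package may organize the loop differently (the logs show that a 24-hour model is loaded as well).

```python
from datetime import datetime, timedelta

STEP_HOURS = 6     # lead time of one model application
TOTAL_HOURS = 240  # 10-day forecast

def rollout(initial_state, step_fn, init_time,
            total_hours=TOTAL_HOURS, step_hours=STEP_HOURS):
    """Apply step_fn repeatedly, feeding each output back in as the next input."""
    n_steps = total_hours // step_hours
    state, forecasts = initial_state, []
    for i in range(1, n_steps + 1):
        state = step_fn(state)
        valid_time = init_time + timedelta(hours=i * step_hours)
        forecasts.append((valid_time, state))
    return forecasts

# Hypothetical stand-in for the model: it only counts how often it was applied.
def dummy_step(state):
    return state + 1

forecasts = rollout(0, dummy_step, datetime(2023, 9, 20, 0, 0))
print(len(forecasts))    # number of forecast steps
print(forecasts[-1][0])  # valid time of the last step
```

Because every step starts from the previous prediction, small errors can accumulate over the 40 steps.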

For each iteration, the logs tell the user how much time was spent. In my experiment, the ONNX runtime could not find the GPU, so inference was done on the CPU and took about 3 minutes 15 seconds per weather prediction step.

2023-09-26 12:10:27,260 INFO Done 1 out of 40 in 3 minutes 16 seconds (6h), ETA: 2 hours 11 minutes 18 seconds.

FourCastNet

To generate a forecast with FourCastNet for the same initialization time, while also downloading the trained model weights to the sub-directory assets-fourcastnet, use:

ai-models --download-assets --assets assets-fourcastnet --input cds --date 20230920 --time 0000 fourcastnet

Again, the logs give us comprehensive information on the model inference. This time, the GPU was found (the log reports the device 'CUDA'), and inference was much faster, taking only about 2 minutes to generate the 10-day forecast.

2023-09-26 13:23:48,633 INFO Using device 'CUDA'. The speed of inference depends greatly on the device.
2023-09-26 13:23:59,710 INFO Loading assets-fourcastnet/backbone.ckpt: 13 seconds.
2023-09-26 13:24:10,673 INFO Loading assets-fourcastnet/precip.ckpt: 10 seconds.
2023-09-26 13:24:10,733 INFO Model initialisation: 46 seconds
2023-09-26 13:24:10,733 INFO Starting inference for 40 steps (240h).
2023-09-26 13:24:14,247 INFO Done 1 out of 40 in 3 seconds (6h), ETA: 2 minutes 20 seconds.
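
To compare the inference times, the per-step durations can be read from the progress lines in the logs. The sketch below parses the two "Done 1 out of 40" lines quoted in this article and extrapolates to the full forecast. The regular expression is my own assumption about the log format, based only on these two lines, and the two runs use different models, so the comparison is indicative rather than like-for-like.

```python
import re

# Matches e.g. "Done 1 out of 40 in 3 minutes 16 seconds" or "... in 3 seconds".
PATTERN = re.compile(
    r"Done (?P<done>\d+) out of (?P<total>\d+) in "
    r"(?:(?P<min>\d+) minutes? )?(?P<sec>\d+) seconds?"
)

def seconds_per_step(log_line):
    """Return (seconds for the reported step, total number of steps)."""
    m = PATTERN.search(log_line)
    if m is None:
        raise ValueError("not a progress line")
    seconds = int(m.group("min") or 0) * 60 + int(m.group("sec"))
    return seconds, int(m.group("total"))

cpu_line = "INFO Done 1 out of 40 in 3 minutes 16 seconds (6h), ETA: 2 hours 11 minutes 18 seconds."
gpu_line = "INFO Done 1 out of 40 in 3 seconds (6h), ETA: 2 minutes 20 seconds."

cpu_s, n_steps = seconds_per_step(cpu_line)
gpu_s, _ = seconds_per_step(gpu_line)
print(cpu_s, gpu_s)
print(round(cpu_s * n_steps / 3600, 1), "hours on CPU (extrapolated)")
print(round(gpu_s * n_steps / 60, 1), "minutes on GPU (extrapolated)")
```

The extrapolation from a single step is rough, because the logged durations are rounded to whole seconds.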

GraphCast

For GraphCast, we first need to follow the installation steps in https://github.com/ecmwf-lab/ai-models-graphcast. Then, we can generate the weather forecast using:

ai-models --download-assets --assets assets-graphcast --input cds --date 20230920 --time 0000 graphcast

At the time of writing, this command failed with an error that is tracked as an open issue on GitHub (Issue #10 in [1]), and I was unable to complete a forecast with GraphCast.

Forecast output files

The AI weather models produce output files in GRIB format. For a 240-hour forecast, the file sizes are:

2.3 GB with FourCastNet (3 pressure levels: 850 hPa, 500 hPa, 250 hPa)
5.4 GB with PanguWeather (13 pressure levels)

Both models output key meteorological variables:

Temperature
Wind speed and direction

PanguWeather also provides geopotential and humidity, while FourCastNet provides water vapor content.

Generate plots

Proper evaluation and benchmarking of these AI-generated forecasts is a complex topic beyond the scope of this article. Here, we focus on the visual comparison between the two AI weather models. To visually analyze the weather forecasts, we select the temperature variable and the forecast 24 hours after initialization. This is the template for generating the plots:

import numpy as np
import xarray as xr
import seaborn as sns
from matplotlib import pyplot as plt
import cartopy.crs as ccrs

# load data
ds_fourcast = xr.open_dataset('fourcastnet.grib')

ix_step = 3  # 0 = first step in weather prediction, 3 = 24 hours after initialization
ix_level = 0 # 0 = first pressure level in the file (850 hPa, see colorbar label)
temp_degC = ds_fourcast['t'][ix_step, ix_level] - 273.15 # subtract 273.15 to convert from Kelvin to Celsius.

# plot settings
sns.set_style('ticks')
tmin = -45
tmax = +45
nlev = 19

# generate figure
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())
ax.set_global()
ax.coastlines('110m', alpha=0.1)

# contour plot
img = ax.contourf(ds_fourcast.longitude.values, ds_fourcast.latitude.values, 
                  temp_degC.values, 
                  levels = np.linspace(tmin, tmax, nlev),
                  cmap='RdBu_r',
                  transform=ccrs.PlateCarree()
                 )

# hours passed since initialization
dt_hours = (ds_fourcast['valid_time'][ix_step] - ds_fourcast['time']).values / np.timedelta64(1, 'h')

# annotation text
ax.text(-180, 90, 
        f'After {dt_hours} hours ({ds_fourcast["valid_time"][ix_step].values})', 
        transform=ccrs.PlateCarree(),
        va='bottom', ha='left',
       )

plt.colorbar(img, orientation='horizontal', label='850 hPa temperature (°C)')
plt.show()
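
The relation between the step index and the lead time can be checked with a few lines of NumPy. This sketch assumes, as the plotting template does, that index 0 is the first 6-hour forecast step; the helper names are hypothetical.

```python
import numpy as np

STEP_HOURS = 6  # temporal resolution of the forecasts in this article

def lead_time_hours(ix_step, step_hours=STEP_HOURS):
    """Lead time for a zero-based step index (index 0 is the first forecast step)."""
    return (ix_step + 1) * step_hours

def timedelta_to_hours(delta):
    """Convert a numpy timedelta64 to a float number of hours."""
    return delta / np.timedelta64(1, "h")

init_time = np.datetime64("2023-09-20T00:00")
valid_time = init_time + np.timedelta64(lead_time_hours(3), "h")
print(lead_time_hours(3))
print(timedelta_to_hours(valid_time - init_time))
```

Dividing by np.timedelta64(1, "h") avoids having to know the internal time unit of the timedelta.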

Temperature

The following figure compares the temperature at the 850 hPa level for PanguWeather and FourCastNet. This level is not directly at the surface, but it is the only level included in both output files. Visually, we see that the temperature fields show similar patterns.

850 hPa temperature after 24 hours. Image: Author.

Water vapor

FourCastNet provides the total atmospheric water vapor mass, which can be compared to the amount of clouds carrying rain droplets. In the following figure, we compare this quantity for four different snapshots: after 1 day (24 hours), 3 days (72 hours), 5 days (120 hours), and out to 10 days (240 hours).

Total atmospheric water vapor after 1, 3, 5, and 10 days of forecasting with FourCastNet. Image: Author.

We observe a phenomenon that is well known in Deep Learning: Because the model was trained using root mean squared error (RMSE), it has a tendency to produce smooth fields. Over time, as the model is applied iteratively, the predictions become less pronounced. After 10 days, we even observe artifacts at the southern tip of South America. Note that the AI weather models currently do not use generative AI techniques, as they focus more on physical accuracy than visual appeal.
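
The smoothing effect of a squared-error loss can be illustrated with a toy example that is unrelated to the actual training of these models. Assume a sharp feature that ends up at one of two equally likely positions: the prediction with the lowest expected squared error is the average of both outcomes, which is flatter than either of them. All names and numbers below are made up for illustration.

```python
import numpy as np

x = np.linspace(-10, 10, 201)

def bump(center, width=1.0):
    """A sharp Gaussian feature, standing in for e.g. a moisture plume."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

# Two equally likely futures: the feature ends up at x = -2 or at x = +2.
outcomes = np.stack([bump(-2.0), bump(+2.0)])

def expected_mse(prediction):
    """Mean squared error of one prediction, averaged over both outcomes."""
    return np.mean((outcomes - prediction) ** 2)

sharp_guess = outcomes[0]             # commit to one of the sharp outcomes
smooth_guess = outcomes.mean(axis=0)  # average of both outcomes

print(expected_mse(sharp_guess), expected_mse(smooth_guess))
print(sharp_guess.max(), smooth_guess.max())
```

The averaged prediction has the lower expected error but only about half the peak amplitude, which mirrors the loss of detail seen at longer lead times.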

Wind speed

PanguWeather provides wind speed at different pressure levels. We choose the 250 hPa pressure level, which corresponds to the level where the jet stream is most present. The jet stream is a circumpolar air flow that influences the weather in mid-latitude regions such as the USA and Europe. The figure shows the wind speed with a focus on the Southern Hemisphere Jet Stream, which was more pronounced during the time of my forecast.

Southern Hemisphere jet stream forecast with PanguWeather. Image: Author.

To create an orthographic projection and calculate the wind speed, we used the following template:

import numpy as np
import xarray as xr
import seaborn as sns
from matplotlib import pyplot as plt
import cartopy.crs as ccrs

# load data
ds_pangu = xr.open_dataset('panguweather.grib')  # adjust the path to your PanguWeather output file

ix_step = 3  # 0 = first step in weather prediction
ix_level = 8 # index into the pressure levels: 0 = lowest level (closest to the surface), 2 = 850 hPa, 8 = 250 hPa
windspeed = (ds_pangu['u'][ix_step, ix_level]**2 + ds_pangu['v'][ix_step, ix_level]**2)**0.5 # magnitude of the (u, v) wind vector

# plot settings
sns.set_style('ticks')
tmin = 0
tmax = 150
nlev = 16

# generate figure
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection=ccrs.Orthographic(central_latitude=-30))
ax.set_global()
ax.coastlines('110m', alpha=1.0)

# contour plot
img = ax.contourf(ds_pangu.longitude.values, ds_pangu.latitude.values, 
                  windspeed.values, 
                  levels = np.linspace(tmin, tmax, nlev),
                  cmap='gist_ncar',
                  transform=ccrs.PlateCarree()
                 )

# hours passed since initialization
dt_hours = (ds_pangu['valid_time'][ix_step] - ds_pangu['time']).values / np.timedelta64(1, 'h')
print(dt_hours)

plt.colorbar(img, orientation='horizontal', label='Wind speed (m/s)', shrink=0.5)
plt.show()
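
Wind speed is the magnitude of the horizontal wind vector, computed from its eastward component u and its northward component v. As a quick sanity check outside of the plotting template, here is a minimal NumPy sketch with made-up values; the variable name v for the northward component is an assumption based on the usual naming convention.

```python
import numpy as np

def wind_speed(u, v):
    """Magnitude of the horizontal wind vector from its two components."""
    return np.hypot(u, v)

u = np.array([3.0, 0.0, -30.0])   # eastward component in m/s
v = np.array([4.0, -10.0, 40.0])  # northward component in m/s
print(wind_speed(u, v))
```

np.hypot works element-wise, so the same function can be applied to full two-dimensional fields.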

Summary

In this article, we have shown how to use the AI-Models repository provided by ECMWF [1] to generate AI weather forecasts. Using initialization data from the Climate Data Store, we were able to generate forecasts using two models: Huawei's PanguWeather and NVIDIA's FourCastNet. In my experiments, generating a 10-day weather forecast with a pre-trained model took about 2 minutes on a GPU (FourCastNet) and a little over 2 hours on a CPU (PanguWeather). The cost of inference is very low compared to the immense computational effort required to generate a numerical weather forecast.

It is now possible for individual consumers and less well-funded weather services to generate their own AI weather forecasts, at least with the spatial and temporal resolution provided by the ERA5 training data. The ECMWF repository provides easy access to pre-trained models.

References

[1] https://github.com/ecmwf-lab/ai-models (version 0.2.5)
[2] Copernicus Climate Data Store API: https://cds.climate.copernicus.eu/api-how-to
[3] Hersbach, H., et al. (2017): Complete ERA5 from 1940: Fifth generation of ECMWF atmospheric reanalyses of the global climate. Copernicus Climate Change Service (C3S) Data Store (CDS). DOI: 10.24381/cds.143582cf (Accessed on 26-SEP-2023)
