Aardvark Weather: end-to-end data-driven weather forecasting

“Aardvark Weather: end-to-end data-driven weather forecasting,” by Anna Vaughan, Richard Turner and colleagues at the University of Cambridge and the Alan Turing Institute, with the European Centre for Medium-Range Weather Forecasts, was posted to arXiv in 2024 and later published in Nature. It addresses a limitation of earlier machine-learning weather models: systems like GraphCast and Pangu-Weather still depend on a conventional data-assimilation pipeline to turn raw sensor readings into the gridded initial conditions they consume.

Aardvark instead learns the whole chain. It takes raw observations from satellites, weather stations, balloons and other sources directly, and produces both global gridded forecasts and local station forecasts, without the traditional numerical-weather-prediction machinery in between. The paper reports that Aardvark’s global forecasts beat operational baselines on several variables and lead times, that its local forecasts remain skillful at lead times of up to about ten days, and that it achieves this with roughly 8 percent of the input data that conventional systems use and about three orders of magnitude less compute.

The result matters because the data-assimilation step it replaces is itself a major source of cost and complexity in operational forecasting, and because a fully learned pipeline could make accurate forecasting far cheaper and more accessible, including in regions that cannot run their own supercomputer-based systems.

Last verified June 7, 2026