Efficient high-dimensional variational data assimilation with machine-learned reduced-order models
Our paper, "Efficient high-dimensional variational data
assimilation with machine-learned reduced-order models," was
just accepted for publication in Geoscientific Model Development
(GMD).
Thanks to our collaborators at Argonne National Laboratory:
Romit Maulik, Vishwas Rao, Jiali Wang,
Emil Constantinescu, Bethany Lusch,
Prasanna Balaprakash, Ian Foster, and Rao Kotamarthi.
Plain-language description
A physical system, such as the atmosphere, can be characterized
by our prior knowledge, in the form of a mathematical model, plus
a set of observations from various sensors.
Data assimilation (DA) combines our prior knowledge of the system,
i.e., the mathematical model, with observations to estimate the state
of the system at a desired time.
In this paper, we propose a data assimilation approach that uses
a machine learning emulator to replace the (usually expensive)
mathematical model used in four-dimensional variational data
assimilation (also referred to as 4D-Var). Our results indicate that
machine-learning-assisted data assimilation is faster than
traditional model-based data assimilation by 4 orders of
magnitude, allowing computations to be performed on a workstation
rather than a dedicated high-performance computer.
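For readers who want the formula behind 4D-Var, its standard textbook
(strong-constraint) form seeks the initial state that minimizes the cost
function below. The notation is the conventional one and is an assumption
here, not necessarily the notation used in the paper: x_0 is the initial
state, x_b the prior (background) estimate, B and R_k the background and
observation error covariances, y_k the observations at time k, H_k the
observation operator, and M_{0 \to k} the model that advances the state
from time 0 to time k. In the approach described above, the emulator takes
the place of M.

```latex
J(\mathbf{x}_0) =
  \tfrac{1}{2}\,(\mathbf{x}_0 - \mathbf{x}_b)^{\mathsf T}
  \mathbf{B}^{-1} (\mathbf{x}_0 - \mathbf{x}_b)
  + \tfrac{1}{2} \sum_{k=0}^{K}
  \bigl(\mathbf{y}_k - \mathcal{H}_k(\mathcal{M}_{0 \to k}(\mathbf{x}_0))\bigr)^{\mathsf T}
  \mathbf{R}_k^{-1}
  \bigl(\mathbf{y}_k - \mathcal{H}_k(\mathcal{M}_{0 \to k}(\mathbf{x}_0))\bigr)
```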
Abstract
Data assimilation (DA) in geophysical sciences remains the cornerstone
of robust forecasts from numerical models. Indeed, DA plays a crucial
role in the quality of numerical weather prediction and is a crucial
building block that has allowed dramatic improvements in weather
forecasting over the past few decades. DA is commonly framed in
a variational setting, where one solves an optimization problem
within a Bayesian formulation using raw model forecasts as a prior
and observations as likelihood. This leads to a DA objective function
that needs to be minimized, where the decision variables are the
initial conditions specified to the model. In traditional DA, the
forward model is numerically and computationally expensive. Here
we replace the forward model with a low-dimensional, data-driven,
and differentiable emulator. Consequently, gradients of our
DA objective function with respect to the decision variables
are obtained rapidly via automatic differentiation. We demonstrate
our approach by performing an emulator-assisted DA forecast
of geopotential height. Our results indicate that emulator-assisted
DA is faster than traditional equation-based DA forecasts by
4 orders of magnitude, allowing computations to be performed
on a workstation rather than a dedicated high-performance computer.
In addition, we describe accuracy benefits of emulator-assisted
DA when compared to simply using the emulator for forecasting
(i.e., without DA). Our overall formulation is denoted AIEADA
(Artificial Intelligence Emulator-Assisted Data Assimilation).
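The idea of minimizing a DA objective through a differentiable emulator
can be illustrated with a minimal JAX sketch. This is not the paper's
code and not the AIEADA implementation. Everything in it is an assumption
made for illustration: the toy emulator (a fixed tanh map on a
3-dimensional reduced state), the identity observation operator, the
scalar error variances, the synthetic observations, and all names
(emulator_step, rollout, cost, z_b, and so on). It only shows how
automatic differentiation supplies the gradient of the objective with
respect to the initial condition.

```python
import jax
import jax.numpy as jnp

# Hypothetical stand-in for a trained emulator: a fixed nonlinear map on a
# 3-dimensional reduced state. A real emulator would be a trained network.
W = jnp.array([[0.9, -0.4, 0.0],
               [0.4, 0.9, 0.1],
               [0.0, -0.1, 0.8]])

def emulator_step(z):
    """Advance the reduced state by one time step."""
    return jnp.tanh(W @ z)

def rollout(z0, n_steps):
    """Apply the emulator n_steps times; returns shape (n_steps, 3)."""
    def body(z, _):
        z_next = emulator_step(z)
        return z_next, z_next
    _, trajectory = jax.lax.scan(body, z0, None, length=n_steps)
    return trajectory

def cost(z0, z_b, obs, b_var, r_var):
    """4D-Var-style objective: background term plus observation misfit.

    Assumes scalar error variances and that the full reduced state is
    observed at every step (identity observation operator).
    """
    background = jnp.sum((z0 - z_b) ** 2) / b_var
    misfit = jnp.sum((obs - rollout(z0, obs.shape[0])) ** 2) / r_var
    return 0.5 * (background + misfit)

# Gradient with respect to the initial condition z0 (the first argument),
# obtained by automatic differentiation through the emulator rollout.
grad_cost = jax.jit(jax.grad(cost))

# Synthetic setup (all values are made up for illustration).
z_true = jnp.array([0.5, -0.3, 0.8])            # "true" initial state
z_b = z_true + jnp.array([0.3, -0.2, 0.25])     # perturbed background guess
noise = 0.01 * jax.random.normal(jax.random.PRNGKey(0), (10, 3))
obs = rollout(z_true, 10) + noise               # noisy observations
b_var, r_var = 10.0, 1.0                        # assumed error variances

# Plain gradient descent on the initial condition, starting from z_b.
z = z_b
for _ in range(1000):
    z = z - 0.02 * grad_cost(z, z_b, obs, b_var, r_var)
z_analysis = z

initial_cost = cost(z_b, z_b, obs, b_var, r_var)
final_cost = cost(z_analysis, z_b, obs, b_var, r_var)
background_error = jnp.linalg.norm(z_b - z_true)
analysis_error = jnp.linalg.norm(z_analysis - z_true)
print("cost:", float(initial_cost), "->", float(final_cost))
print("error:", float(background_error), "->", float(analysis_error))
```

Fixed-step gradient descent is used here only to keep the sketch short.
Variational DA codes commonly rely on quasi-Newton optimizers such as
L-BFGS, and the same autodiff gradient could be passed to one of those.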