IKTPLUSS-IKT og digital innovasjon

EuroHPC project MAELSTROM, MAchinE Learning for Scalable meTeoROlogy and cliMate

Awarded: NOK 1.8 million

The primary goal of MAELSTROM is to enable weather and climate applications to better exploit machine learning (ML) and to improve predictions through optimized compute system design and a software framework that can handle vast amounts of data. MAELSTROM combines European expertise on ML, high-performance computing, and weather and climate applications from the Norwegian Meteorological Institute (MET Norway), the European Centre for Medium-Range Weather Forecasts, the Jülich Supercomputing Centre, 4cast GmbH & Co. KG, E4 Computer Engineering SpA, ETH Zürich, and the University of Luxembourg. This interdisciplinary team represents weather forecast centers, supercomputer system providers, and ML research communities.

MAELSTROM will provide a set of benchmark datasets, each about 10 terabytes in size, for a wide spectrum of potential applications across weather and climate, including observation processing, assimilation of observations into weather models, correction of weather model output, and tailored forecast products. The datasets offer the ML research communities an opportunity to develop, test, and validate new ML methods on relevant and realistic data. They also allow computer system providers to benchmark new system configurations and emerging hardware for energy efficiency, time-to-solution, and numerical accuracy.

In the first year of the project, MET Norway assembled a benchmark dataset containing short-range numerical weather forecasts for the Nordic countries and made it accessible to the machine learning community through the project website https://www.maelstrom-eurohpc.eu/. A description of the dataset and a recipe for setting up a simple ML model are also provided on the website. The community can therefore test their own ML models and compare their accuracy with that of the temperature forecasts on MET Norway's weather app Yr (yr.no). Currently, the dataset covers a small geographic region to make it easier to test on a personal laptop.

The next step in the project is to scale the dataset up to the whole Nordic region and, together with the other international partners, to develop and train ML models on supercomputers. The goal is for the ML solutions developed in the project to be used to improve weather forecasts on Yr.
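
As an illustration of what comparing a simple model against a reference forecast can look like, the following is a minimal sketch, not the recipe from the project website. It does not read the MAELSTROM dataset: the data are synthetic, and the bias parameters, the function names, and the choice of a linear correction scored by mean absolute error are all assumptions made for this example.

```python
import random


def mean_absolute_error(predictions, observations):
    """Average absolute difference between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(predictions, observations)) / len(observations)


def fit_linear_correction(forecasts, observations):
    """Least-squares fit of: observation = slope * forecast + intercept."""
    n = len(forecasts)
    mean_f = sum(forecasts) / n
    mean_o = sum(observations) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecasts, observations))
    var = sum((f - mean_f) ** 2 for f in forecasts)
    slope = cov / var
    intercept = mean_o - slope * mean_f
    return slope, intercept


def make_synthetic_data(n, rng):
    """Hypothetical stand-in data: temperatures (deg C) and a raw forecast
    with a systematic bias plus random noise. Not real forecast data."""
    observations = [rng.uniform(-20.0, 25.0) for _ in range(n)]
    forecasts = [0.9 * o + 1.5 + rng.gauss(0.0, 1.0) for o in observations]
    return forecasts, observations


rng = random.Random(0)  # fixed seed so the run is repeatable
train_f, train_o = make_synthetic_data(2000, rng)
test_f, test_o = make_synthetic_data(500, rng)

# Train the correction on one sample, evaluate on a separate one.
slope, intercept = fit_linear_correction(train_f, train_o)
corrected = [slope * f + intercept for f in test_f]

baseline_mae = mean_absolute_error(test_f, test_o)
corrected_mae = mean_absolute_error(corrected, test_o)
print(f"baseline MAE:  {baseline_mae:.2f} deg C")
print(f"corrected MAE: {corrected_mae:.2f} deg C")
```

The same pattern (fit on one sample, score on a held-out sample, compare against the uncorrected forecast) carries over to more capable ML models; only the fitting step changes.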

To develop Europe’s computer architecture of the future, MAELSTROM will co-design three elements: bespoke compute system designs for optimal application performance and energy efficiency, a software framework to optimise usability and training efficiency for machine learning at scale, and large-scale machine learning applications for the domain of weather and climate science.

In the compute system design work, the applications will be benchmarked across a range of computing systems with respect to energy consumption, time-to-solution, numerical precision, and solution accuracy. Customised compute systems will be designed that are optimised for application needs, both to strengthen Europe’s high-performance computing portfolio and to pull recent hardware developments, which are driven by general machine learning applications, toward the needs of weather and climate applications.

The MAELSTROM software framework will enable scientists to apply and compare machine learning tools and libraries efficiently across a wide range of computer systems. A user interface will link application developers with compute system designers, and automated benchmarking and error detection of machine learning solutions will be performed during the development phase. Tools will be published as open source.

The MAELSTROM machine learning applications will cover all important components of the weather and climate prediction workflow: the processing of observations, the assimilation of observations to generate initial and reference conditions, model simulations, post-processing of model data, and the development of forecast products. For each application, benchmark datasets with up to 10 terabytes of data will be published online for training and for machine learning tool development at the scale of the fastest supercomputers in the world. MAELSTROM machine learning solutions will serve as a blueprint for a wide range of future machine learning applications on supercomputers.

Funding scheme:

IKTPLUSS-IKT og digital innovasjon