Learning from examples is central to science and engineering. Methods that address this problem have proven immensely powerful in the last decade, thanks to the availability of more data, faster computing hardware and improved algorithms. The impact is seen in numerous areas, including language processing, neuroscience, and epidemiology.
In the next decade, data-driven methods could have a similar impact on physical simulations. For instance, one can envision algorithms that aid the derivation of predictive and generalisable models of complex physics from data. While both physical modelling and data-driven methods are active independent research areas, relatively little attention has been paid to the intersection of the two.
The overall ambition of the DataSim project is therefore to develop next-generation physical simulation models that exploit both modern machine learning techniques and the mathematical theory of classical approaches to physical modelling. Our aim is to achieve this goal through the development of new mathematics, algorithms and software. Of particular interest is the stability of the combined models. The secondary objective of the DataSim project is to educate a new generation of researchers working at the intersection of machine learning and scientific computing.
Modern applications in computational science are governed by multi-scale and partially unknown physical processes that are too complex or too poorly understood to be explicitly represented by the governing equations or numerical methods. The plummeting cost of sensors, computational power, and data storage in the last decade offers new opportunities for data-driven modelling of such physical systems. However, while both physical modelling and purely data-driven methods are active independent research areas, little attention has been paid to the intersection of the two. To enable a shift towards simulation models that are either parametrised or controlled by data-driven algorithms, there is a pressing need for new mathematical tools, new numerical abstractions and new algorithms.
The ambition of DataSim is therefore to design efficient algorithms that enable data-driven simulation of systems described by partial differential equations. Specifically, we will propose simulation models that consist of partial differential equations coupled to machine learning models built using artificial neural networks. We will then design new algorithms for model identification, and adaptive control methods for partially unknown and dynamically changing physical systems. Based on these algorithms, we will develop a general software framework for specifying, evaluating and training such models. The capabilities of our approach will be demonstrated by developing a weather precipitation model from crowd-sourced weather station data.
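To make the idea of a partial differential equation coupled to a neural network concrete, the following is a minimal illustrative sketch and not the DataSim framework itself. It assumes a one-dimensional diffusion equation with periodic boundaries, u_t = nu * u_xx + f_theta(u), in which the unknown source term f_theta is represented by a small, untrained neural network. All names (`init_mlp`, `mlp_source`, `step`, `simulate`), the choice of equation, and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def init_mlp(rng, hidden=8):
    """Hypothetical tiny network parameters for a pointwise source term f_theta(u)."""
    return {
        "W1": 0.1 * rng.standard_normal((1, hidden)),
        "b1": np.zeros(hidden),
        "W2": 0.1 * rng.standard_normal((hidden, 1)),
        "b2": np.zeros(1),
    }

def mlp_source(params, u):
    """Evaluate the (untrained) network independently at every grid point."""
    h = np.tanh(u[:, None] @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"])[:, 0]

def step(u, params, nu, dx, dt):
    """One explicit Euler step of u_t = nu * u_xx + f_theta(u), periodic in x."""
    u_xx = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2
    return u + dt * (nu * u_xx + mlp_source(params, u))

def simulate(u0, params, nu=0.01, dx=0.01, dt=1e-3, n_steps=100):
    """Advance the coupled model; nu * dt / dx**2 = 0.1 keeps the scheme stable."""
    u = u0.copy()
    for _ in range(n_steps):
        u = step(u, params, nu, dx, dt)
    return u

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100, endpoint=False)  # grid spacing 0.01
u0 = np.sin(2.0 * np.pi * x)
params = init_mlp(rng)
u_final = simulate(u0, params)
```

In a setting such as the one described above, the network parameters would be fitted so that the simulated state matches observations, which typically requires differentiating through the solver. The sketch omits training entirely and only shows how the two components are combined in a single time step.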