Time series are everywhere. Data recorded from sensors in mobile phones, financial data such as accounting figures, and climate indicators are all examples of time series that society and individuals encounter daily. Understanding such time series is essential for technological advances and for making informed decisions.
Many of these time series are irregular in some sense. They may have missing data, which can occur if sensors fail, if a person forgets to enter a number in a spreadsheet, or if the phenomenon of interest can only be observed at certain points in time. They may also be very noisy: for example, cheap sensors let us collect data from more locations, but each measurement carries more noise than one from a more expensive sensor.
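To make the two kinds of irregularity concrete, the following minimal sketch builds a toy signal, adds noise, removes some observations, and then applies two simple baselines (linear interpolation and a moving average). It is only an illustration: the function names, the sine signal, and all parameter values are assumptions chosen for this example, and it does not represent the methods developed in the project.

```python
import numpy as np

def make_irregular(n=200, missing_frac=0.2, noise_std=0.3, seed=0):
    """Toy series with additive noise and randomly missing observations (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 4.0 * np.pi, n)
    clean = np.sin(t)                                   # underlying phenomenon
    observed = clean + rng.normal(0.0, noise_std, n)    # cheap, noisy sensor
    missing = rng.random(n) < missing_frac              # e.g. sensor dropouts
    missing[0] = missing[-1] = False                    # keep endpoints observed
    observed[missing] = np.nan
    return t, clean, observed

def impute_linear(t, observed):
    """Fill missing values by linear interpolation between observed neighbours."""
    known = ~np.isnan(observed)
    return np.interp(t, t[known], observed[known])

def moving_average(x, window=9):
    """Simple denoising baseline: centred moving average (zero-padded at the edges)."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")

t, clean, observed = make_irregular()
filled = impute_linear(t, observed)     # imputation baseline
smoothed = moving_average(filled)       # denoising baseline
print(f"missing observations: {int(np.isnan(observed).sum())} of {observed.size}")
```

Baselines of this kind ignore the structure of the underlying process, which is one reason learning-based approaches to imputation and denoising are of interest.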
The project Machine Learning for Irregular Time Series (ML4ITS) addresses some core challenges for irregular time series. In particular, the project will develop methodology that handles irregular time series for the following tasks:
- Forecasting: predicting the future values of the time series based on current/past data.
- Representation learning: learning representations of time series that are useful for several downstream tasks.
- Imputation/denoising: creating "clean" data when the original data are missing or noisy.
- Anomaly detection and failure prediction: identifying observations that are unusual or that indicate that a system is in a critical state.
- Synthetic data creation.
The last point addresses the need for privacy-preserving datasets. For example, the sensor data on a cell phone may not be anonymous, but it may be possible to create a synthetic dataset that behaves like the original data in a statistical sense while preserving privacy. Furthermore, the project aims to produce reproducible research and to develop open-source software that will benefit the research ecosystem.
The project is a collaboration between SINTEF Digital and three departments at NTNU: the Department of Computer Science, the Department of Mathematics, and the Department of Electronic Systems.
Time series data are ubiquitous. The broad diffusion and adoption of the Internet of Things and major advances in sensor technology are examples of why such data have become pervasive. These technologies have applications in several domains, such as healthcare, finance, meteorology and transportation, each of which gives rise to time series tasks. Deep neural networks have recently been used to create models that improve on the state of the art for some of these tasks.
In some scenarios, obtaining a training set that matches the feature space and predicted data distribution characteristics of the test set can be time-consuming, difficult and expensive. Thus, there is a need to focus on modern AI techniques that can extract value from small and irregular data. Such techniques can also help address the growing need for sustainability and privacy in ML and AI.
The goal of this project is to overcome the issue of limited available or labelled data in the time series domain, where the heterogeneity of the data (e.g. non-stationarity, multi-resolution, irregular sampling) poses a further challenge. The main objective of ML4ITS is to advance the state of the art in time series analysis for 'irregular' time series data (see the explicit definition of 'irregular time series' in the proposal). We plan to achieve these goals and tackle the 'robustness' challenge by developing novel A) transfer learning and B) unsupervised learning and data augmentation methods. These techniques have hardly been explored in the time series domain.
The project relies on a multidisciplinary study combining different perspectives from the three main scientific communities involved in time series analysis. The consortium has therefore been composed with this complementarity in mind and includes researchers across these different fields (IES, MATH, IDI).