IKTPLUSS-IKT og digital innovasjon

TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion

Alternative title: TrACEr: Tidsavhengig fusjon av multimodale data

Awarded: NOK 10.4 mill.

Complex systems exist in many domains, and a better understanding of how they behave and evolve over time can improve our ability to address important problems such as understanding brain dynamics or changes in the human metabolome (i.e., the complete set of small biochemical compounds in the body), for instance, to capture early signs of diseases. To gain such a level of understanding, the system should be recorded using different sensing technologies. This creates a surge of data, and some of these data sets evolve in time while others are static. How do we analyze dynamic data sets jointly with static data sets from different sources? The goal of this project is to develop novel data mining methods that can jointly analyze static and dynamic data from multiple sources and capture both the underlying patterns and the evolution of those patterns. Our method development efforts are motivated by a challenging system: the human metabolome. We plan to use the developed methods to jointly analyze static data (e.g., genetics, microbiome) and longitudinal metabolomics data to discover interpretable patterns revealing group differences among subjects in terms of how (clusters of) metabolites change in time. We use measurements of samples collected during a meal challenge test from the COPSAC2000 cohort to understand differences among subject groups (e.g., high vs. low BMI (body mass index) groups) in terms of their metabolic response to the challenge test. In addition, we aim to show the broad impact of the developed methods in other domains, in particular in neuroscience.
We have been approaching the problem from several angles. (i) Data fusion methods: We have developed a flexible modelling and algorithmic framework to jointly analyze data from multiple sources, where we can incorporate prior information through constraints, define different types of relations between data sets, and take into account different data distributions. (ii) Time-evolving data analysis: We have introduced a flexible algorithmic framework for fitting a specific multi-way data analysis method that has the promise to capture time-evolving patterns. This model has also been incorporated into our data fusion framework and extended to impose temporal smoothness on evolving patterns. (iii) Dynamic metabolomics data analysis (simulations): We have been using systems biology models (small-scale metabolic pathway models as well as a human whole-body metabolic model) to simulate dynamic metabolomics data and to explore which type of data-driven approach is best suited to extracting the underlying dynamics and metabolic mechanisms. (iv) Dynamic metabolomics data analysis and omics data fusion (real data): We have developed improved analysis methods for time-resolved metabolomics data and demonstrated the promise of these methods in capturing static and dynamic biomarkers of various phenotypes using the measurements from the COPSAC2000 cohort. Joint analysis of dynamic metabolomics data and other omics data sets is in progress. (v) Broader impact: We have used the developed methods to analyze real task fMRI (functional magnetic resonance imaging) data, revealing a spatial network of potential interest (in terms of differentiating between healthy controls and patients with schizophrenia) as well as its change in time. We have also studied the performance of similar methods in jointly analyzing fMRI data collected during multiple tasks.
For our publications and ongoing activities, please see the project webpage: https://tracer.simulamet.no/
The ultimate goal of the project is to take significant steps toward developing the data mining methods needed to extract meaningful information from "personal data clouds" being collected in predictive and precision medicine studies, where pre-clinical longitudinal samples are collected from individuals to track their health status and detect early signs of diseases.
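The idea of imposing temporal smoothness on evolving patterns, mentioned in item (ii) of the summary, can be illustrated with a small sketch. This is not the project's formulation, which is not given here. It is a generic quadratic smoothness penalty, and the function name `smooth_over_time`, the matrix `P` (one row per time point, one column per pattern entry) and the weight `lam` are assumptions made for illustration only.

```python
import numpy as np


def smooth_over_time(P, lam):
    """Return S minimizing ||S - P||_F^2 + lam * ||D S||_F^2.

    P   : (T, n) array, row t holds a pattern estimated at time point t
          (hypothetical layout chosen for this sketch).
    lam : non-negative weight; larger values give smoother trajectories.
    D is the first-difference operator along time, so the penalty
    discourages large changes between consecutive time points.
    """
    T = P.shape[0]
    # (T-1) x T matrix whose rows are e_{t+1} - e_t.
    D = np.diff(np.eye(T), axis=0)
    # Setting the gradient to zero gives (I + lam * D^T D) S = P.
    return np.linalg.solve(np.eye(T) + lam * D.T @ D, P)
```

With `lam = 0` the input is returned unchanged, and as `lam` grows each column is pulled toward its mean over time, since constant trajectories are not penalized.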

Data mining holds the promise to improve our understanding of the dynamics of complex systems such as the human brain and the human metabolome (i.e., the complete set of small biochemical compounds in the human body) by discovering the underlying patterns, i.e., subsystems, in data collected from these systems. However, discovering those patterns and understanding their evolution in time is a challenging task. The complexity of the systems requires collection of both time-evolving and static data from multiple sources, using different technologies that record the behavior of the system from complementary viewpoints, and there is a lack of data mining methods that can find the hidden patterns in such complex data. The goal of this multidisciplinary project is to develop novel data mining techniques to jointly analyze static and dynamic data sets to discover underlying patterns, understand the temporal dynamics of those patterns, and capture early signs of future outcomes. We will introduce a scalable and constrained data fusion framework that can jointly factorize heterogeneous data in the form of matrices and multi-way arrays by incorporating temporal as well as domain-specific constraints. These methods will be motivated by a real, challenging system, the human metabolome, and used to jointly analyze static genetic information and longitudinal metabolomics data to discover interpretable patterns, i.e., subsystems corresponding to metabolic networks (networks of metabolites acting together), with the ultimate goal of understanding their role in the transition from healthy to diseased states. The project will play a significant role in developing the data mining tools needed to extract meaningful information from the surge of data, referred to as "personal data clouds", being collected in predictive medicine studies, where participants give blood samples regularly to track their health status and are alerted to early signs of diseases.
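To make the notion of jointly factorizing a matrix and a multi-way array concrete, here is a minimal sketch of a coupled matrix and tensor factorization fitted with alternating least squares. It is an illustration under stated assumptions, not the project's framework: it assumes a three-way array `X` (e.g., subjects x metabolites x time) and a matrix `Y` (e.g., subjects x static features) that share the subject factor `A`, it uses a plain CP model for the array, and it omits the temporal and domain-specific constraints the project describes. The names `cmtf_als` and `cmtf_loss` are hypothetical.

```python
import numpy as np


def cmtf_loss(X, Y, A, B, C, V):
    """Sum of squared errors of the tensor model and the matrix model."""
    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
    return np.sum((X - X_hat) ** 2) + np.sum((Y - A @ V.T) ** 2)


def cmtf_als(X, Y, rank, n_iter=100, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r] and Y ~ A @ V.T.

    A is shared between the two data sets (the coupling). Each update
    below is the exact least-squares solution for one factor with the
    others held fixed, so the loss does not increase between steps.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    V = rng.standard_normal((Y.shape[1], rank))

    def right_solve(rhs, gram):
        # Computes rhs @ inv(gram) for a symmetric Gram matrix.
        return np.linalg.solve(gram, rhs.T).T

    history = [cmtf_loss(X, Y, A, B, C, V)]
    for _ in range(n_iter):
        # Shared factor: uses both the tensor and the matrix.
        rhs = np.einsum('ijk,jr,kr->ir', X, B, C) + Y @ V
        A = right_solve(rhs, (B.T @ B) * (C.T @ C) + V.T @ V)
        # Tensor-only factors.
        B = right_solve(np.einsum('ijk,ir,kr->jr', X, A, C),
                        (A.T @ A) * (C.T @ C))
        C = right_solve(np.einsum('ijk,ir,jr->kr', X, A, B),
                        (A.T @ A) * (B.T @ B))
        # Matrix-only factor.
        V = right_solve(Y.T @ A, A.T @ A)
        history.append(cmtf_loss(X, Y, A, B, C, V))
    return A, B, C, V, history
```

Calling `cmtf_als(X, Y, rank=2)` returns the four factor matrices and the loss after each sweep. Constraints such as non-negativity or smoothness over time would replace the unconstrained least-squares updates with constrained solvers, which this sketch does not attempt.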

Publications from Cristin

No publications found

Funding scheme:

IKTPLUSS-IKT og digital innovasjon