
IKTPLUSS-IKT og digital innovasjon

TROMBOLOME: Development and application of a comprehensive digital archive of the TROMsø study metaBOLOME

Alternative title: Utvikling og applikasjon av et digitalt arkiv over TROMsøundersøkelsens metaBOLOM

Awarded: NOK 14.3 mill.

Data-driven health research will provide more accurate health and disease profiles and better predictive models for disease progression and recovery. This will allow healthcare service providers to offer tailored treatments that reduce the economic and personal costs imposed by inefficient treatments. A prerequisite for developing solid mathematical models is access to high-quality digital datasets of clinical data. Our vision is to make the metabolome from clinical samples a resource for data-driven health research. The primary goal of TROMBOLOME is to build a comprehensive digital archive of the metabolome in clinical samples. We wish to explore how as much data as possible from biological samples can be analyzed, digitalized, and finally applied in data-driven health research. A blood sample holds massive amounts of biochemical data that reflect the state of the organism, from immune system transmitters to the degradation products of our breakfast. Beyond cells, proteins, and lipids, blood also contains thousands of small organic molecules, collectively referred to as the metabolome. It is challenging to analyze the whole metabolome because of its chemical diversity and large differences in concentration levels. In the project so far, a practical workflow has been developed to build analytical libraries. Pipetting was performed with an automated liquid handler based on chemical information-driven mixing of reference materials; this workflow has been applied in two different laboratories with a total of six analytical methods. Access to local libraries is necessary to raise the identification confidence of biochemical measurements. Analytical methods for comprehensive untargeted metabolome profiling of human serum samples have also been developed and validated. The datasets acquired so far are being applied in graph-based machine learning to support identification of compounds that have not yet been tested with reference material.
Analytical raw data must be archived in an easily accessible manner to enable scalable workflows that can compare the metabolome across thousands of samples. This will allow full control of results and subsequent data treatment processes. The TROMBOLOME database is being set up as a SQL database in the cloud, and the scripts needed to parse raw data from vendor file formats into the SQL database have been developed and will be published on GitHub. According to the project plan, sample analysis was expected to have commenced by this point. Earlier this year, the biomarker committee of the Tromsø study recommended that our study be given samples for analysis based on a modified analysis plan, which includes additional analyses with a targeted metabolomics panel. The original study design was based on untargeted metabolomics only. We have faced challenges in convincing population study curators of the merits of untargeted metabolomics. There is a general lack of standardization and harmonization in the field of untargeted metabolomics, and the scientific community pays little attention to reproducibility. Our modified analysis plan with paired targeted and untargeted metabolomics analyses will bring important new insights into the field and establish method fitness-for-purpose in an unbiased manner. Such results are imperative to verify that metabolomics results can be reproduced by different methods. However, the additional targeted panel is costly, and for this reason we will analyze fewer samples (N=1000) than first anticipated (N=5000), but in more depth. The TROMBOLOME database will be used to find metabolomic markers associated with the risk of acute myocardial infarction. We were recently granted approval for our study by the regional ethics committee (REK-Nord). The study design will be a case-cohort study based on analysis of serum samples from the 7th Tromsø study.
The case group consists of participants who had a first acute myocardial infarction after sampling for the 7th Tromsø study, and the control group will be randomly selected from the remaining participants. The last remaining approval needed before the analyses can commence is from the Data and Publication Committee of the Tromsø study. All recruitment for the project has been completed. The project has had a high degree of international activity, including a three-month international research stay (PhD student), presentations at international and national conferences, an Erasmus+ staff exchange, and management committee involvement in a COST action (AtheroNET). The potential benefits of the project include (i) setting new standards for digitalization and application of analytical raw data, (ii) developing a method for streamlined annotation of biochemical components in the digital archive that can improve future analytical workflows, and (iii) harnessing more of the potential of Norway's largest population study by adding these unique and vast data sets.
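
The selection of cases and controls described in this summary can be illustrated with a minimal sketch. It is not the project's actual procedure: the participant identifiers, the `had_event` flag, the function name `select_case_cohort`, the fixed seed, and the toy cohort size are all assumptions made for illustration.

```python
import random


def select_case_cohort(participants, n_controls, seed=0):
    """Split a cohort into cases and a randomly drawn control group.

    participants: list of (participant_id, had_event) tuples, where had_event
    marks a first acute myocardial infarction after sampling
    (hypothetical data layout, not taken from the project).
    """
    cases = [pid for pid, had_event in participants if had_event]
    remaining = [pid for pid, had_event in participants if not had_event]
    # A seeded generator keeps the random draw reproducible.
    rng = random.Random(seed)
    controls = rng.sample(remaining, min(n_controls, len(remaining)))
    return cases, controls


# Toy cohort of 20 participants in which every fifth participant is a case.
cohort = [(f"P{i:03d}", i % 5 == 0) for i in range(20)]
cases, controls = select_case_cohort(cohort, n_controls=6)
```

Fixing the seed means the same control group can be drawn again later, which makes the selection easier to audit.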

The vision is to integrate analytical data on small organic molecules (the metabolome) with biobank big data. This will be achieved by building an annotated digital archive of biological samples from the Tromsø population study. We propose a radically new approach by facilitating bottom-up metabolomics with full metabolome component annotation. The project is initiated in the context of the UN sustainable development goal: ‘Ensure healthy lives and promote well-being for all at all ages’. We strongly believe that coupling the metabolome to the vast digital archives of health and disease status, genomics, and additional well-curated big data sets in the Tromsø study can, through focused efforts, open up new scientific opportunities in data-driven research on diagnostic and lifestyle markers and lead to radical breakthroughs. The main novelties in the project include significant methodological advancements for a) rational storage and organization of metabolome big data, and b) development of a complete multi-parametric virtual analytical method to perform automated large-scale metabolome annotation and define the borders of the investigated chemical space. This will require basic research on statistical machine learning combined with applied deep machine learning. The project will set new standards for accessibility of metabolome data to stakeholders and push the frontier for metabolomics in data-driven health research. This unique, well-organized, veracious, and readily retrievable digital archive will make it possible to harness more of the potential of Norway’s largest population study and increase its competitiveness. A work package is dedicated to dissemination to relevant stakeholders to establish familiarity with TROMBOLOME’s merits. The project group is international and cross-disciplinary, with experts in machine learning, cheminformatics, and metabolomics who have the complementary skills necessary to answer the research questions and realize the vision.
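
As a rough illustration of what organizing parsed measurements in a SQL archive could look like, the sketch below stores a few features per sample and retrieves them across samples. It is not the TROMBOLOME schema. The table and column names, the mass-spectrometry-style columns (`mz`, `retention_time_s`, `intensity`), and the helper functions are hypothetical. An in-memory SQLite database stands in for the cloud database, and parsing of vendor file formats is not shown.

```python
import sqlite3


def create_archive(conn):
    """Create a minimal two-table layout (hypothetical schema)."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS sample (
            sample_id TEXT PRIMARY KEY
        );
        CREATE TABLE IF NOT EXISTS feature (
            feature_id INTEGER PRIMARY KEY,
            sample_id TEXT NOT NULL REFERENCES sample(sample_id),
            method TEXT NOT NULL,
            mz REAL NOT NULL,
            retention_time_s REAL NOT NULL,
            intensity REAL NOT NULL
        );
    """)


def insert_features(conn, sample_id, method, rows):
    """Store already-parsed (mz, retention_time_s, intensity) rows for one sample."""
    conn.execute("INSERT OR IGNORE INTO sample (sample_id) VALUES (?)", (sample_id,))
    conn.executemany(
        "INSERT INTO feature (sample_id, method, mz, retention_time_s, intensity) "
        "VALUES (?, ?, ?, ?, ?)",
        [(sample_id, method, mz, rt, inten) for mz, rt, inten in rows],
    )
    conn.commit()


def features_near_mz(conn, mz, tol):
    """Return features from all samples whose mz lies within +/- tol."""
    cur = conn.execute(
        "SELECT sample_id, mz, retention_time_s, intensity FROM feature "
        "WHERE mz BETWEEN ? AND ? ORDER BY sample_id",
        (mz - tol, mz + tol),
    )
    return cur.fetchall()


# Made-up values for two samples measured with the same (hypothetical) method.
conn = sqlite3.connect(":memory:")
create_archive(conn)
insert_features(conn, "S1", "method_A", [(180.0634, 65.2, 1.2e5), (300.1, 120.0, 3.0e4)])
insert_features(conn, "S2", "method_A", [(180.0636, 65.0, 9.8e4)])
hits = features_near_mz(conn, 180.0635, 0.001)
```

Keeping one row per measured feature makes it straightforward to compare the same signal across many samples with a single query.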

Publications from Cristin

No publications found

Funding scheme:

IKTPLUSS-IKT og digital innovasjon