IKTPLUSS-IKT og digital innovasjon

Popular Science Description

In terms of disease diagnosis, our immune system is the best doctor we know. It diagnoses disease with unmatched precision before any clinical symptoms arise. Past and ongoing battles with disease and infection are recorded as immune memory: a repertoire of immune cells bearing immune receptors that specifically recognize and neutralize invading pathogens. We can now read these immune receptors using high-throughput DNA sequencing of small blood samples at a cost realistic for clinical use. However, we cannot yet understand what we read. Specifically, we cannot yet translate an immune receptor's DNA sequence into the disease states it reflects. Because the immune system responds continuously to pathogens and other factors, learning the link between immune receptors and disease would enable continuous monitoring of disease state throughout life. The pattern recognition capacity of machine learning makes it uniquely suited to translating this immunological sensor system into a human-readable account of disease. However, existing machine learning methods cannot simply be applied as they are: immune repertoires have particularities that call for conceptual advances in machine learning methodology.

In the Doctor AI2 project, we have already made good progress in five complementary directions:

1) We have completed and published a manuscript on "immuneML", a machine learning platform for immune repertoires, and have presented it in many scientific and popular science arenas. We are now expanding the platform beyond its current focus on classifying what immune cells recognize, so that machine learning can also generate new candidate receptors that the models predict will recognize a particular antigen.

2) We have made good progress on spatial modeling of the interaction between immune receptors and antigens, and several manuscripts have been published.

3) The analysis of large sequence datasets in the project has inspired the development of "bionumpy", a new software package for efficient processing and analysis of sequencing data. We are finalizing a manuscript about the package.

4) We have made interesting theoretical observations related to multiple-instance learning, a central methodological approach in the project for modeling immune state from large repertoires of immune receptors.

5) The work on immuneML has provided insights into how underlying causality and study setup can affect what is learned through machine learning. An article about this has recently been accepted in a well-regarded journal.
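To make the multiple-instance learning framing concrete, here is a minimal Python sketch. It is an illustration only, not the project's or immuneML's implementation. It assumes that a repertoire is a "bag" of receptor sequences labelled only at the bag level, and that a repertoire counts as positive when at least one receptor carries a disease-associated motif (the standard multiple-instance assumption). The motif, the toy sequences, and all function names are hypothetical.

```python
from typing import Iterable, Set


def instance_score(receptor: str, motifs: Set[str]) -> float:
    """Score one receptor: 1.0 if it contains any motif, else 0.0 (hypothetical scorer)."""
    return 1.0 if any(motif in receptor for motif in motifs) else 0.0


def bag_score(repertoire: Iterable[str], motifs: Set[str]) -> float:
    """Max-pool instance scores: a bag is as positive as its most positive receptor."""
    return max((instance_score(r, motifs) for r in repertoire), default=0.0)


def predict_immune_state(repertoire: Iterable[str], motifs: Set[str],
                         threshold: float = 0.5) -> bool:
    """Label the whole repertoire (bag) from its pooled score."""
    return bag_score(repertoire, motifs) >= threshold


# Toy data, invented for illustration only.
motifs = {"ASSL"}
healthy = ["CASRGDTQYF", "CSVEGNEQFF"]
diseased = ["CASSLGQAYEQYF", "CSVEGNEQFF"]

print(predict_immune_state(healthy, motifs))
print(predict_immune_state(diseased, motifs))
```

In a learned model the fixed motif lookup would be replaced by a trained instance scorer, and max-pooling is only one of several possible pooling choices.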

Summary

Early diagnosis of disease is key to optimal treatment, and in terms of diagnosis, our adaptive immune system is the "best doctor". It carries out diagnosis with unmatched precision before clinical symptoms arise. There is a gold rush in academia and industry to develop artificial intelligence (AI) methods that exploit our immune system's capacity in order to assist doctors in everyday diagnosis. The adaptive immune system records each past and ongoing battle with disease. This immune memory is recorded by "immune receptors": short genetic sequences specific to each disease. Immune receptors can today be sequenced at high throughput. We have previously shown that similar immune receptors (that is, receptors sharing an identical genetic sequence or subsequence) arise in different individuals faced with the same disease. Thus, the pattern recognition capacity of machine learning may be leveraged to detect disease-associated patterns in the genetic sequences of immune receptors. So far, however, machine-learning-based exploitation of immune receptor sequence datasets for immunodiagnostics has had limited success. This is due to (1) a lack of machine learning approaches that can exploit the unique biological characteristics of immune receptor repertoires, (2) a lack of ground truth data for machine learning benchmarking, and (3) the absence of a platform for the machine learning analysis of large-scale immune receptor datasets. To close these gaps, we propose to develop novel AI methodology and implement a comprehensive software platform for immune receptor-based diagnostics. To validate our framework, we have access to the world's largest experimental and synthetic immune receptor datasets. In the medium term, the transdisciplinary project Doctor AI^2 will move the research frontier in AI techniques for immune receptor-based immunodiagnostics and contribute to the AI revolution in medicine by supporting clinicians in therapeutic decision making.
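The notion of receptor similarity used above (identical sequence or shared subsequence) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the project: it treats each receptor as a plain string, compares fixed-length subsequences (k-mers), and uses Jaccard overlap as the similarity measure. The function names and the choice k=3 are hypothetical.

```python
from typing import Set


def kmers(sequence: str, k: int = 3) -> Set[str]:
    """Return the set of all length-k subsequences (k-mers) of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shares_subsequence(a: str, b: str, k: int = 3) -> bool:
    """True if the two receptors have at least one k-mer in common."""
    return bool(kmers(a, k) & kmers(b, k))


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Jaccard overlap of k-mer sets: 1.0 for identical k-mer content, 0.0 for none."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0
```

For example, `jaccard("ABCD", "BCDE")` compares the k-mer sets {ABC, BCD} and {BCD, CDE}, which share one of three distinct k-mers. Counting how often such shared k-mers recur across individuals with the same disease is one simple way to turn this similarity into features for a classifier.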

Funding scheme:

IKTPLUSS-IKT og digital innovasjon

NOK 2.6 billion in total funding in the programme period

658 projects have received funding in the programme period

8 sources have financed the programme

Funding Sources

Kunnskapsdepartement
Justis- og beredskap
Kommunal-og distrikt
Samferdselsdeparteme
Diverse (miscellaneous)
Nærings- og fiskerid
Forsvarsdepartemente
Digitaliserings- og

Thematic Areas and Topics

Basic research
Digitalisation and use of ICT
Biotechnology: Medical biotechnology
Health
Biotechnology
Industries and sectors: Health industry
Industries and sectors: ICT industry
LTP3 ICT and digital transformation
Policy and administrative areas: Internationalisation
LTP3 Health
Digitalisation and use of ICT: eScience
LTP3 A knowledge-intensive business sector throughout the country
ICT research area
Portfolio: Health
LTP3 Strengthened competitiveness and innovation capacity
LTP3 Enabling and industrial technologies
Portfolio: Innovation
Applied research
Portfolio: Enabling technologies
Sub-portfolio: A well-functioning research system
Policy and administrative areas: Health and care
Internationalisation: International project cooperation
Portfolio: The research system
LTP3 Nano-, biotechnology and technology convergence
Policy and administrative areas: Research
Sub-portfolio: Internationalisation
LTP3 Academic environments and talents
Industries and sectors
LTP3 High quality and accessibility
Portfolio: Groundbreaking research
Policy and administrative areas: Digitalisation
ICT research area: Visualisation and user interfaces
ICT research area: Artificial intelligence, machine learning and data analysis
Sub-portfolio: Quality

Doctor AI2 – Artificial Intelligence mining of the Adaptive Immune system to develop an immunodiagnostics platform

Alternative title: Doktor AI2 - Maskinlæring på det adaptive immunsystemet for å utvikle en plattform for immunbasert diagnostikk

Awarded: NOK 8.8 mill.

Publications from Cristin

Assessing developability early in the discovery process for novel biologics

Defining and studying B cell receptor and TCR interactions

simAIRR: simulation of adaptive immune repertoires with realistic receptor sequence sharing for benchmarking of immune state prediction methods

Toward real-world automated antibody design with combinatorial Bayesian optimization

TCRpower: quantifying the detection power of T-cell receptor sequencing with a novel computational pipeline calibrated by spike-in sequences

In silico proof of principle of machine learning-based antibody design at unconstrained scale

Profiling the baseline performance and limits of machine learning models for adaptive immune receptor repertoire classification

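Fra bare én blodprøve kan kunstig intelligens gi diagnose for mange ulike sykdommer [From just one blood sample, artificial intelligence can diagnose many different diseases]

Immunforsvaret har skjulte mønstre om sykdom og infeksjoner [The immune system holds hidden patterns about disease and infections]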

Linguistically inspired roadmap for building biologically reliable protein language models

Deciphering how adaptive immune cells recognise pathogens: gathering suited data, defining appropriate assessments and incrementally improving machine learning methodology

Hvorfor er maskinlæring viktig i medisin? [Why is machine learning important in medicine?]

Profiling the specificity of adaptive immune receptor repertoires

Machine-learning-based antibody design

Towards targeted computational and machine-learning-based antibody specificity and developability design

AI-based prediction of the adaptive immune response

Access to ground truth at unconstrained size makes simulated data as indispensable as experimental data for bioinformatics methods development and benchmarking
