

ShuttleNet: Scalable Neural Models for Long Sequential Data

Alternative title: ShuttleNet: Skalerbare nevrale modeller for lange sekvensielle data

Awarded: NOK 7.6 mill.

We consider data-driven artificial intelligence over long sequences. Many real-world data are intrinsically sequential, for example text, speech, music, time series, DNA sequences, and unfolding events. However, conventional data science methods can process only short sequences of up to a few thousand steps. In this project we develop a scalable method that enables efficient and accurate inference for very long sequences, up to millions or even billions of steps. By the end of the project, we will deliver theoretical breakthroughs, such as new models with guarantees, as well as practical outcomes such as computer software and visualization tools. Our research findings will be applied to two focus areas: 1) microbiology and infectious disease epidemiology and 2) remote sensing pattern recognition. Moreover, because long sequential data are common in many areas, our method can serve as a critical component in a wide range of tasks, including scientific research, medical and health services, natural language processing, financial data analysis, market studies, etc.

During 01.12.2022 - 17.09.2023, we achieved the following:
- Both planned PhD students have passed their mid-term evaluations and are performing their second-half research.
- A research visit to ICLR 2023 took place in May 2023. Several collaboration visits took place in the summer of 2023, to Oslo and to China.
- The proposed neural network has been applied to various data types, including text, vision, and DNA sequences, with results published in Level-2 conferences or journals.
- Our methods have achieved substantial improvements over previous deep-learning approaches in terms of accuracy and scalability.
- Since the start of the project, 22 relevant papers have been accepted or published. Three papers are in submission to AAAI 2024, Neurocomputing, and Neural Networks, respectively.

The project has made significant advances in machine learning methods for handling long sequences, in terms of both scalability and accuracy. Before the project, existing modeling methods were limited to sequences of only a few hundred steps; by its conclusion, other established approaches could handle sequence lengths of up to several thousand steps. In contrast, our method scales much further, demonstrating successful inference for sequences of up to 1.5 million steps. Furthermore, our method has delivered substantial accuracy improvements across a wide range of inference tasks, spanning synthetic data, text documents, images, and DNA sequences. It even outperformed Google's Enformer model in genetic variant prediction, without requiring gene-expression supervised data, and it surpassed the top competitor in open chromatin region detection using only 1% of the supervised labels. The proposed neural attention models can serve as foundational building blocks for networks applied to various pattern recognition and generation tasks. The research achieved in this project paves the way for further disruptive applications in data science and industry.

In the past decade, Machine Learning (ML), especially deep learning, has brought us many successful data-driven AI applications. Many real-world data are intrinsically sequential, for example text, speech, music, time series, DNA sequences, and unfolding events. However, conventional deep learning methods can process only short sequences of up to a few thousand steps. Existing approaches often face challenges such as slow inference, vanishing (and exploding) gradients, and difficulty capturing long-term dependencies. In this project we develop a scalable machine learning method that enables efficient and accurate inference for very long sequences, up to millions or even billions of steps. At the end of the project, we will deliver a versatile ML framework based on deep neural networks, together with efficient optimization algorithms, computer software, and visualization tools. Our research findings will be applied to two focus areas: 1) microbiology and infectious disease epidemiology and 2) remote sensing pattern recognition. Moreover, because long sequential data are common in many areas, our method can serve as a critical component in a wide range of tasks, including scientific research, next-generation DNA sequence analysis, natural language processing, financial data analysis, market studies, etc.
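To make the scalability bottleneck concrete: the details of the project's ShuttleNet architecture are not given here, but the following generic sketch illustrates why standard self-attention limits sequence length (it materializes an n-by-n score matrix, so cost grows quadratically in n) and how a kernelized "linear attention" approximation, one well-known family of scalable alternatives, removes the quadratic term by reassociating the matrix product. The feature map `phi` below is a common illustrative choice, not the project's method.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4096, 32  # sequence length, feature dimension
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))

def softmax_attention(Q, K, V):
    # Materializes an n x n score matrix: O(n^2 * d) time, O(n^2) memory.
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Replaces the softmax with a feature map phi, so the product can be
    # reassociated: phi(Q) @ (phi(K).T @ V) costs O(n * d^2), linear in n.
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                     # d x d summary, independent of n
    normalizer = Qf @ Kf.sum(axis=0)  # per-position normalization, shape (n,)
    return (Qf @ kv) / normalizer[:, None]

out_quadratic = softmax_attention(Q, K, V)   # (4096, 32), needs the n x n matrix
out_linear = linear_attention(Q, K, V)       # (4096, 32), never forms it
```

The key design point is that the d-by-d summary `kv` does not grow with the sequence length, which is what makes inference over millions of steps feasible in principle.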

Publications from Cristin

No publications found


Funding scheme:

IKTPLUSS-IKT og digital innovasjon