
FRINATEK-Fri prosj.st. mat.,naturv.,tek

Secure and Reliable Distributed Storage Systems

Alternative title: Sikre og pålitelige distribuerte lagringssystem

Awarded: NOK 8.0 mill.

Distributed storage is the scalable and economically viable technology for storing our collective memory. It is not yet known how to optimally design distributed storage systems that are both robust against arbitrary failures and secure against determined attacks. The project addresses these issues through a theoretical approach guided by practical concerns.

Due to the vast amounts of data being generated and accessed worldwide, the demand for large-scale data storage has increased dramatically in recent years. Data centers typically employ cheap commodity hardware connected in a distributed storage system in order to scale massively at low cost. Examples of existing distributed storage systems are OceanStore and Google File System (GFS). The cheap components fail frequently, and software glitches, machine reboots, local power failures and maintenance operations also render devices unavailable from time to time. Thus, resilience to failures of individual components is an essential property of a distributed storage system.

Traditionally, this resilience is provided by replication across multiple machines. For instance, GFS and the Hadoop Distributed File System store three copies of all data by default. On a massive scale of operation, storing multiple copies of all files is expensive and inefficient, and hence data centers are increasingly using more sophisticated coding-theoretic techniques.

The project has mainly focused on the design and analysis of improved private information and function retrieval schemes for distributed storage systems. So-called secure repairable fountain codes have also been proposed and analyzed, together with a construction of a new family of erasure-correcting codes for distributed storage that yields low repair bandwidth and low repair complexity.
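The trade-off between replication and coding described above can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the project: it assumes an idealized maximum distance separable (MDS) erasure code, in which any k of the n stored blocks suffice to recover the data, and the (14, 10) parameters are a hypothetical choice. The class and field names are likewise assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """An idealized (n, k) MDS storage scheme: k data blocks are
    encoded into n stored blocks, and any k of them recover the data.
    Replication with r copies is the special case n = r, k = 1."""
    name: str
    n: int  # total blocks stored
    k: int  # blocks of original data

    @property
    def overhead(self) -> float:
        # Stored bytes per byte of user data.
        return self.n / self.k

    @property
    def tolerated_failures(self) -> int:
        # Number of simultaneous block losses that can be survived.
        return self.n - self.k

    @property
    def naive_repair_reads(self) -> int:
        # Blocks read to rebuild one lost block when repair is done
        # by decoding from any k surviving blocks.
        return self.k


# Three copies of all data, as in the default configuration cited above.
replication = Scheme("3-way replication", n=3, k=1)
# Hypothetical erasure-code parameters, chosen only for illustration.
coded = Scheme("(14, 10) MDS code", n=14, k=10)

for s in (replication, coded):
    print(f"{s.name}: overhead {s.overhead:.1f}x, "
          f"tolerates {s.tolerated_failures} failures, "
          f"reads {s.naive_repair_reads} block(s) per naive repair")
```

Under these assumptions the coded scheme stores far less redundant data while surviving more failures, but rebuilding a single lost block reads many more blocks than copying a replica does. That repair cost is the motivation for codes with low repair bandwidth and low repair complexity.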

This was a basic research project guided by practical concerns. The results have been published (or will be submitted for publication) in international journals and at international conferences. The project has resulted in significantly increased international research collaboration. The recent Cambridge Analytica and Facebook data scandal has also shown how important the privacy and security of personal information stored in the cloud are in the world of social media. Thus, we believe that the results obtained within this project will benefit society as a whole in the long term.

Distributed storage is the scalable and economically viable technology for storing our collective memory. It is not yet known how to optimally design distributed storage systems that are both robust against arbitrary failures and secure against determined attacks. The project addresses these issues through a theoretical approach guided by practical concerns. Since the solutions are currently unknown, this is a high-risk project, but the topic is of vital global and national importance, and the potential benefits are significant.

Due to the vast amounts of data being generated and accessed worldwide, the demand for large-scale data storage has increased dramatically in recent years. Data centers typically employ cheap commodity hardware connected in a distributed storage system in order to scale massively at low cost. Examples of existing distributed storage systems are OceanStore and Google File System (GFS). The cheap components fail frequently, and software glitches, machine reboots, local power failures and maintenance operations also render devices unavailable from time to time. Thus, resilience to failures of individual components is an essential property of a distributed storage system.

Traditionally, this resilience is provided by replication across multiple machines. For instance, GFS and the Hadoop Distributed File System store three copies of all data by default. On a massive scale of operation, storing multiple copies of all files is expensive and inefficient, and hence data centers are increasingly using more sophisticated coding-theoretic techniques.

In this project we will use our strong background in coding and information theory to address the design of secure and reliable distributed storage systems. We will contribute to the international research frontier in theory and practice. Furthermore, we will develop national competence in this important field by training students at the graduate and postgraduate levels.

Publications from Cristin

No publications found

Funding scheme:

FRINATEK-Fri prosj.st. mat.,naturv.,tek