Seismic data compression using constrained dictionary learning and imaging in compressed domain

Alternative title: Seismisk datakomprimering ved bruk av bundet maskinlæring og avbildning i det komprimerte domenet

Awarded: NOK 1.7 mill.

Project Period: 2020 - 2023

During a marine seismic survey, seismic wavefields are emitted near the sea surface and the response of the earth is monitored to deduce information about the geology of the subsurface. To monitor this response, data are acquired over time by densely spread sensors. Hence, the acquired data can be said to be in the time-space domain. The data are processed to produce an image of the subsurface in which the locations of geological boundaries are highlighted. This processing involves applying a long sequence of geophysical methods, and the whole workflow is referred to as seismic processing and imaging. Seismic processing and imaging essentially aims to remove undesired signals from the recorded data and to reposition the seismic energy at the location in the subsurface from which it was diffracted or reflected. Most seismic processing methods are carried out in a domain other than the time-space domain. Transforming the data into another domain often requires preconditioning, which comes at a significant computational cost. After each processing step, the data are transformed back into the original time-space domain for quality control. Hence, seismic processing and imaging is a long, meticulous workflow that results in a high-quality image of the subsurface but is very expensive in terms of human and computational resources. However, it is widely accepted that the relevant information contained in seismic data is of lower dimensionality than the data themselves. Consequently, by transforming the data into an appropriate compressed domain, the seismic data can be expressed with fewer parameters than there are data samples.
During the first year of his PhD, the candidate developed a seismic data compression method based on "dictionary learning", i.e., a data-driven algorithm for sparse representation. He tested the compression method on several datasets and assessed its effectiveness against several standard seismic data compression methods. Among all the methods studied, the one he developed was the most effective at reaching high compression ratios.
During the second year of his PhD, the candidate extended the method such that kinematic wavefield parameters are extracted from the seismic data at the same time as the data are compressed. Processing operators can use these kinematic wavefield parameters to apply steps of the processing workflow. Based on the kinematic wavefield parameters, the candidate started designing operators to apply wavefield separation, an early step of the seismic processing workflow, in the compressed domain. Using those operators, the candidate proposed a new method that compresses the data and applies one step of the seismic processing and imaging sequence in the compressed domain.
During the third year of his PhD, the candidate developed a method to apply deghosting in the compressed domain. In his thesis, he further proposed a new processing flow, based on the three methods developed during his PhD studies, to compress seismic data and apply typical preprocessing steps in the compressed domain. The proposed workflow lowers the disk and data transfer requirements and saves human effort and time compared with the conventional workflow.
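
The idea of compressing data through a learned sparse representation can be illustrated with a small sketch. It is not the candidate's constrained method: it assumes a generic, unconstrained dictionary learning loop (orthogonal matching pursuit for the sparse codes and a least-squares "method of optimal directions" update for the atoms), applied to synthetic patches rather than seismic data. The function names `omp`, `learn_dictionary` and `compression_ratio`, and the way stored values are counted, are hypothetical choices made for this illustration.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: approximate x with at most k atoms of D."""
    residual = x.copy()
    support, sol = [], np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))  # best-correlated atom
        if j in support:                            # nothing new to add
            break
        support.append(j)
        # refit all selected atoms jointly, then update the residual
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coeffs = np.zeros(D.shape[1])
    if support:
        coeffs[support] = sol
    return coeffs

def learn_dictionary(X, n_atoms, k, n_iter=5, seed=0):
    """Alternate sparse coding (OMP) and a least-squares atom update (MOD)."""
    rng = np.random.default_rng(seed)
    # initialise the atoms with randomly chosen training patches
    D = X[:, rng.choice(X.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    for _ in range(n_iter):
        A = np.column_stack([omp(D, x, k) for x in X.T])
        D = X @ np.linalg.pinv(A)                   # MOD dictionary update
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    A = np.column_stack([omp(D, x, k) for x in X.T])
    return D, A

def compression_ratio(X, D, A):
    """Data samples divided by stored numbers (assumed bookkeeping:
    all dictionary entries plus a value and an index per nonzero code)."""
    return X.size / (D.size + 2 * np.count_nonzero(A))

# Synthetic stand-in for data patches: each column mixes 3 of 32 hidden atoms.
rng = np.random.default_rng(1)
true_atoms = rng.standard_normal((64, 32))
codes = np.zeros((32, 200))
for col in range(200):
    codes[rng.choice(32, 3, replace=False), col] = rng.standard_normal(3)
X = true_atoms @ codes

D, A = learn_dictionary(X, n_atoms=32, k=3)
rel_err = np.linalg.norm(X - D @ A) / np.linalg.norm(X)
ratio = compression_ratio(X, D, A)
print(f"relative error {rel_err:.3f}, compression ratio {ratio:.1f}")
```

In this sketch, the sparsity level `k` controls the trade-off: fewer coefficients per patch raise the compression ratio but also the reconstruction error.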

It is recognized in the industry that conventional seismic processing is not optimal with regard to the amount of stored data and the computation needed to perform the different steps of the conventional seismic processing sequence. The candidate has researched a new mathematical domain that would store the relevant information from the seismic data more efficiently and allow the seismic processing steps to be carried out easily in this domain. He has identified a mathematical domain in which the seismic data are sparsely represented, which saves storage, and described by kinematic parameters, which enables some processing operators to be applied in that domain. The candidate has identified and proposed specific operators to perform two preprocessing steps of the seismic processing sequence, namely wavefield separation and deghosting. In its current state, the prototype developed in this thesis can be used as an alternative to the conventional process for preprocessing seismic data, and can potentially save costs related to data storage and transfer. If continued, this research could be the start of a completely new and innovative way of processing seismic data.
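
To give a sense of why kinematic parameters are useful for a step such as deghosting, the sketch below shows how a horizontal slowness determines the delay of a receiver ghost for a single plane-wave event. It is a textbook-style illustration, not the operator proposed in the thesis. It assumes a flat sea surface with reflection coefficient -1, a constant water velocity of 1500 m/s, and a damped spectral division; the function names and the damping value `eps` are hypothetical.

```python
import numpy as np

def ghost_delay(p, depth, c=1500.0):
    """Ghost delay (s) for horizontal slowness p (s/m) and receiver depth (m)."""
    q = np.sqrt(max(1.0 / c**2 - p**2, 0.0))   # vertical slowness in water
    return 2.0 * depth * q

def ghost_operator(freqs, tau, r=-1.0):
    """Spectrum of primary plus ghost: 1 + r * exp(-i 2 pi f tau)."""
    return 1.0 + r * np.exp(-2j * np.pi * freqs * tau)

def deghost_trace(trace, dt, p, depth, c=1500.0, r=-1.0, eps=1e-2):
    """Damped inverse of the ghost operator applied to one trace."""
    n = trace.size
    freqs = np.fft.rfftfreq(n, dt)
    G = ghost_operator(freqs, ghost_delay(p, depth, c), r)
    spec = np.fft.rfft(trace)
    # eps stabilises the division at the ghost notches, where |G| is zero
    return np.fft.irfft(spec * np.conj(G) / (np.abs(G) ** 2 + eps), n)

# Vertical incidence (p = 0), receiver at 7.5 m depth, 2 ms sampling.
dt, depth = 0.002, 7.5
tau = ghost_delay(0.0, depth)
shift = int(round(tau / dt))                   # ghost delay in samples

primary = np.zeros(512)
primary[100] = 1.0
ghosted = primary - np.roll(primary, shift)    # ghost has opposite polarity
deghosted = deghost_trace(ghosted, dt, p=0.0, depth=depth)
print(f"ghost delay {tau * 1e3:.1f} ms, residual ghost {deghosted[100 + shift]:.3f}")
```

For non-vertical propagation, the same functions can be called with a nonzero `p`, which shortens the delay and moves the ghost notches to higher frequencies.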

Marine seismic surveying aims to acquire data that are adequately sampled in both time and space according to Shannon's sampling condition. This aim stems from common requirements of conventional processing and imaging tools. Although the spatial sampling condition is not always met, larger surveys nevertheless pose challenges for data handling, storage, processing and imaging. In terms of characteristic wavefield information, such data are highly redundant, and in a previous industrial PhD project we exploited this fact to reconstruct missing data in the crossline direction using constrained dictionary learning. In this project, we will derive and develop a new method of dictionary learning, constrained by Hamilton's characteristic function, with the objectives to 1) extract the kinematic wavefield parameters of seismic data and 2) represent the data in a sparse domain while preserving the inherent wavefield information. In its most general form, Hamilton's characteristic function for reflection seismic data is defined by two slowness vectors and three traveltime curvature matrices, a total of 14 kinematic wavefield parameters, associated with one reference position (central ray position). Starting from the new dictionary learning method, we will develop a method for data compression to facilitate seismic data storage and, finally, by fully exploiting the inherent wavefield information, build new processing and imaging methods that are specially tailored to run directly in the compressed parameter domain.
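
The count of 14 parameters can be checked with a short sketch. It assumes that each slowness vector has two horizontal components, that two of the curvature matrices are symmetric 2x2 matrices (source-source and receiver-receiver) and that the third is a general 2x2 mixed matrix. The traveltime is written as a plain second-order Taylor expansion around the reference position, with all signs absorbed into the parameters. Sign conventions and parameterisations differ between authors, and the exact form used in the project is not stated here, so the names below are hypothetical.

```python
import numpy as np

def count_parameters(n=2):
    """Independent parameters for n horizontal coordinates per position."""
    slowness = 2 * n                       # source and receiver slowness vectors
    symmetric = 2 * (n * (n + 1) // 2)     # two symmetric curvature matrices
    mixed = n * n                          # one general mixed curvature matrix
    return slowness + symmetric + mixed

def traveltime(t0, p_s, p_g, M_ss, M_gg, M_sg, dxs, dxg):
    """Second-order traveltime around a reference source/receiver pair.

    dxs, dxg are horizontal source and receiver displacements (m);
    t0 is the traveltime at the reference position and is not counted
    among the kinematic parameters.
    """
    return (t0 + p_s @ dxs + p_g @ dxg
            + 0.5 * dxs @ M_ss @ dxs
            + 0.5 * dxg @ M_gg @ dxg
            + dxs @ M_sg @ dxg)

# Illustrative values only (not taken from any dataset).
p_s = np.array([2e-4, 0.0])                # s/m
p_g = np.array([3e-4, 0.0])                # s/m
M_ss = 1e-7 * np.eye(2)                    # s/m^2
M_gg = 1e-7 * np.eye(2)                    # s/m^2
M_sg = np.zeros((2, 2))                    # s/m^2

n_params = count_parameters()
t = traveltime(1.0, p_s, p_g, M_ss, M_gg, M_sg,
               dxs=np.array([10.0, 0.0]), dxg=np.array([20.0, 0.0]))
print(n_params, t)
```

Under these assumptions the count is 4 slowness components, 6 entries from the two symmetric matrices and 4 entries from the mixed matrix.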
