HELSEVEL-Gode og effektive helse-, omsorgs- og velferdstjenester

Use of deep learning and Big Data in the Norwegian Breast Cancer Screening Program

Alternative title: Bruk av maskinlesing i Mammografiprogrammet (Use of machine reading in BreastScreen Norway)

Awarded: NOK 8.0 mill.

Breast cancer is the most common cancer among women in Norway and worldwide. Preventing breast cancer is difficult, but early detection through mammographic screening is an effective way to reduce breast cancer deaths. The standard screening procedure in BreastScreen Norway takes x-ray images of each breast (mammograms) from two different angles. Two radiologists independently read all mammograms. If either radiologist identifies suspicious findings, a consensus meeting is held to decide whether the woman should be recalled for further assessment. Most women attending screening do not have any signs of breast cancer: 93% of the screening mammograms show no signs of the disease. As a result, today's radiologists spend a substantial amount of their time reading normal mammograms. With recent advances in artificial intelligence, more specifically machine learning, there is a potential to improve the screening program at a reduced cost, thereby allowing radiologists to focus on women who may have signs of breast cancer. The aim of this project is to develop an automated model to read mammograms by combining machine-based image analysis with radiological knowledge and expertise. The project received a "pilot dataset" from the University Hospital of Northern Norway in 2018. In 2020, mammograms and screening information from St. Olavs hospital and Møre and Romsdal hospital trust were transferred to the Cancer Registry and the project, and in 2021 the project received data from the University Hospital of Northern Norway. The majority of the data is to be collected from four hospital trusts in Helse Sør-Øst. After a lot of work, all the mammograms from this region have been transferred to the Cancer Registry, which is currently processing them by checking the quality of the content and removing personally identifiable information.
The dataset will be transferred to the project's data processor, the Norwegian Computing Center, shortly. Preparing for data collection and receiving the data have been more time-consuming than expected because of legal aspects and the procedures for extracting the data at some of the health trusts. The data received during 2020 and 2021 has made it possible to train models from scratch on the Norwegian data. Until now we have had to rely on a pre-trained model. The new model has been compared with the earlier pre-trained model. The pre-trained model has the advantage of having also been trained on images annotated with the specific location of the cancers, while our Norwegian model only has access to annotations at image level. Even so, the comparison shows that the model trained from scratch on Norwegian data gives the best performance, so the advantage of training directly on the Norwegian data outweighs that of having detailed annotations. The work done in 2021 is very important for the project, and the model that the Norwegian Computing Center has trained from scratch can be developed further as soon as larger amounts of data are received. We have also continued the work on plans for integrating artificial intelligence into mammographic screening, focusing on describing the characteristics and requirements of the methods and of the screening service that will affect the choices that can be made.

Breast cancer is the most common cancer among women in Norway and worldwide. Since the cause of breast cancer is not known, mammographic screening is offered as secondary prevention, aimed at reducing mortality from the disease. About 500 000 women have participated in the Norwegian Breast Cancer Screening Program every second year since the program was made nationwide in 2005. The radiologists spend a substantial amount of time interpreting screening mammograms of healthy women: about 7% of the exams are discussed at consensus, 3-4% of the women are recalled for further assessment, and 20% of those recalled, 0.6% of the attending women, are diagnosed with breast cancer. An additional 0.17% are diagnosed before the next screening. By exploiting machine learning in the process, the aim is to reduce the recall rate and the rate of missed screen-detected and interval breast cancers, and to obtain knowledge that can help us reduce overdiagnosis and overtreatment, which in turn will reduce the disease-specific mortality. By achieving this goal, we will be able to reduce the human and financial burden of mammographic screening. A realistic ambition is that 100 women will get a breast cancer diagnosis 1-4 years earlier. An on-the-fly control of the image quality may reduce the number of recalls of 1 200 women annually and also improve the image quality in the further assessment. The project takes advantage of three main factors. First, there has been a revolution in machine learning, including on medical images, where machine learning together with experts performs better than human expertise alone. Second, our database of mammograms is at least 20 times larger than that of any published study, which is critical for machine learning. Third, we will focus on questions that are relevant for the Norwegian Breast Cancer Screening Program. The project will build world-leading competence that is also valuable for other screening programs and other medical applications.

Activity:

HELSEVEL-Gode og effektive helse-, omsorgs- og velferdstjenester