

Learning from Deep Learning

Alternative title: Læring fra Dyp Læring

Awarded: NOK 9.6 mill.

How do neural networks utilize plain microscopic images of cancer tissue to predict patient outcome years later, and do so better than all established prognostic markers? Which features of the cancer tissue reveal the patient's outcome, and what does this tell us about the biology of metastases? The primary objective of this project is to provide, at least in part, answers to these questions.

The focus of WP1 is to perform tissue profiling by multiparametric biomarker assessments, to be aligned with the neural network predictions on a cell-by-cell level in both colorectal and prostate cancer (CRC and PCa). During the first year of the project, we selected 455 prostatectomies from patients with distinctly good or poor prognosis. These have been sectioned, stained with both HE and Feulgen staining, and documented on Aperio, XR and HighRes scanners. The HE-stained sections are the input to our DoMore classifier, but also provide structural information for tumour identification, histological classification, quantification of the stroma/epithelium ratio, and mitotic count. Feulgen is a DNA-specific stain that allows, for example, segmentation of cell nuclei and quantification of DNA content, nucleoli and chromatin organization. A parallel section has been prepared for HE staining followed by sequential immune staining. For PCa we apply a targeted approach in which relevant prognostic targets involved in genomic instability and cell division, genes known to be altered in cancers, and genes with an established role in PCa were quantified at the transcript and protein expression levels. Those showing prognostic relevance have now been validated in external patient cohorts, resulting in the selection of 8 candidate targets for sequential staining, and a protocol for sequential immune staining of these targets has been optimized. In a parallel tissue block, we have performed low-pass (30 GB) whole-genome sequencing.
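Feulgen stoichiometry underlies such DNA ploidy measurements: the relative DNA content of a nucleus is typically estimated as the integrated optical density (IOD) summed over its segmented mask. The following is a minimal, self-contained sketch of that idea, not the project's actual pipeline; the `integrated_optical_density` helper and all numeric values are illustrative.

```python
import numpy as np

def integrated_optical_density(intensity, mask, background=255.0):
    """Estimate relative DNA content of one nucleus in a Feulgen-stained
    image as the integrated optical density: the sum over the nucleus
    mask of -log10(I / I_background)."""
    pixels = intensity[mask].astype(float)
    pixels = np.clip(pixels, 1.0, background)  # avoid log(0)
    od = -np.log10(pixels / background)
    return od.sum()

# Toy example: nucleus B has twice the per-pixel optical density of
# nucleus A at equal area, so its IOD should be roughly twice as large.
img = np.full((20, 20), 255.0)
mask_a = np.zeros_like(img, dtype=bool); mask_a[2:8, 2:8] = True
mask_b = np.zeros_like(img, dtype=bool); mask_b[12:18, 12:18] = True
img[mask_a] = 255.0 * 10 ** -0.2   # OD 0.2 per pixel
img[mask_b] = 255.0 * 10 ** -0.4   # OD 0.4 per pixel
iod_a = integrated_optical_density(img, mask_a)
iod_b = integrated_optical_density(img, mask_b)
ratio = iod_b / iod_a  # ≈ 2.0
```

In practice, IODs are normalized against a reference population of known-diploid cells before being interpreted as ploidy.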
Amplifications and deletions in genomic regions that have been reported to be associated with patient outcome will be validated. For the CRC material we have chosen a more explorative approach, in which the transcriptome will be analysed in situ using spatial biology techniques. This will allow in-depth analysis of the regions categorized as prognostically important by the DoMore classifier. We are currently performing pilots to confirm the appropriate platform.

In WP2 we aim, among other things, to develop and train nuclei-based prognostic markers and to investigate how increased image resolution affects a marker's accuracy. We have segmented cell nuclei at both 40x and 60x resolution. Segmentation of nuclei in 60x Feulgen scans produced accurate results that will be useful for Nucleotyping and DNA ploidy analyses. Segmentation in 40x HE scans yielded a greater number of nuclei, but with coarser segmentations; these have been used to generate approximately 87 million high-resolution images of cells from 60x HE scans, which will be used for training prognostic markers once their image quality has been examined. Training of prognostic markers with high-resolution images is based on 60x image tiles. For comparison, we have generated tiles at 60x, 40x and 10x resolution from the HighRes 60x HE scans, as well as 40x and 10x tiles from the corresponding Aperio 40x HE scans. Previously trained DoMore networks were applied directly to these tiles without retraining, and there was considerable variation in score values between the two scanners. The next step is to retrain the DoMore classifier on 60x-resolution tiles.

In WP3 we use the tools provided by WP2 to explain the DoMore classifier using the biological information obtained in WP1. We have developed a new method to align slides with cell-to-cell accuracy. It works across different scanners, resolutions, stains and even different file formats. It operates by detecting the same features in the two scans and elastically stretching one image to fit the other. Each cell is analysed against all available measurements, and reports and statistics are created, including counts of cells in mitosis, immune scores, and whether a cell lies in stroma, tumour or background areas. Heatmaps can be generated for ploidy and immune scores; as more methods are developed, their results will be included in the generated statistics and reports. HistoTracker (previously MicroTracker) has been optimized for speed, so all operations are quick and responsive enough for intuitive, interactive what-if sessions. Very compact binary data formats have been devised to load the large amounts of data as quickly as possible. All functionality of HistoTracker has now been integrated into our general image-viewer application, SeeMore, which has mechanisms for loading modular add-ons that perform specific tasks, such as using neural networks to find cells in mitosis or to detect out-of-focus areas.
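Nuclei segmentation of the kind described for WP2 is often bootstrapped with classical image-analysis baselines. The sketch below shows one such standard baseline, marker-controlled watershed on a distance transform, using scikit-image and SciPy; it is an illustration on synthetic data, not the segmentation method used in the project.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def segment_nuclei(gray):
    """Toy marker-controlled watershed: threshold, distance transform,
    one marker per local maximum, watershed to split touching nuclei."""
    mask = gray > threshold_otsu(gray)
    distance = ndi.distance_transform_edt(mask)
    coords = peak_local_max(distance, labels=mask, min_distance=5)
    markers = np.zeros(gray.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    return watershed(-distance, markers, mask=mask)

# Synthetic image with two touching "nuclei" (overlapping disks).
img = np.zeros((40, 60))
yy, xx = np.mgrid[:40, :60]
img[(yy - 20) ** 2 + (xx - 20) ** 2 < 100] = 1.0
img[(yy - 20) ** 2 + (xx - 38) ** 2 < 100] = 1.0
labels = segment_nuclei(img)
n_nuclei = labels.max()  # the two touching disks are separated
```

Deep-learning segmenters replace the threshold-and-watershed steps, but the idea of splitting touching nuclei via per-nucleus markers carries over.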
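The cell-to-cell alignment in WP3 combines feature detection with elastic warping. As a much-simplified, rigid-only illustration of the first step, a global offset between two scans of the same tissue can be recovered with `skimage.registration.phase_cross_correlation`; the project's method additionally handles different scanners, stains and non-linear deformation, none of which this sketch attempts.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.registration import phase_cross_correlation

# Synthetic "scan": smoothed random texture stands in for tissue structure.
rng = np.random.default_rng(0)
reference = ndi.gaussian_filter(rng.random((128, 128)), sigma=3)

# Second "scan" of the same tissue, displaced by a known integer offset.
true_shift = (7, -4)
moving = np.roll(reference, true_shift, axis=(0, 1))

# Recover the offset; a real pipeline would follow this rigid step with
# an elastic (non-linear) refinement to reach cell-level accuracy.
shift, error, _ = phase_cross_correlation(reference, moving)
aligned = np.roll(moving, shift.astype(int), axis=(0, 1))
# shift ≈ (-7, 4): applying it undoes the displacement
```

Phase correlation only recovers a translation; matching local features (as the project's method does) is what makes alignment robust across resolutions and stains.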

A rapidly increasing number of publications demonstrate high performance of convolutional neural networks in medical diagnostics. However, few of these systems have reached the clinic, an important reason being their "black box" nature: the basis of their predictions is not traceable by humans. We have recently developed the DoMore-v1 classifier, a deep learning system for predicting patient outcome in colorectal cancer (Skrede et al., Lancet 2020;395:350-360). When independently tested on 1122 patients, the classifier outperformed all current prognostic biomarkers. However, an intriguing question remains: how can neural networks utilize plain microscopic tissue images to predict patient outcome years later? Recent developments have provided methods that enhance our ability to visualize the image areas of particular importance to network predictions. However, it seems unlikely that a satisfactory understanding can be obtained without supplementing such information with concrete biomedical information, including biochemical measurements at the cellular level. This latter information must be provided in image form, aligned to the images showing the areas of particular importance to outcome predictions. Thus, the first work packages in the project develop methods for simultaneously displaying various biochemical markers in pathological images, and further develop tools for identifying the same cells in different images. The next packages aim at adapting and testing visualization methods to make them suitable for revealing the features in pathological images that prediction networks utilize, while the last project activity is to connect important image characteristics with the biomedical markers. Through this combined biological and machine learning approach, we intend to provide methods that make our own and similar networks more transparent and thus easier for clinicians to use, as well as to improve our understanding of the biological mechanisms underlying metastatic disease.
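One family of visualization methods of the kind alluded to above is occlusion sensitivity: mask image regions one at a time and record how much the model's output score drops, so that large drops mark regions the model relies on. The following is a dependency-free sketch with a stand-in scoring function; in the real setting the score would come from the trained classifier applied to tissue tiles.

```python
import numpy as np

def occlusion_map(image, score_fn, patch=8, baseline=0.0):
    """Occlusion sensitivity: replace one patch at a time with a baseline
    value and record the drop in the model's score for that patch."""
    h, w = image.shape
    base = score_fn(image)
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = baseline
            heat[i // patch, j // patch] = base - score_fn(occluded)
    return heat

# Stand-in "classifier" whose score depends only on the top-left corner;
# the occlusion map should light up exactly there.
score = lambda img: img[:8, :8].sum()
img = np.ones((32, 32))
heat = occlusion_map(img, score)
# heat[0, 0] == 64, all other cells == 0
```

Gradient-based saliency methods serve the same purpose more cheaply, but occlusion maps are model-agnostic, which makes them a common sanity check.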

Funding scheme:

FRIPROSJEKT-FRIPROSJEKT

Funding Sources