IKTFORSKNING-IKTFORSKNING

AI-Powered Testing Infrastructure for Cancer Registry System

Alternative title: Use of AI in testing of infrastructure in a cancer registry (Norwegian: Bruk av AI i testing av infrastruktur i et kreftregister)

Awarded: NOK 6.9 mill.

The Cancer Registration Support System (CaReSS) at the Cancer Registry of Norway (CRN) has been handling information on cancer since 1952. The system has gone through several upgrades and is now fully digital. All health personnel diagnosing or treating cancer patients are obliged by law to report to the Cancer Registry, and data from other registries are collected in addition. All patient information is submitted to the system and curated by trained medical coders, who piece it together into patient histories: timelines of each patient's diagnostic workup, treatments, and follow-up. Hundreds of rules have been defined to validate the data. These rules are manually reviewed to ensure that patient histories are correct. New rules are constantly introduced, and existing rules are frequently revised in response to new medical findings. Dependencies between rules, such as ordering and timing, also exist. Automated checking of the rules is an ideal way to improve the quality of patient histories. However, not only the rules change over time, but also the data, as diagnostics and treatments improve. This leads to continuous evolution of CaReSS's key software components to ensure that the data structures and rules are correctly specified and implemented. Thus, there is a need for a cost-effective, systematic, and automated testing layer, i.e., new testing methods implemented in a software testing tool and a test execution infrastructure. The project's objective was to remove the manual effort in ensuring the validity and quality of the evolving CaReSS, thereby improving the quality and reliability of the data and statistics produced for its stakeholders. We achieved this through automatic and cost-effective testing and modeling of CaReSS and its evolution. The research and its results were disseminated at international conferences and published in top software engineering conferences and journals.

We implemented a generic infrastructure to deploy, run, and analyse test suites for the CRN. Because testing at the CRN is currently done manually, any automatic testing solution improves on current practice. We demonstrated that state-of-the-art solutions from research, although they improve on the CRN's current practice, are hardly effective in testing the medical rules in CaReSS. We developed a targeted solution, EvoGURI, and showed that it is highly effective, testing 58% and 70% of the rules with a passing and failing result, respectively. By improving on the CRN's current API specification, we can improve testing effectiveness by +50 percentage points (pp) and +60 pp for passing and failing rules, respectively, when using EvoGURI. We further developed a methodology to assess how effective the tests are at finding errors and found that we can find 96% of such errors when using EvoGURI. We explored an alternative approach using large language models (LLMs), with the aim of testing with more realistic data, and showed that the success rate of creating realistic inputs is 82% and 66% for passing and failing rules, respectively, at a relatively low cost of 15 s. With these realistic inputs, we are able to detect up to 14 and 11 potential errors in rules that should pass and fail, respectively, in CaReSS. In terms of efficiency, we demonstrated that adding a traditional machine learning (ML) and a quantum ML classifier to EvoMaster (EvoGURI's underlying testing tool) reduces execution cost by 31% while still achieving the same effectiveness. It is important to emphasize that this cost reduction is relative to a fully automatic solution, which has not previously existed at the CRN. Compared with the cost of manually creating tests, 30 min per rule (the 113 rules in the newest version would require 56.5 h), our solutions require at most 1 h to achieve these results.
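The manual-versus-automated comparison at the end of the paragraph above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated there (113 rules, 30 minutes per rule, at most 1 hour of automated effort); the variable and function names are illustrative and do not come from the project.

```python
# Figures quoted in the project summary; all names here are illustrative.
RULES_IN_NEWEST_VERSION = 113
MANUAL_MINUTES_PER_RULE = 30
AUTOMATED_HOURS_UPPER_BOUND = 1.0  # "at most 1 h" for the automated solutions


def manual_effort_hours(rule_count: int, minutes_per_rule: float) -> float:
    """Total manual test-creation effort in hours, assuming one test per rule."""
    return rule_count * minutes_per_rule / 60


manual_hours = manual_effort_hours(RULES_IN_NEWEST_VERSION, MANUAL_MINUTES_PER_RULE)

# Because the automated figure is an upper bound, this ratio is a lower bound.
speedup_lower_bound = manual_hours / AUTOMATED_HOURS_UPPER_BOUND

print(f"Manual effort: {manual_hours} h")
print(f"Automated effort: at most {AUTOMATED_HOURS_UPPER_BOUND} h")
print(f"Reduction factor: at least {speedup_lower_bound}x")
```

Under these assumptions the manual estimate reproduces the 56.5 h quoted above, so the automated solutions are at least roughly 56 times cheaper in elapsed effort. This is a simplification: it treats manual test creation and automated test generation as directly comparable units of work.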

The Cancer Registry of Norway (CRN) collects data about cancer patients, e.g., about diagnostics, treatment, and follow-up, and provides this data and statistics to its end users, e.g., researchers, patients, doctors, and health authorities. Decisions regarding how this data should be coded rely on a semi-automated and interactive decision support system, the Cancer Registration Support System (CaReSS). The system uses a patient's test results and treatments and makes decisions based on medical coding rules, often using machine learning. CaReSS evolves through, e.g., the addition, deletion, and modification of rules in response to new treatments, improved diagnostics, new medical results and tests, and new diagnostic standards. The CRN also continuously updates CaReSS with more advanced versions of its machine learning algorithms. Thus, the implementation of CaReSS undergoes continuous change, which warrants continuous, cost-effective testing as it evolves. A well-tested CaReSS will prevent the system from delivering inaccurate statistics and data to its end users. Inaccurate or imprecise data produced by CaReSS have significant adverse effects on the scientific results produced by researchers. Likewise, inaccurate or imprecise statistics produced by CaReSS will significantly impact the decisions made by patients, hospitals, and policymakers. The planned innovation is a state-of-the-art test infrastructure, including new testing techniques, that supports cost-effective and systematic testing of CaReSS and significantly improves its quality, and the quality of the data and statistics it produces, by dealing with the continuous evolution and unpredictable behavior of machine learning algorithms. This will positively affect all its end users, including researchers, patients, doctors, and government officials. Deploying the new testing infrastructure at the CRN will lead to significant improvements in its current testing practice.

Publications from Cristin

No publications found

Funding scheme:

IKTFORSKNING-IKTFORSKNING