ICD-10 diagnosis codes play an important role in Norwegian and Swedish hospitals as well as worldwide. After each patient contact, the physician must register one or more diagnosis or procedure codes that describe the assessment or treatment the patient has received. The ICD-10 coding system includes over 30,000 codes and can be both difficult and time consuming to use. Clinicians often register incorrect codes or omit appropriate ones. The registered codes are used at a higher level to measure activity in the hospitals, so incorrect and inadequate coding makes it difficult to bill correctly and to plan resources efficiently in the healthcare system. In the ClinCode project, we will investigate how Computer-Assisted Coding (CAC) can increase the quality of ICD-10 coding without adding to the workload of the clinicians. The hospitals' electronic patient records already contain extensive material in the form of manually coded discharge letters. In this project we will study the medical specialty gastric surgery. The patient records will be used to develop a computer program that can read an individual patient's record and automatically suggest appropriate ICD-10 codes to the responsible clinician. The program will analyze both free-text notes and structured data that have already been manually coded with ICD-10 codes, and learn from this data using natural language processing and deep learning, methods often referred to as Artificial Intelligence. The finished CAC program will be integrated into DIPS Arena's electronic patient record system.
Work carried out in 2021.
A Swedish gastro dataset containing 6 000 discharge summaries and their manually assigned ICD-10 diagnosis codes has been extracted. This dataset has been used to train a Swedish deep learning BERT language model. A first CAC prototype and demonstrator (ICD-10-coder) for Swedish has been constructed. The results are described in three published articles. One of the papers contains experiments carried out in parallel on comparable Spanish and Swedish discharge summaries.
The Swedish gastro dataset has been de-identified. The resulting dataset, Stockholm EPR Gastro ICD-10 Corpus, has been shared with the research group in Tromsø, Norway.
Stockholm EPR Gastro ICD-10 Corpus is currently being recoded by an expert ICD-10 coder to improve the training data.
An ethical application for Norwegian gastro data has been approved, and we are currently waiting for the Norwegian hospital to decide on access to the data for the project.
This project will develop a Computer-Assisted Coding (CAC) tool for ICD-10 coding of Norwegian electronic health records, specifically the discharge letter. The Norwegian version of ICD-10 has over 20 000 diagnosis codes divided into 22 chapters. The codes are organized hierarchically in three levels, and each code has a textual description. The physician assigns one or several of these ICD-10 codes to the patient's discharge summary for both medical and administrative purposes. Assigning codes is difficult and time consuming, and it has been shown that up to 41 percent of manually assigned main diagnoses may be wrong or sometimes missing.
The CAC tool will learn from previously manually coded discharge summaries and patient notes, both free text and structured information such as laboratory results and blood values, and assign ICD-10 codes to unseen discharge summaries. The CAC tool will use Artificial Intelligence methods such as Natural Language Processing and Deep Learning to learn and predict codes. Ranked ICD-10 code suggestions will be presented to the physician, who can select among them and assign the correct code.
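To make the idea of ranked code suggestions concrete, the following is a minimal sketch and not the project's actual method. It replaces the planned deep learning model with a simple bag-of-words similarity so that it runs with the Python standard library alone. The function names (`tokenize`, `train`, `suggest`) and the three example summaries are invented for illustration; the codes K35, K40 and K80 are used only as toy labels.

```python
import math
import re
from collections import Counter, defaultdict

# Invented toy data: (discharge summary text, manually assigned codes).
# A real system would train on thousands of de-identified summaries.
TOY_EXAMPLES = [
    ("acute appendicitis treated with laparoscopic appendectomy", ["K35"]),
    ("gallstones with biliary colic treated with cholecystectomy", ["K80"]),
    ("inguinal hernia repaired with mesh", ["K40"]),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-zåäöæø]+", text.lower())


def train(examples):
    """Collect word counts per code from previously coded summaries."""
    counts = defaultdict(Counter)
    for text, codes in examples:
        tokens = tokenize(text)
        for code in codes:
            counts[code].update(tokens)
    return counts


def suggest(counts, text, top_k=3):
    """Return up to top_k (code, score) pairs, best match first.

    The score is the cosine similarity between the word counts of the
    new summary and the word counts collected for each code.
    """
    query = Counter(tokenize(text))
    query_norm = math.sqrt(sum(v * v for v in query.values()))
    scores = {}
    for code, code_counts in counts.items():
        dot = sum(query[t] * code_counts[t] for t in query)
        code_norm = math.sqrt(sum(v * v for v in code_counts.values()))
        norm = query_norm * code_norm
        scores[code] = dot / norm if norm else 0.0
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]


model = train(TOY_EXAMPLES)
ranked = suggest(model, "patient with acute appendicitis and abdominal pain")
for code, score in ranked:
    print(f"{code}\t{score:.2f}")
```

In this sketch the physician would see the codes in descending order of score and pick the correct one. The CAC tool described here would produce the ranking from a trained language model rather than from word overlap, but the interface (a summary in, a ranked list of candidate codes out) is the same.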
Presenting ranked suggestions to the physician will enable fast, high-quality semi-automatic ICD-10 coding. The CAC tool can also be used by hospital management and health authorities to assess coding quality on historical data.
The CAC tool will reduce coders' workload and improve overall code quality. High-quality codes enable efficient data reuse and promote fast knowledge generation in healthcare, thereby laying the foundations for personalized medicine, more efficient health management and, subsequently, higher quality of care.
The project builds on the clinical text mining research activities started in the incubator project, NorKlinTekst (HNF1395-18), funded by Helse Nord in 2017.