2025 Poster Presentations
P189: EVALUATION OF NAMED ENTITY RECOGNITION FOR AUTOMATED EXTRACTION OF PRESENT TUMOUR SIZE AND PERSONAL NAMES FROM RADIOLOGY REPORTS USING SPACY
Lorena Garcia-Foncillas Macias¹; Theodore Barfoot¹; Tom Vercauteren¹; Jonathan Shapey, PhD, FRCS (Neuro.Surg)²; ¹School of Biomedical Engineering & Imaging Sciences, King's College London; ²Department of Neurosurgery, King's College Hospital NHS Foundation Trust
Analysing tumour growth rates from MRI scans is crucial in diagnosing and treating meningiomas. Tumour sizes are typically documented in radiology reports, which consist of unstructured text. Natural Language Processing (NLP) can be used to extract structured information from such text. De-identifying clinical records by locating and processing personal names is also essential for collaborative research. This study explored training a Named Entity Recognition (NER) model from the spaCy library[1] to extract present tumour sizes and personal names from radiology reports.
A total of 16,856 radiology reports from patients diagnosed with meningioma at King’s College Hospital between 2011 and 2020 were obtained. After excluding duplicates, 9,175 reports remained. From this dataset, a random subset of 400 reports was manually annotated for present tumour size and for the personal names of radiologists and patients. Cross-validation was conducted on 85% of the annotated reports, and the remaining 15% were reserved for testing. For personal name identification, we also compared an NER model trained specifically for that task with a rule-based algorithm, which served as our baseline. Both methods were evaluated at the token level (i.e. word and punctuation level) using precision (true positive predictions divided by all positive predictions), recall (true positive predictions divided by all positive ground-truth tokens), and F1-score (harmonic mean of precision and recall). The NER models and the rule-based algorithm were also assessed at the report level, as the proportion of reports whose predictions matched the manual annotations exactly.
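The token-level and report-level metrics defined above can be illustrated with a minimal Python sketch. It is not the study's evaluation code: the label names ("SIZE", "PERSON", "O"), the function names, and the toy data are hypothetical, and macro-averaging is assumed here to be an unweighted mean over the entity labels.

```python
from typing import List, Sequence, Tuple

Report = List[str]  # one label per token; label names below are hypothetical


def token_prf(gold: Sequence[Report], pred: Sequence[Report],
              label: str) -> Tuple[float, float, float]:
    """Token-level precision, recall and F1 for a single label."""
    tp = fp = fn = 0
    for g_report, p_report in zip(gold, pred):
        for g, p in zip(g_report, p_report):
            if p == label and g == label:
                tp += 1          # predicted and annotated
            elif p == label:
                fp += 1          # predicted but not annotated
            elif g == label:
                fn += 1          # annotated but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(gold: Sequence[Report], pred: Sequence[Report],
                  labels: Sequence[str]) -> Tuple[float, float, float]:
    """Unweighted mean of per-label precision, recall and F1 (an assumption)."""
    scores = [token_prf(gold, pred, label) for label in labels]
    n = len(scores)
    return (sum(s[0] for s in scores) / n,
            sum(s[1] for s in scores) / n,
            sum(s[2] for s in scores) / n)


def report_accuracy(gold: Sequence[Report], pred: Sequence[Report]) -> float:
    """Fraction of reports whose predicted labels match the annotation exactly."""
    perfect = sum(1 for g, p in zip(gold, pred) if g == p)
    return perfect / len(gold)


# Toy data: two reports, one with a missed tumour-size token.
gold = [["O", "SIZE", "SIZE", "O"], ["PERSON", "PERSON", "O"]]
pred = [["O", "SIZE", "O", "O"], ["PERSON", "PERSON", "O"]]

print(macro_average(gold, pred, ["SIZE", "PERSON"]))
print(report_accuracy(gold, pred))
```

In this toy example a single missed token lowers token-level recall only partially but makes the whole report count as incorrect, which is why report-level accuracy is the stricter of the two measures.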
For name extraction, the rule-based method achieved a macro-averaged token-level precision of 0.96, recall of 0.99, and F1-score of 0.975, whereas the specifically trained NER model attained 0.998 ± 0.004 for each token-level metric. The report-level accuracy (proportion of annotated reports predicted perfectly) was 0.836 for the rule-based method, compared with 0.971 ± 0.032 for the NER model. When a model was trained for both tasks (identifying present tumour size and extracting names), performance decreased slightly relative to the name extraction-specific model: macro-averaged token-level precision was 0.948 ± 0.029, recall 0.97 ± 0.021, and F1-score 0.958 ± 0.011. At the report level, 0.817 ± 0.029 of the reports were perfectly predicted against our ground truth of manual annotations.
These findings illustrate the potential for training models on highly heterogeneous data to tackle the complex task of extracting tumour sizes from radiology reports and to develop automated pipelines for analysis. Such automation could identify patients with increased tumour growth or changes in diagnosis, facilitate follow-up assessments, and contribute to creating more structured data. This project is also a step towards the automated generation of radiology reports from MRI scans.
Figure, panels A) and B): Labelled radiology reports with ground truth (GT) and predictions (PRED) from the spaCy model for present tumour size and personal name extraction. Identifying information is replaced with “[PERSON]” and “[DATE]” placeholders for privacy.
[1] “spaCy · Industrial-strength Natural Language Processing in Python.” Accessed: Jul. 26, 2024. [Online]. Available: https://spacy.io/