Automatic structuring of radiology reports with on-premise open-source large language models

Piotr Woźnicki; Caroline Laqua; Ina Fiku; Amar Hekalo; Daniel Truhn; Sandy Engelhardt; Jakob Kather; Sebastian Foersch; Tugba Akinci D’Antonoli; Daniel Pinto dos Santos; Bettina Baeßler; Fabian Christopher Laqua

doi:10.1007/s00330-024-11074-y

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Piotr Woźnicki, University Hospital of Würzburg (Author)
Caroline Laqua, University Hospital of Würzburg (Author)
Ina Fiku, University Hospital of Würzburg (Author)
Amar Hekalo, University Hospital of Würzburg (Author)
Daniel Truhn, University Hospital Aachen (Author)
Sandy Engelhardt, Deutsches Zentrum für Herz-Kreislaufforschung (DZHK), University Hospital Heidelberg (Author)
Jakob Kather, Department of Internal Medicine I, Else Kröner Fresenius Center for Digital Health, National Center for Tumor Diseases (NCT) Heidelberg (Author)
Sebastian Foersch, University Medical Center Mainz (Author)
Tugba Akinci D’Antonoli, Cantonal Hospital Baselland (Author)
Daniel Pinto dos Santos, University of Cologne, University Hospital Frankfurt (Author)
Bettina Baeßler, University Hospital of Würzburg (Author)
Fabian Christopher Laqua, University Hospital of Würzburg (Author)

Abstract

Objectives: Structured reporting enhances comparability, readability, and content detail. Large language models (LLMs) could convert free text into structured data without disrupting radiologists’ reporting workflow. This study evaluated an on-premise, privacy-preserving LLM for automatically structuring free-text radiology reports.

Materials and methods: We developed an approach to controlling the LLM output, ensuring the validity and completeness of structured reports produced by a locally hosted Llama-2-70B-chat model. A dataset of de-identified narrative chest radiograph (CXR) reports was compiled retrospectively. It included 202 English reports from the publicly available MIMIC-CXR dataset and 197 German reports from our university hospital. A senior radiologist prepared a detailed, fully structured reporting template with 48 question-answer pairs. All reports were independently structured by the LLM and two human readers. Bayesian inference (Markov chain Monte Carlo sampling) was used to estimate the distribution of the Matthews correlation coefficient (MCC), with [−0.05, 0.05] as the region of practical equivalence (ROPE).

Results: The LLM generated valid structured reports in all cases, achieving an average MCC of 0.75 (94% HDI: 0.70–0.80) and F1 score of 0.70 (0.70–0.80) for English, and 0.66 (0.62–0.70) and 0.68 (0.64–0.72) for German reports, respectively. The MCC differences between the LLM and the two human readers were within the ROPE for both languages: 0.01 (−0.05 to 0.07) and 0.01 (−0.05 to 0.07) for English, and −0.01 (−0.07 to 0.05) and 0.00 (−0.06 to 0.06) for German, indicating approximately comparable performance.

Conclusion: Locally hosted, open-source LLMs can automatically structure free-text radiology reports with approximately human accuracy. However, the understanding of semantics varied across languages and imaging findings.

Key Points:
Question: Why has structured reporting not been widely adopted in radiology despite clear benefits, and how can we improve this?
Findings: A locally hosted large language model successfully structured narrative reports, showing variation between languages and findings.
Critical relevance: Structured reporting provides many benefits, but its integration into the clinical routine is limited. Automating the extraction of structured information from radiology reports enables the capture of structured data while allowing the radiologist to maintain their reporting workflow.
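
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how a Matthews correlation coefficient is computed from binary confusion counts and how a difference between two readers can be compared against the ROPE of [−0.05, 0.05]. It is not the authors' Bayesian pipeline (no MCMC sampling is performed here); the function names `mcc` and `within_rope` and all counts are hypothetical and chosen only for illustration.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: return 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def within_rope(value: float, rope: tuple[float, float] = (-0.05, 0.05)) -> bool:
    """True if a point estimate lies inside the region of practical equivalence."""
    low, high = rope
    return low <= value <= high


# Hypothetical confusion counts for one template question (not study data).
llm_mcc = mcc(tp=40, tn=45, fp=5, fn=10)
reader_mcc = mcc(tp=42, tn=44, fp=6, fn=8)

difference = llm_mcc - reader_mcc
print(f"LLM MCC: {llm_mcc:.2f}, reader MCC: {reader_mcc:.2f}")
print(f"Difference within ROPE: {within_rope(difference)}")
```

MCC ranges from −1 (total disagreement) through 0 (chance-level agreement) to 1 (perfect agreement), which makes it less sensitive to class imbalance than accuracy when most template answers are negative.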

Details

Original language	English
Article number	e232335
Pages (from-to)	2018–2029
Number of pages	12
Journal	European Radiology
Volume	35
Issue number	4
Early online date	10 Oct 2024
Publication status	Published - Apr 2025
Peer-reviewed	Yes

External IDs

PubMed	39390261
ORCID	/0000-0002-3730-5348/work/198594627

Research Portal of the TU Dresden
