Data Extraction from Free-Text Reports on Mechanical Thrombectomy in Acute Ischemic Stroke Using ChatGPT: A Retrospective Analysis

Nils C. Lehnen; Franziska Dorn; Isabella C. Wiest; Hanna Zimmermann; Alexander Radbruch; Jakob Nikolas Kather; Daniel Paech

doi:10.1148/radiol.232741

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Nils C. Lehnen, University of Bonn, German Center for Neurodegenerative Diseases (DZNE) (Author)
Franziska Dorn, University of Bonn (Author)
Isabella C. Wiest, Heidelberg University (Author)
Hanna Zimmermann, Ludwig Maximilian University of Munich (Author)
Alexander Radbruch, University of Bonn, German Center for Neurodegenerative Diseases (DZNE) (Author)
Jakob Nikolas Kather, Else Kröner Fresenius Center for Digital Health (Author)
Daniel Paech, University of Bonn, Harvard University (Author)

Abstract

Background: Procedural details of mechanical thrombectomy in patients with ischemic stroke are important predictors of clinical outcome and are collected for prospective studies or national stroke registries. To date, these data are collected manually by human readers, a labor-intensive task that is prone to errors.
Purpose: To evaluate the use of the large language models (LLMs) GPT-4 and GPT-3.5 to extract data from neuroradiology reports on mechanical thrombectomy in patients with ischemic stroke.
Materials and Methods: This retrospective study included consecutive reports from patients with ischemic stroke who underwent mechanical thrombectomy between November 2022 and September 2023 at institution 1 and between September 2016 and December 2019 at institution 2. A set of 20 reports was used to optimize the prompt, and the ability of the LLMs to extract procedural data from the reports was compared using the McNemar test. Data manually extracted by an interventional neuroradiologist served as the reference standard.
Results: A total of 100 internal reports from 100 patients (mean age, 74.7 years ± 13.2 [SD]; 53 female) and 30 external reports from 30 patients (mean age, 72.7 years ± 13.5; 18 male) were included. All reports were successfully processed by GPT-4 and GPT-3.5. Of 2800 data entries, 2631 (94.0% [95% CI: 93.0, 94.8]; range per category, 61%–100%) data points were correctly extracted by GPT-4 without the need for further postprocessing. With 1788 of 2800 correct data entries, GPT-3.5 produced fewer correct data entries than did GPT-4 (63.9% [95% CI: 62.0, 65.6]; range per category, 14%–99%; P < .001). For the external reports, GPT-4 extracted 760 of 840 (90.5% [95% CI: 88.3, 92.4]) correct data entries, while GPT-3.5 extracted 539 of 840 (64.2% [95% CI: 60.8, 67.4]; P < .001).
Conclusion: Compared with GPT-3.5, GPT-4 more frequently extracted correct procedural data from free-text reports on mechanical thrombectomy performed in patients with ischemic stroke.
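
The percentages reported in the Results can be re-derived from the raw counts. The sketch below recomputes each proportion and attaches a Wilson score interval as one common choice of 95% CI. The abstract does not state which interval method the authors used, so the Wilson interval is an assumption here and its bounds may differ from the published ones in the last digit. The function and variable names are illustrative only. The totals are also consistent with 28 data entries per report (2800/100 and 840/30), although the abstract does not state the number of categories.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method,
    not confirmed by the abstract)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Counts of correct data entries, taken from the abstract.
counts = {
    "GPT-4, internal": (2631, 2800),
    "GPT-3.5, internal": (1788, 2800),
    "GPT-4, external": (760, 840),
    "GPT-3.5, external": (539, 840),
}


def summarize(table):
    """Return {label: (percent_correct, ci_low_percent, ci_high_percent)}."""
    out = {}
    for label, (correct, total) in table.items():
        low, high = wilson_interval(correct, total)
        out[label] = (100 * correct / total, 100 * low, 100 * high)
    return out


if __name__ == "__main__":
    for label, (pct, low, high) in summarize(counts).items():
        print(f"{label}: {pct:.1f}% (Wilson 95% CI: {low:.1f}, {high:.1f})")
```

The comparison between the two models in the study used the McNemar test, which needs the paired per-entry outcomes (entries that one model got right and the other got wrong). Those discordant counts are not given in the abstract, so the P values cannot be reproduced from the totals alone.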

Details

Original language	English
Article number	e232741
Journal	Radiology
Volume	311
Issue number	1
Early online date	16 Apr 2024
Publication status	Published - Apr 2024
Peer-reviewed	Yes

External IDs

PubMed	38625006

Keywords

ASJC Scopus subject areas

Radiology, Nuclear Medicine and Imaging

Keywords

Thrombectomy, Prospective Studies, Humans, Stroke/diagnostic imaging, Female, Male, Aged, Retrospective Studies, Ischemic Stroke/diagnostic imaging

Research Portal of the TU Dresden