Data Extraction from Free-Text Reports on Mechanical Thrombectomy in Acute Ischemic Stroke Using ChatGPT: A Retrospective Analysis

Research output: Contribution to journal, Research article, Contributed, peer-review

Contributors

  • Nils C. Lehnen - University of Bonn, German Center for Neurodegenerative Diseases (DZNE) (Author)
  • Franziska Dorn - University of Bonn (Author)
  • Isabella C. Wiest - Heidelberg University (Author)
  • Hanna Zimmermann - Ludwig Maximilian University of Munich (Author)
  • Alexander Radbruch - University of Bonn, German Center for Neurodegenerative Diseases (DZNE) (Author)
  • Jakob Nikolas Kather - Else Kröner Fresenius Center for Digital Health (Author)
  • Daniel Paech - University of Bonn, Harvard University (Author)

Abstract

Background: Procedural details of mechanical thrombectomy in patients with ischemic stroke are important predictors of clinical outcome and are collected for prospective studies or national stroke registries. To date, these data are collected manually by human readers, a labor-intensive task that is prone to errors.
Purpose: To evaluate the use of the large language models (LLMs) GPT-4 and GPT-3.5 to extract data from neuroradiology reports on mechanical thrombectomy in patients with ischemic stroke.
Materials and Methods: This retrospective study included consecutive reports from patients with ischemic stroke who underwent mechanical thrombectomy between November 2022 and September 2023 at institution 1 and between September 2016 and December 2019 at institution 2. A set of 20 reports was used to optimize the prompt, and the ability of the LLMs to extract procedural data from the reports was compared using the McNemar test. Data manually extracted by an interventional neuroradiologist served as the reference standard.
Results: A total of 100 internal reports from 100 patients (mean age, 74.7 years ± 13.2 [SD]; 53 female) and 30 external reports from 30 patients (mean age, 72.7 years ± 13.5; 18 male) were included. All reports were successfully processed by GPT-4 and GPT-3.5. Of 2800 data entries, 2631 (94.0% [95% CI: 93.0, 94.8]; range per category, 61%–100%) data points were correctly extracted by GPT-4 without the need for further postprocessing. With 1788 of 2800 correct data entries, GPT-3.5 produced fewer correct data entries than did GPT-4 (63.9% [95% CI: 62.0, 65.6]; range per category, 14%–99%; P < .001). For the external reports, GPT-4 extracted 760 of 840 (90.5% [95% CI: 88.3, 92.4]) correct data entries, while GPT-3.5 extracted 539 of 840 (64.2% [95% CI: 60.8, 67.4]; P < .001).
Conclusion: Compared with GPT-3.5, GPT-4 more frequently extracted correct procedural data from free-text reports on mechanical thrombectomy performed in patients with ischemic stroke.
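
The percentages in the Results can be recomputed from the reported counts. The following is a minimal sketch and not the authors' analysis code: the abstract does not state which confidence-interval method was used, so the Wilson score interval here is an assumption, and its bounds may differ slightly from the published ones. The names `wilson_interval`, `summarize`, and `reported` are illustrative.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, ~95% for z=1.96)."""
    p = correct / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


# Counts of correct data entries as stated in the abstract: (correct, total).
reported = {
    "GPT-4 (2800 entries)": (2631, 2800),
    "GPT-3.5 (2800 entries)": (1788, 2800),
    "GPT-4 (external reports)": (760, 840),
    "GPT-3.5 (external reports)": (539, 840),
}


def summarize(counts):
    """Map each label to (accuracy, lower bound, upper bound)."""
    result = {}
    for label, (correct, total) in counts.items():
        lower, upper = wilson_interval(correct, total)
        result[label] = (correct / total, lower, upper)
    return result


summary = summarize(reported)
for label, (accuracy, lower, upper) in summary.items():
    print(f"{label}: {accuracy:.1%} (95% CI: {lower:.1%}, {upper:.1%})")
```

The McNemar comparison between the two models is not reproduced here, because it requires the paired per-entry results (how often one model was correct where the other was not), which the abstract does not report.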

Details

Original language: English
Article number: e232741
Journal: Radiology
Volume: 311
Issue number: 1
Early online date: 16 Apr 2024
Publication status: Published - Apr 2024
Peer-reviewed: Yes

External IDs

PubMed 38625006

Keywords

  • Thrombectomy, Prospective Studies, Humans, Stroke/diagnostic imaging, Female, Male, Aged, Retrospective Studies, Ischemic Stroke/diagnostic imaging