Guidelines for Reporting Studies on Large Language Models in Radiology: An International Delphi Expert Survey

Publication: Contribution to journal; Review article; Contributed; Peer-reviewed

Contributors

  • Jonathan Kottlors, Universität zu Köln (Author)
  • Andra Iza Iuga, Universität zu Köln (Author)
  • Christian Bluethgen, Universität Zürich (Author)
  • Keno Bressem, Technische Universität München (Author)
  • Jakob Nikolas Kather, Medizinische Klinik und Poliklinik I, Else Kröner Fresenius Zentrum für Digitale Gesundheit, Nationales Zentrum für Tumorerkrankungen (NCT) Heidelberg (Author)
  • Linda Moy, New York University (Author)
  • Christoph Wald, Lahey Clinic Medical Center, Burlington (Author)
  • Wei Wang, Sun Yat-Sen University (Author)
  • Tianming Liu, University of Georgia (Author)
  • Erik Ranschaert, St. Nikolaus Hospital, Ghent University (Author)
  • Thomas Dratsch, Universität zu Köln (Author)
  • Jens Kleesiek, Universitätsklinikum Essen (Author)
  • Roman Johannes Gertz, Universität zu Köln (Author)
  • Pranav Rajpurkar, Harvard University (Author)
  • Arash Bedayat, University of California at Los Angeles (Author)
  • Matthias A. Fink, Universität Heidelberg (Author)
  • Almut Zeeck, Albert-Ludwigs-Universität Freiburg (Author)
  • Akshay Chaudhari, Stanford University (Author)
  • Tarik Alkasab, Massachusetts General Hospital (Author)
  • Honghan Wu, University of Glasgow, University College London (Author)
  • Felix Nensa, Universitätsklinikum Essen (Author)
  • Benyou Wang, The Chinese University of Hong Kong, Shenzhen (Author)
  • Nils Große Hokamp, Universität zu Köln (Author)
  • Kai Roman Laukamp, Universität zu Köln (Author)
  • Thorsten Persigehl, Universität zu Köln (Author)
  • David Maintz, Universität zu Köln (Author)
  • Daniel Truhn, Rheinisch-Westfälische Technische Hochschule Aachen (Author)
  • Simon Lennartz, Universität zu Köln (Author)

Abstract

Large language models (LLMs) have transformative potential in radiology, including textual summaries, diagnostic decision support, proofreading, and image analysis. However, the rapid increase in studies investigating these models, along with the lack of standardized LLM-specific reporting practices, affects reproducibility, reliability, and clinical applicability. To address this, reporting guidelines for LLM studies in radiology were developed using a two-step process. First, a systematic review of LLM studies in radiology was conducted across PubMed, IEEE Xplore, and the ACM Digital Library, covering publications between May 2023 and March 2024. Of 511 screened studies, 57 were included to identify relevant aspects for the guidelines. Then, in a Delphi process, 20 international experts developed the final list of items for inclusion. Items that reached consensus as relevant were summarized into a structured checklist containing 32 items across six key categories: general information and data input; prompting and fine-tuning; performance metrics; ethics and data transparency; implementation, risks, and limitations; and further/optional aspects. The final FLAIR (Framework for LLM Assessment in Radiology) checklist aims to standardize reporting of LLM studies in radiology, fostering transparency, reproducibility, comparability, and clinical applicability to enhance clinical translation and patient care.

Details

Original language: English
Article number: e250913
Journal: Radiology
Volume: 318
Issue number: 2
Publication status: Published, Feb. 2026
Peer-review status: Yes

External IDs

PubMed 41631991
ORCID /0000-0002-3730-5348/work/211722515

Keywords