Current applications and challenges in large language models for patient care: a systematic review

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

  • Felix Busch, Technische Universität München (Author)
  • Lena Hoffmann, Charité – Universitätsmedizin Berlin (Author)
  • Christopher Rueger, Charité – Universitätsmedizin Berlin (Author)
  • Elon H.C. van Dijk, Leiden University, Sir Charles Gairdner Hospital (Author)
  • Rawen Kader, University College London (Author)
  • Esteban Ortiz-Prado, Universidad de las Américas - Ecuador (Author)
  • Marcus R. Makowski, Technische Universität München (Author)
  • Luca Saba, University Hospital of Cagliari (Author)
  • Martin Hadamitzky, Technische Universität München (Author)
  • Jakob Nikolas Kather, Else Kröner Fresenius Zentrum für Digitale Gesundheit, Nationales Zentrum für Tumorerkrankungen (NCT) Heidelberg (Author)
  • Daniel Truhn, Rheinisch-Westfälische Technische Hochschule Aachen (Author)
  • Renato Cuocolo, University of Salerno (Author)
  • Lisa C. Adams, Technische Universität München (Author)
  • Keno K. Bressem, Technische Universität München (Author)

Abstract

Background: The introduction of large language models (LLMs) into clinical practice promises to improve patient education and empowerment, thereby personalizing medical care and broadening access to medical knowledge. Despite the popularity of LLMs, there is a significant gap in systematized information on their use in patient care. Therefore, this systematic review aims to synthesize current applications and limitations of LLMs in patient care. Methods: We systematically searched 5 databases for qualitative, quantitative, and mixed methods articles on LLMs in patient care published between 2022 and 2023. From 4349 initial records, 89 studies across 29 medical specialties were included. Quality assessment was performed using the Mixed Methods Appraisal Tool 2018. A data-driven convergent synthesis approach was applied for thematic syntheses of LLM applications and limitations using free line-by-line coding in Dedoose. Results: We show that most studies investigate Generative Pre-trained Transformers (GPT)-3.5 (53.2%, n = 66 of 124 different LLMs examined) and GPT-4 (26.6%, n = 33/124) in answering medical questions, followed by patient information generation, including medical text summarization or translation, and clinical documentation. Our analysis delineates two primary domains of LLM limitations: design and output. Design limitations include 6 second-order and 12 third-order codes, such as lack of medical domain optimization, data transparency, and accessibility issues, while output limitations include 9 second-order and 32 third-order codes, for example, non-reproducibility, non-comprehensiveness, incorrectness, unsafety, and bias. Conclusions: This review systematically maps LLM applications and limitations in patient care, providing a foundational framework and taxonomy for their implementation and evaluation in healthcare settings.

Details

Original language: English
Article number: 26
Number of pages: 13
Journal: Communications medicine
Volume: 5 (2025)
Issue number: 1
Publication status: Published - 21 Jan. 2025
Peer-review status: Yes

External IDs

PubMed 39838160
Mendeley c38a0e4c-3cbc-3c1f-8105-4f703ddaacda
ORCID /0000-0002-3730-5348/work/198594664