Detection of suicidality from medical text using privacy-preserving large language models

Isabella Catharina Wiest; Falk Gerrik Verhees; Dyke Ferber; JieFu Zhu; Michael Bauer; Ute Lewitzka; Andrea Pfennig; Pavol Mikolas; Jakob Nikolas Kather

doi:10.1192/bjp.2024.134

Research output: Contribution to journal › Review article › Contributed › peer-review

Contributors

Isabella Catharina Wiest - Else Kröner Fresenius Center for Digital Health, Universitätsmedizin Mannheim (Joint first author)
Falk Gerrik Verhees - Department of Psychiatry and Psychotherapy (Joint first author)
Dyke Ferber - Else Kröner Fresenius Center for Digital Health, National Center for Tumor Diseases (NCT) Heidelberg, University Hospital Heidelberg (Author)
JieFu Zhu - Else Kröner Fresenius Center for Digital Health (Author)
Michael Bauer - Department of Psychiatry and Psychotherapy (Author)
Ute Lewitzka - Department of Psychiatry and Psychotherapy (Author)
Andrea Pfennig - Department of Psychiatry and Psychotherapy (Author)
Pavol Mikolas - Department of Psychiatry and Psychotherapy (Joint last author)
Jakob Nikolas Kather - Else Kröner Fresenius Center for Digital Health, Department of Internal Medicine I, National Center for Tumor Diseases (NCT) Heidelberg, University Hospital Heidelberg (Joint last author)

Abstract

Background
Attempts to use artificial intelligence (AI) in psychiatric disorders show moderate success, highlighting the potential of incorporating information from clinical assessments to improve the models. This study focuses on using large language models (LLMs) to detect suicide risk from medical text in psychiatric care.

Aims
To extract information about suicidality status from the admission notes in electronic health records (EHRs) using privacy-sensitive, locally hosted LLMs, specifically evaluating the efficacy of Llama-2 models.

Method
We compared the performance of several variants of the open-source LLM Llama-2 in extracting suicidality status from 100 psychiatric reports against a ground truth defined by human experts, assessing accuracy, sensitivity, specificity and F1 score across different prompting strategies.
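
As an illustration of the four metrics named in the Method, the following minimal Python sketch computes accuracy, sensitivity, specificity and F1 score from paired binary labels. It is not the authors' evaluation code: the names `BinaryMetrics` and `binary_metrics` and the toy labels are hypothetical, and it assumes suicidality status is coded as a simple present/absent boolean per report.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f1: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined
    # (for example, sensitivity when there are no positive cases).
    return numerator / denominator if denominator else float("nan")


def binary_metrics(truth: Sequence[bool], predicted: Sequence[bool]) -> BinaryMetrics:
    """Compare model output with expert labels (True = suicidality present)."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives

    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        sensitivity=_ratio(tp, tp + fn),       # recall on positive cases
        specificity=_ratio(tn, tn + fp),       # recall on negative cases
        f1=_ratio(2 * tp, 2 * tp + fp + fn),   # harmonic mean of precision and recall
    )


# Hypothetical toy labels, not data from the study.
toy_truth = [True, True, False, False]
toy_predicted = [True, False, False, False]
toy_result = binary_metrics(toy_truth, toy_predicted)
print(toy_result)
```

In a comparison across prompting strategies, a function like this would be called once per model and prompt variant against the same expert labels, so that the resulting figures are directly comparable.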

Results
A German fine-tuned Llama-2 model showed the highest accuracy (87.5%), sensitivity (83.0%) and specificity (91.8%) in identifying suicidality, with significant improvements in sensitivity and specificity across various prompt designs.

Conclusions
The study demonstrates the capability of LLMs, particularly Llama-2, in accurately extracting information on suicidality from psychiatric records while preserving data privacy. This suggests their application in surveillance systems for psychiatric emergencies and in improving the clinical management of suicidality through better systematic quality control and research.

Details

Original language	English
Pages (from-to)	532-537
Number of pages	6
Journal	British Journal of Psychiatry
Volume	225
Issue number	6
Early online date	5 Nov 2024
Publication status	Published - 1 Dec 2024
Peer-reviewed	Yes

External IDs

ORCID	/0000-0002-3415-5583/work/171553716
ORCID	/0000-0002-2666-859X/work/171553753
ORCID	/0000-0002-3974-7115/work/171553864
ORCID	/0000-0002-6808-2968/work/171554061
Scopus	85208752532
PubMed	39497458

Keywords

Sustainable Development Goals

SDG 3 - Good Health and Well-being