Benchmarking vision-language models for diagnostics in emergency and critical care settings
Publication: Contribution to journal › Research article › Contributed › Peer-reviewed
Abstract
The applicability of vision-language models (VLMs) for acute care in emergency and intensive care units remains underexplored. Using a multimodal dataset of diagnostic questions involving medical images and clinical context, we benchmarked several small open-source VLMs against GPT-4o. While open models demonstrated limited diagnostic accuracy (up to 40.4%), GPT-4o significantly outperformed them (68.1%). Findings highlight the need for specialized training and optimization to improve open-source VLMs for acute care applications.
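The abstract only summarizes the benchmark, but the reported accuracies imply a straightforward per-question scoring over diagnostic items pairing an image with clinical context. Below is a minimal sketch of that kind of accuracy computation, assuming a case-insensitive exact-match criterion; the `DiagnosticItem` class, `score_accuracy` function, and toy data are illustrative assumptions, not the authors' actual evaluation pipeline.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticItem:
    """One multimodal question: an image reference, clinical context, and the reference diagnosis."""
    image_path: str
    clinical_context: str
    reference_diagnosis: str


def score_accuracy(predictions: list[str], items: list[DiagnosticItem]) -> float:
    """Fraction of model answers matching the reference diagnosis (case-insensitive exact match)."""
    if not items:
        return 0.0
    correct = sum(
        pred.strip().lower() == item.reference_diagnosis.strip().lower()
        for pred, item in zip(predictions, items)
    )
    return correct / len(items)


if __name__ == "__main__":
    # Hypothetical benchmark items and parsed VLM outputs, for illustration only.
    items = [
        DiagnosticItem("cxr_001.png", "Acute dyspnea and hypoxia after central line placement", "pneumothorax"),
        DiagnosticItem("ct_014.png", "Sudden severe headache, nuchal rigidity", "subarachnoid hemorrhage"),
    ]
    predictions = ["Pneumothorax", "ischemic stroke"]
    print(f"Diagnostic accuracy: {score_accuracy(predictions, items):.1%}")  # 50.0%
```

In practice, benchmarks of this kind often relax exact matching (e.g., multiple-choice selection or synonym-aware matching); the exact-match rule here is only the simplest workable assumption.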
Details
| Original language | English |
|---|---|
| Article number | 423 |
| Journal | npj digital medicine |
| Volume | 8 |
| Issue number | 1 |
| Publication status | Published - Dec. 2025 |
| Peer-reviewed | Yes |
External IDs
| ORCID | /0000-0002-3730-5348/work/198594679 |
|---|---|