Multimodal Deep Learning for Integrating Chest Radiographs and Clinical Parameters: A Case for Transformers

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

  • Firas Khader, RWTH Aachen University (Author)
  • Gustav Müller-Franzes, RWTH Aachen University (Author)
  • Tianci Wang, RWTH Aachen University (Author)
  • Tianyu Han, RWTH Aachen University (Author)
  • Soroosh Tayebi Arasteh, RWTH Aachen University (Author)
  • Christoph Haarburger, Ocumeda GmbH (Author)
  • Johannes Stegmaier, RWTH Aachen University (Author)
  • Keno Bressem, Charité – Universitätsmedizin Berlin (Author)
  • Christiane Kuhl, RWTH Aachen University (Author)
  • Sven Nebelung, RWTH Aachen University (Author)
  • Jakob Nikolas Kather, Else Kröner Fresenius Center for Digital Health, RWTH Aachen University, University of Leeds, National Center for Tumor Diseases (NCT) Heidelberg (Author)
  • Daniel Truhn, RWTH Aachen University (Author)

Abstract

Background: Clinicians consider both imaging and nonimaging data when diagnosing diseases; however, current machine learning approaches primarily consider data from a single modality.

Purpose: To develop a neural network architecture capable of integrating multimodal patient data and compare its performance to models incorporating a single modality for diagnosing up to 25 pathologic conditions.

Materials and Methods: In this retrospective study, imaging and nonimaging patient data were extracted from the Medical Information Mart for Intensive Care (MIMIC) database and an internal database comprised of chest radiographs and clinical parameters in patients in the intensive care unit (ICU) (January 2008 to December 2020). The MIMIC and internal data sets were each split into training (n = 33 893, n = 28 809), validation (n = 740, n = 7203), and test (n = 1909, n = 9004) sets. A novel transformer-based neural network architecture was trained to diagnose up to 25 conditions using nonimaging data alone, imaging data alone, or multimodal data. Diagnostic performance was assessed using area under the receiver operating characteristic curve (AUC) analysis.

Results: The MIMIC and internal data sets included 36 542 patients (mean age, 63 years ± 17 [SD]; 20 567 male patients) and 45 016 patients (mean age, 66 years ± 16; 27 577 male patients), respectively. The multimodal model showed improved diagnostic performance for all pathologic conditions. For the MIMIC data set, the mean AUC was 0.77 (95% CI: 0.77, 0.78) when both chest radiographs and clinical parameters were used, compared with 0.70 (95% CI: 0.69, 0.71; P < .001) for only chest radiographs and 0.72 (95% CI: 0.72, 0.73; P < .001) for only clinical parameters. These findings were confirmed on the internal data set.

Conclusion: A model trained on imaging and nonimaging data outperformed models trained on only one type of data for diagnosing multiple diseases in patients in an ICU setting.
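For readers unfamiliar with the metric, the mean AUC reported in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `auc` and `mean_auc` and the toy inputs are hypothetical, and the sketch assumes the mean is an unweighted average of per-condition AUCs. It computes each AUC directly from its definition, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for one condition via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(
    label_columns: Sequence[Sequence[int]],
    score_columns: Sequence[Sequence[float]],
) -> float:
    """Unweighted mean of per-condition AUCs (assumed averaging scheme)."""
    aucs = [auc(y, s) for y, s in zip(label_columns, score_columns)]
    return sum(aucs) / len(aucs)


# Toy example with two hypothetical conditions and four patients each.
labels = [[0, 0, 1, 1], [0, 1, 0, 1]]
scores = [[0.1, 0.4, 0.35, 0.8], [0.2, 0.9, 0.3, 0.7]]
result = mean_auc(labels, scores)
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable for test sets of the size described above.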

Details

Original language: English
Article number: e230806
Journal: Radiology
Volume: 309
Issue number: 1
Publication status: Published - Oct 2023
Peer-reviewed: Yes

External IDs

PubMed 37787671

Keywords