A multimodal comparison of latent denoising diffusion probabilistic models and generative adversarial networks for medical image synthesis

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

  • Gustav Müller-Franzes, Universitätsklinikum Aachen (Author)
  • Jan Moritz Niehues, Universitätsklinikum Aachen (Author)
  • Firas Khader, Universitätsklinikum Aachen (Author)
  • Soroosh Tayebi Arasteh, Universitätsklinikum Aachen (Author)
  • Christoph Haarburger, Ocumeda GmbH (Author)
  • Christiane Kuhl, Universitätsklinikum Aachen (Author)
  • Tianci Wang, Universitätsklinikum Aachen (Author)
  • Tianyu Han, Universitätsklinikum Aachen (Author)
  • Teresa Nolte, Universitätsklinikum Aachen (Author)
  • Sven Nebelung, Universitätsklinikum Aachen (Author)
  • Jakob Nikolas Kather, Else Kröner Fresenius Zentrum für Digitale Gesundheit, Universitätsklinikum Aachen (Author)
  • Daniel Truhn, Universitätsklinikum Aachen (Author)

Abstract

Although generative adversarial networks (GANs) can produce large datasets, their limited diversity and fidelity have recently been addressed by denoising diffusion probabilistic models (DDPMs), which have demonstrated superiority in natural image synthesis. In this study, we introduce Medfusion, a conditional latent DDPM designed for medical image generation, and evaluate its performance against GANs, which currently represent the state of the art. Medfusion was trained and compared with StyleGAN-3 using fundoscopy images from the AIROGS dataset, radiographs from the CheXpert dataset, and histopathology images from the CRCDX dataset. Based on previous studies, Progressively Growing GAN (ProGAN) and Conditional GAN (cGAN) were used as additional baselines on the CheXpert and CRCDX datasets, respectively. Medfusion exceeded GANs in terms of diversity (recall), achieving better scores of 0.40 compared to 0.19 in the AIROGS dataset, 0.41 compared to 0.02 (cGAN) and 0.24 (StyleGAN-3) in the CRCDX dataset, and 0.32 compared to 0.17 (ProGAN) and 0.08 (StyleGAN-3) in the CheXpert dataset. Furthermore, Medfusion exhibited equal or higher fidelity (precision) across all three datasets. Our study shows that Medfusion constitutes a promising alternative to GAN-based models for generating high-quality medical images, with improved diversity and fewer artifacts in the generated images.
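The diversity (recall) and fidelity (precision) scores reported in the abstract follow the manifold-based precision/recall formulation commonly used to evaluate generative image models: a generated sample counts toward precision if it falls inside the k-nearest-neighbour hypersphere of some real sample, and vice versa for recall. The sketch below is a minimal, illustrative implementation of that idea; the helper names (`knn_radii`, `manifold_coverage`) and the random Gaussian "features" are assumptions for demonstration only — in practice the metric is computed on feature embeddings from a pretrained network, not raw pixels.

```python
import numpy as np

def knn_radii(feats, k=3):
    # Pairwise Euclidean distances within one feature set; the radius of
    # each point is the distance to its k-th nearest neighbour.
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    d_sorted = np.sort(d, axis=1)
    return d_sorted[:, k]  # column 0 is the zero self-distance

def manifold_coverage(query, ref, k=3):
    # Fraction of query points lying inside the k-NN hypersphere
    # of at least one reference point.
    radii = knn_radii(ref, k)
    d = np.linalg.norm(query[:, None, :] - ref[None, :, :], axis=-1)
    return float(np.mean(np.any(d <= radii[None, :], axis=1)))

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 16))  # stand-in for real-image features
fake = rng.normal(size=(200, 16))  # stand-in for generated-image features

precision = manifold_coverage(fake, real)  # fidelity: fakes on real manifold
recall = manifold_coverage(real, fake)     # diversity: reals on fake manifold
```

Since both feature sets are drawn from the same distribution here, precision and recall both come out high; for a mode-collapsed generator, recall would drop sharply while precision could stay high, which is exactly the asymmetry the paper's recall comparison exposes.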

Details

Original language: English
Article number: 12098
Number of pages: 10
Journal: Scientific Reports
Volume: 13 (2023)
Issue number: 1
Publication status: Published - 26 July 2023
Peer-review status: Yes

External IDs

PubMed: 37495660
PubMed Central: PMC10372018

Keywords

  • Artifacts, Diffusion, Mental Recall, Models, Statistical, Ophthalmoscopy, Image Processing, Computer-Assisted