A multimodal comparison of latent denoising diffusion probabilistic models and generative adversarial networks for medical image synthesis

Research output: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

  • Gustav Müller-Franzes, University Hospital Aachen (Author)
  • Jan Moritz Niehues, University Hospital Aachen (Author)
  • Firas Khader, University Hospital Aachen (Author)
  • Soroosh Tayebi Arasteh, University Hospital Aachen (Author)
  • Christoph Haarburger, Ocumeda GmbH (Author)
  • Christiane Kuhl, University Hospital Aachen (Author)
  • Tianci Wang, University Hospital Aachen (Author)
  • Tianyu Han, University Hospital Aachen (Author)
  • Teresa Nolte, University Hospital Aachen (Author)
  • Sven Nebelung, University Hospital Aachen (Author)
  • Jakob Nikolas Kather, Else Kröner Fresenius Center for Digital Health, University Hospital Aachen (Author)
  • Daniel Truhn, University Hospital Aachen (Author)

Abstract

Although generative adversarial networks (GANs) can produce large datasets, their limited diversity and fidelity have recently been addressed by denoising diffusion probabilistic models (DDPMs), which have demonstrated superiority in natural image synthesis. In this study, we introduce Medfusion, a conditional latent DDPM designed for medical image generation, and evaluate its performance against GANs, which currently represent the state of the art. Medfusion was trained and compared with StyleGAN-3 using fundoscopy images from the AIROGS dataset, radiographs from the CheXpert dataset, and histopathology images from the CRCDX dataset. Based on previous studies, Progressively Growing GAN (ProGAN) and Conditional GAN (cGAN) were used as additional baselines on the CheXpert and CRCDX datasets, respectively. Medfusion exceeded GANs in terms of diversity (recall), achieving better scores of 0.40 compared to 0.19 in the AIROGS dataset, 0.41 compared to 0.02 (cGAN) and 0.24 (StyleGAN-3) in the CRCDX dataset, and 0.32 compared to 0.17 (ProGAN) and 0.08 (StyleGAN-3) in the CheXpert dataset. Furthermore, Medfusion exhibited equal or higher fidelity (precision) across all three datasets. Our study shows that Medfusion constitutes a promising alternative to GAN-based models for generating high-quality medical images, leading to improved diversity and fewer artifacts in the generated images.
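The precision (fidelity) and recall (diversity) scores reported in the abstract are commonly computed in a learned feature space, where a generated sample counts toward precision if it lies within the estimated real-data manifold, and a real sample counts toward recall if it lies within the generated-data manifold. As a minimal NumPy sketch of this idea, assuming image features have already been extracted with a pretrained network (the k-NN manifold estimate follows the improved precision/recall formulation; function names are illustrative, not from the paper's code):

```python
import numpy as np

def knn_radii(feats, k=3):
    # For each point, the distance to its k-th nearest neighbour within
    # the same set defines the radius of its manifold hypersphere.
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, k]  # index 0 is the zero self-distance

def manifold_membership(query, ref, radii):
    # Fraction of query points that fall inside at least one of the
    # hyperspheres centred on the reference points.
    d = np.linalg.norm(query[:, None, :] - ref[None, :, :], axis=-1)
    return float(np.mean((d <= radii[None, :]).any(axis=1)))

def precision_recall(real_feats, fake_feats, k=3):
    # Precision: generated samples covered by the real manifold (fidelity).
    # Recall: real samples covered by the generated manifold (diversity).
    precision = manifold_membership(fake_feats, real_feats, knn_radii(real_feats, k))
    recall = manifold_membership(real_feats, fake_feats, knn_radii(fake_feats, k))
    return precision, recall
```

With features drawn from the same distribution, both scores approach 1; a mode-collapsed generator would keep precision high while recall drops, which is the failure mode the recall comparison in the abstract probes.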

Details

Original language: English
Article number: 12098
Number of pages: 10
Journal: Scientific Reports
Volume: 13 (2023)
Issue number: 1
Publication status: Published - 26 Jul 2023
Peer-reviewed: Yes

External IDs

PubMed: 37495660
PubMed Central: PMC10372018

Keywords


  • Artifacts
  • Diffusion
  • Mental Recall
  • Models, Statistical
  • Ophthalmoscopy
  • Image Processing, Computer-Assisted
