Using histopathology latent diffusion models as privacy-preserving dataset augmenters improves downstream classification performance

Jan M. Niehues; Gustav Müller-Franzes; Yoni Schirris; Sophia Janine Wagner; Michael Jendrusch; Matthias Kloor; Alexander T. Pearson; Hannah Sophie Muti; Katherine J. Hewitt; Gregory P. Veldhuizen; Laura Zigutyte; Daniel Truhn; Jakob Nikolas Kather

doi:10.1016/j.compbiomed.2024.108410

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

Jan M. Niehues, Else Kröner Fresenius Zentrum für Digitale Gesundheit (Author)
Gustav Müller-Franzes, Rheinisch-Westfälische Technische Hochschule Aachen (Author)
Yoni Schirris, Technische Universität Dresden, Netherlands Cancer Institute, University of Amsterdam (Author)
Sophia Janine Wagner, Technische Universität Dresden, Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt, Technische Universität München (Author)
Michael Jendrusch, Universität Heidelberg (Author)
Matthias Kloor, Universität Heidelberg (Author)
Alexander T. Pearson, The University of Chicago (Author)
Hannah Sophie Muti, Else Kröner Fresenius Zentrum für Digitale Gesundheit, Rheinisch-Westfälische Technische Hochschule Aachen (Author)
Katherine J. Hewitt, Else Kröner Fresenius Zentrum für Digitale Gesundheit, Rheinisch-Westfälische Technische Hochschule Aachen (Author)
Gregory P. Veldhuizen, Else Kröner Fresenius Zentrum für Digitale Gesundheit, Rheinisch-Westfälische Technische Hochschule Aachen (Author)
Laura Zigutyte, Else Kröner Fresenius Zentrum für Digitale Gesundheit (Author)
Daniel Truhn, Rheinisch-Westfälische Technische Hochschule Aachen (Author)
Jakob Nikolas Kather, Medizinische Klinik und Poliklinik I, Else Kröner Fresenius Zentrum für Digitale Gesundheit, University of Leeds, Universität Heidelberg (Author)

Abstract

Latent diffusion models (LDMs) have emerged as a state-of-the-art image generation method, outperforming earlier generative adversarial networks (GANs) in training stability and image quality. In computational pathology, generative models are valuable for data sharing and data augmentation. However, the impact of LDM-generated images on histopathology tasks, compared with that of GAN-generated images, has not been systematically studied. We trained three LDMs and a StyleGAN2 model on histology tiles from nine colorectal cancer (CRC) tissue classes. The LDMs were 1) a fine-tuned version of Stable Diffusion v1.4, 2) an LDM built on a Kullback-Leibler (KL) autoencoder (KLF8-DM), and 3) an LDM built on a vector-quantized (VQ) autoencoder (VQF8-DM). We assessed image quality through expert ratings, dimensionality reduction methods, and distribution similarity measures, and evaluated the impact of the generated images on training a multiclass tissue classifier. Additionally, we investigated image memorization in the KLF8-DM and StyleGAN2 models. All models produced high-quality images, with the KLF8-DM achieving the best Fréchet Inception Distance (FID) and expert rating scores for complex tissue classes. For simpler classes, the VQF8-DM and StyleGAN2 models performed better. Image memorization was negligible for both the StyleGAN2 and KLF8-DM models. Classifiers trained on a mix of KLF8-DM-generated and real images achieved a 4% improvement in overall classification accuracy, highlighting the usefulness of these images for dataset augmentation. Our systematic study of generative methods showed that KLF8-DM produces the highest-quality images with negligible image memorization. The higher classifier performance on the generatively augmented dataset suggests that this augmentation technique can be used to enhance histopathology classifiers for various tasks.
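
The augmentation setting described in the abstract, training a classifier on a mix of generated and real tiles, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `build_augmented_set`, the `synth_per_real` ratio, the `(label, path)` sample layout, and the class labels used in the usage lines are all hypothetical, since the record does not state how the two sources were combined or in what proportion.

```python
import random
from collections import defaultdict


def build_augmented_set(real, synthetic, synth_per_real=1.0, seed=0):
    """Mix real and synthetic (label, path) samples class by class.

    synth_per_real is an assumed knob: the number of synthetic tiles
    added per real tile within each class. The source gives no ratio.
    """
    rng = random.Random(seed)

    # Group the synthetic samples by tissue class.
    synth_by_class = defaultdict(list)
    for label, path in synthetic:
        synth_by_class[label].append((label, path))

    # Count the real samples per class.
    real_counts = defaultdict(int)
    for label, _ in real:
        real_counts[label] += 1

    # Keep every real sample, then add a per-class share of synthetic ones.
    mixed = list(real)
    for label, n_real in real_counts.items():
        pool = synth_by_class.get(label, [])
        k = min(len(pool), round(n_real * synth_per_real))
        mixed.extend(rng.sample(pool, k))

    rng.shuffle(mixed)
    return mixed


# Usage with placeholder labels and file names (illustrative only).
real = [("TUM", "real_0.png"), ("TUM", "real_1.png"), ("STR", "real_2.png")]
synthetic = [("TUM", "gen_0.png"), ("TUM", "gen_1.png"),
             ("TUM", "gen_2.png"), ("STR", "gen_3.png")]
train_set = build_augmented_set(real, synthetic, synth_per_real=1.0)
```

Sampling per class keeps the class balance of the real data unchanged, which makes it easier to attribute any change in classifier accuracy to the added images and not to a shifted label distribution.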

Details

Original language	English
Article number	108410
Number of pages	12
Journal	Computers in Biology and Medicine
Volume	175 (2024)
Early online date	4 Apr 2024
Publication status	Published - June 2024
Peer-reviewed	Yes

External IDs

PubMed	38678938
ORCID	/0009-0000-2447-2959/work/175771149
ORCID	/0000-0002-3730-5348/work/198594510

Keywords

Sustainable Development Goals

SDG 3 – Good Health and Well-being

ASJC Scopus subject areas

Health Informatics
Computer Science Applications

Keywords

Artificial intelligence, Colorectal cancer, Computational pathology, Diffusion models, Generative adversarial networks, Generative models, Algorithms, Colorectal Neoplasms/pathology, Humans, Image Processing, Computer-Assisted/methods, Image Interpretation, Computer-Assisted/methods

TU Dresden Research Portal