Benchmarking weakly-supervised deep learning pipelines for whole slide classification in computational pathology

Narmin Ghaffari Laleh; Hannah Sophie Muti; Chiara Maria Lavinia Loeffler; Amelie Echle; Oliver Lester Saldanha; Faisal Mahmood; Ming Y. Lu; Christian Trautwein; Rupert Langer; Bastian Dislich; Roman D. Buelow; Heike Irmgard Grabsch; Hermann Brenner; Jenny Chang-Claude; Elizabeth Alwers; Titus J. Brinker; Firas Khader; Daniel Truhn; Nadine T. Gaisa; Peter Boor; Michael Hoffmeister; Volkmar Schulz; Jakob Nikolas Kather

doi:10.1016/j.media.2022.102474

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Narmin Ghaffari Laleh, RWTH Aachen University (Author)
Hannah Sophie Muti, RWTH Aachen University (Author)
Chiara Maria Lavinia Loeffler, RWTH Aachen University (Author)
Amelie Echle, RWTH Aachen University (Author)
Oliver Lester Saldanha, RWTH Aachen University (Author)
Faisal Mahmood, Harvard University (Author)
Ming Y. Lu, Harvard University (Author)
Christian Trautwein, RWTH Aachen University (Author)
Rupert Langer, Kepler University Hospital (Author)
Bastian Dislich, University of Bern (Author)
Roman D. Buelow, RWTH Aachen University (Author)
Heike Irmgard Grabsch, Maastricht University, University of Leeds (Author)
Hermann Brenner, German Cancer Research Center (DKFZ) (Author)
Jenny Chang-Claude, German Cancer Research Center (DKFZ), University of Hamburg (Author)
Elizabeth Alwers, German Cancer Research Center (DKFZ) (Author)
Titus J. Brinker, German Cancer Research Center (DKFZ) (Author)
Firas Khader, RWTH Aachen University (Author)
Daniel Truhn, RWTH Aachen University (Author)
Nadine T. Gaisa, RWTH Aachen University (Author)
Peter Boor, RWTH Aachen University (Author)
Michael Hoffmeister, German Cancer Research Center (DKFZ) (Author)
Volkmar Schulz, RWTH Aachen University, Fraunhofer Institute for Digital Medicine, Hyperion Hybrid Imaging Systems GmbH (Author)
Jakob Nikolas Kather, Else Kröner Fresenius Center for Digital Health, RWTH Aachen University, University of Leeds (Author)

Abstract

Artificial intelligence (AI) can extract visual information from histopathological slides and yield biological insight and clinical biomarkers. Whole slide images are cut into thousands of tiles, and classification problems are often weakly-supervised: the ground truth is only known for the slide, not for every single tile. In classical weakly-supervised analysis pipelines, all tiles inherit the slide label, while in multiple-instance learning (MIL), only bags of tiles inherit the label. However, it is still unclear how these widely used but markedly different approaches perform relative to each other. We implemented and systematically compared six methods in six clinically relevant end-to-end prediction tasks using data from N=2980 patients for training with rigorous external validation. We tested three classical weakly-supervised approaches with convolutional neural networks and vision transformers (ViT) and three MIL-based approaches with and without an additional attention module. Our results empirically demonstrate that histological tumor subtyping of renal cell carcinoma is an easy task in which all approaches achieve an area under the receiver operating characteristic curve (AUROC) of above 0.9. In contrast, we report significant performance differences for clinically relevant tasks of mutation prediction in colorectal, gastric, and bladder cancer. In these mutation prediction tasks, classical weakly-supervised workflows outperformed MIL-based weakly-supervised methods, which is surprising given their simplicity. This shows that new end-to-end image analysis pipelines in computational pathology should be compared to classical weakly-supervised methods. Also, these findings motivate the development of new methods which combine the elegant assumptions of MIL with the empirically observed higher performance of classical weakly-supervised approaches.
We make all source code publicly available at https://github.com/KatherLab/HIA, allowing easy application of all methods to any similar task.
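
The distinction the abstract draws between classical weak supervision and MIL can be illustrated with a small sketch. This is not the authors' implementation (that lives in the repository linked above). The toy `slides` dictionary, the function names, and the scalar tile scores are all hypothetical. For brevity, the attention pooling here acts on scalar scores, whereas attention-based MIL models typically pool tile feature vectors.

```python
import math

# Hypothetical toy slides: each has one slide-level label and some tiles.
# The tile scores stand in for the output of an arbitrary tile-level model.
slides = {
    "slide_A": {"label": 1, "tile_scores": [0.9, 0.2, 0.8, 0.1]},
    "slide_B": {"label": 0, "tile_scores": [0.3, 0.1, 0.2]},
}


def classical_training_pairs(slides):
    """Classical weak supervision: every tile inherits its slide's label."""
    return [
        (slide_id, tile_idx, info["label"])
        for slide_id, info in slides.items()
        for tile_idx in range(len(info["tile_scores"]))
    ]


def mil_training_pairs(slides):
    """MIL: one labeled example per bag of tiles, not per tile."""
    return [
        (slide_id, list(range(len(info["tile_scores"]))), info["label"])
        for slide_id, info in slides.items()
    ]


def attention_pool(tile_scores, attention_logits):
    """Softmax-weighted average of tile scores (illustrative pooling only)."""
    peak = max(attention_logits)  # subtract the max for numerical stability
    weights = [math.exp(a - peak) for a in attention_logits]
    total = sum(weights)
    return sum(w / total * s for w, s in zip(weights, tile_scores))


if __name__ == "__main__":
    print(len(classical_training_pairs(slides)), "tile-level examples")
    print(len(mil_training_pairs(slides)), "bag-level examples")
    print(attention_pool([0.0, 1.0], [0.0, 0.0]))
```

In the classical setting the number of training examples equals the number of tiles, so tiles that do not show the slide-level phenotype still carry its label. In the MIL setting there is one example per slide, and a pooling step (with or without attention) decides how much each tile contributes.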

Details

Original language	English
Article number	102474
Journal	Medical Image Analysis
Volume	79
Publication status	Published - Jul 2022
Peer-reviewed	Yes

External IDs

PubMed	35588568

Keywords

Sustainable Development Goals

SDG 3 - Good Health and Well-being

Keywords

Artificial intelligence, Computational pathology, Convolutional neural networks, Multiple-Instance Learning, Vision transformers, Weakly-supervised deep learning

Library keywords

004 Computer science
610 Medicine and health

Related content

Erratum to ‘Benchmarking weakly-supervised deep learning pipelines for whole slide classification in computational pathology’ Medical Image Analysis, Volume 79, July 2022, 102474 (Medical Image Analysis (2022) 79, (S1361841522001219), (10.1016/j.media.2022.102474))