Transformer-based biomarker prediction from colorectal cancer histology: A large-scale multicentric study

Research output: Contribution to journal > Research article > Contributed > peer-review

Contributors

  • Helmholtz Zentrum München - German Research Center for Environmental Health
  • University Medical Center Mainz
  • Maastricht University
  • German Cancer Research Center (DKFZ)
  • Johannes Kepler University Linz
  • University of Oxford
  • University of Antwerp
  • City of Hope Comprehensive Cancer Center - Duarte
  • Technion-Israel Institute of Technology
  • University of New South Wales
  • Khalkhal School of Medical Sciences
  • University of Sydney
  • University Hospitals Birmingham NHS Foundation Trust
  • Leeds Teaching Hospitals NHS Trust
  • University of Birmingham
  • University of Zurich
  • University of Glasgow
  • State Vocational Colleges at the University Hospital Erlangen
  • Jinggangshan University
  • Guangzhou Medical University
  • Queen's University Belfast
  • University Hospital Aachen
  • Technical University of Munich
  • Department of Child and Adolescent Psychiatry and Psychotherapy

Abstract

Deep learning (DL) can accelerate the prediction of prognostic biomarkers from routine pathology slides in colorectal cancer (CRC). However, current approaches rely on convolutional neural networks (CNNs) and have mostly been validated on small patient cohorts. Here, we develop a new transformer-based pipeline for end-to-end biomarker prediction from pathology slides by combining a pre-trained transformer encoder with a transformer network for patch aggregation. Our transformer-based approach substantially improves the performance, generalizability, data efficiency, and interpretability as compared with current state-of-the-art algorithms. After training and evaluating on a large multicenter cohort of over 13,000 patients from 16 colorectal cancer cohorts, we achieve a sensitivity of 0.99 with a negative predictive value of over 0.99 for prediction of microsatellite instability (MSI) on surgical resection specimens. We demonstrate that resection specimen-only training reaches clinical-grade performance on endoscopic biopsy tissue, solving a long-standing diagnostic problem.
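
The two headline metrics in the abstract, sensitivity and negative predictive value (NPV), follow directly from confusion-matrix counts. The snippet below is a minimal sketch of those standard definitions; the function name and the counts are hypothetical and chosen only for illustration, not taken from the study.

```python
def sensitivity_and_npv(tp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (sensitivity, NPV) from confusion-matrix counts.

    tp: MSI cases correctly predicted as MSI
    fn: MSI cases missed by the model
    tn: non-MSI cases correctly predicted as non-MSI
    """
    if tp + fn == 0 or tn + fn == 0:
        raise ValueError("counts must include at least one positive case and one negative call")
    sensitivity = tp / (tp + fn)  # share of true MSI cases that are detected
    npv = tn / (tn + fn)          # share of negative calls that are truly negative
    return sensitivity, npv


# Hypothetical counts for illustration only (not data from the paper).
sens, npv = sensitivity_and_npv(tp=99, fn=1, tn=800)
print(f"sensitivity={sens:.3f}, NPV={npv:.4f}")
```

A high NPV at high sensitivity is what makes a model usable as a rule-out test: a negative call can be trusted, so only predicted positives need confirmatory testing.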

Details

Original language: English
Article number: e4
Pages (from-to): 1650-1661
Number of pages: 13
Journal: Cancer Cell
Volume: 41
Issue number: 9
Publication status: Published - 11 Sept 2023
Peer-reviewed: Yes

External IDs

PubMedCentral: PMC10507381
Scopus: 85169513346

Keywords

  • Humans, Algorithms, Biomarkers, Biopsy, Microsatellite Instability, Colorectal Neoplasms/genetics