Metrics reloaded: recommendations for image analysis validation

Publication: Contribution to journal, research article, contributed, peer-reviewed

Contributors

  • Deutsches Krebsforschungszentrum (DKFZ)
  • Universität Heidelberg
  • Johann Wolfgang Goethe-Universität Frankfurt am Main
  • Frankfurt Cancer Institute
  • Imperial College London
  • Universität Duisburg-Essen
  • Masaryk University
  • Universität Bern
  • SimulaMet
  • University of Tromsø – The Arctic University of Norway
  • University College London
  • King's College London (KCL)
  • Universidad de Buenos Aires
  • McGill University
  • Indiana University-Purdue University Indianapolis
  • University of Pennsylvania
  • Holon Institute of Technology
  • European Federation for Medical Informatics
  • KU Leuven
  • IT University of Copenhagen
  • Broad Institute of Harvard University and MIT
  • University of Oxford
  • National Cancer Institute (NCI)
  • Ciudad Autónoma de Buenos Aires
  • Pompeu Fabra University
  • University of Adelaide
  • Fraunhofer Institute for Digital Medicine
  • Radboud University Nijmegen
  • Princess Margaret Cancer Centre
  • University of Toronto
  • Vector Institute
  • Université de Rennes 1
  • INSERM - Institut national de la santé et de la recherche médicale
  • Max-Delbrück-Centrum für Molekulare Medizin (MDC)
  • Universität Potsdam
  • Friedrich-Alexander-Universität Erlangen-Nürnberg
  • Institute of Image-Guided Surgery
  • Alphabet Inc.
  • Helmholtz AI
  • European Molecular Biology Laboratory (EMBL) Heidelberg
  • Stony Brook University
  • Vanderbilt University

Abstract

Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint—a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.

Details

Original language: English
Pages (from - to): 195-212
Number of pages: 18
Journal: Nature Methods
Volume: 21
Issue number: 2
Publication status: Published - Feb. 2024
Peer-reviewed: Yes

External IDs

PubMed: 38347141

Keywords

  • Algorithms; Image Processing, Computer-Assisted; Semantics; Machine Learning