Metrics reloaded: recommendations for image analysis validation

Collaborators; Robert Haase

doi:10.1038/s41592-023-02151-z

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

Collaborators (Author)
Robert Haase, Core Facility Bio-image Analysis, Clusters of Excellence PoL: Physics of Life, Center for Systems Biology Dresden (CSBD), Leipzig University (Author)

German Cancer Research Center (DKFZ)
Heidelberg University
Goethe University Frankfurt a.M.
Frankfurt Cancer Institute
Imperial College London
University of Duisburg-Essen
Masaryk University
University of Bern
SimulaMet
University of Tromsø – The Arctic University of Norway
University College London
King's College London (KCL)
Universidad de Buenos Aires
McGill University
Indiana University-Purdue University Indianapolis
University of Pennsylvania
Holon Institute of Technology
European Federation for Medical Informatics
KU Leuven
IT University of Copenhagen
Broad Institute of Harvard University and MIT
University of Oxford
National Cancer Institute (NCI)
Ciudad Autónoma de Buenos Aires
Pompeu Fabra University
University of Adelaide
Fraunhofer Institute for Digital Medicine
Radboud University Nijmegen
Princess Margaret Cancer Centre
University of Toronto
Vector Institute
Université de Rennes 1
INSERM - Institut national de la santé et de la recherche médicale
Max Delbrück Center for Molecular Medicine (MDC)
University of Potsdam
Friedrich-Alexander University Erlangen-Nürnberg
Institute of Image-Guided Surgery
Alphabet Inc.
Helmholtz Artificial Intelligence Cooperation Unit
European Molecular Biology Laboratory (EMBL) Heidelberg
Stony Brook University
Vanderbilt University

Abstract

Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint—a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.
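The abstract's point that a chosen metric may not reflect the domain interest can be illustrated with a minimal sketch. It is not taken from the paper or from the Metrics Reloaded online tool: the toy data and the helper names (`confusion_counts`, `accuracy`, `dice`) are assumptions made for this illustration. It compares pixel-level accuracy with the Dice similarity coefficient on a heavily imbalanced segmentation in which the prediction misses the target structure entirely.

```python
import math


def confusion_counts(reference, prediction):
    """Count TP, FP, FN, TN for two equally long binary pixel lists."""
    tp = sum(1 for r, p in zip(reference, prediction) if r == 1 and p == 1)
    fp = sum(1 for r, p in zip(reference, prediction) if r == 0 and p == 1)
    fn = sum(1 for r, p in zip(reference, prediction) if r == 1 and p == 0)
    tn = sum(1 for r, p in zip(reference, prediction) if r == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(reference, prediction):
    """Fraction of pixels labelled correctly, background included."""
    tp, fp, fn, tn = confusion_counts(reference, prediction)
    return (tp + tn) / (tp + fp + fn + tn)


def dice(reference, prediction):
    """Dice similarity coefficient: 2*TP / (2*TP + FP + FN).

    Returns NaN when both masks are empty, because the score is
    undefined in that case (an illustrative choice, not a standard).
    """
    tp, fp, fn, _ = confusion_counts(reference, prediction)
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return math.nan
    return 2 * tp / denominator


# Hypothetical toy image: 100 pixels, only 2 belong to the target structure.
reference = [1, 1] + [0] * 98
# A prediction that labels every pixel as background.
prediction = [0] * 100

print("accuracy:", accuracy(reference, prediction))
print("dice:", dice(reference, prediction))
```

Under these assumptions the all-background prediction is 98% accurate yet has a Dice score of 0, which is why a metric that ignores the properties of the target structure can give a misleading picture of performance.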

Details

Original language	English
Pages (from-to)	195-212
Number of pages	18
Journal	Nature Methods
Volume	21
Issue number	2
Publication status	Published - Feb 2024
Peer-reviewed	Yes

External IDs

PubMed	38347141

Keywords

Algorithms, Image Processing, Computer-Assisted, Semantics, Machine Learning

Research Portal of the TU Dresden