Metrics reloaded: recommendations for image analysis validation

Collaborators; Robert Haase

doi:10.1038/s41592-023-02151-z

Publication: Contribution to journal › Research article › Contributed › Peer-reviewed

Contributors

Collaborators (Author)
Robert Haase, Core Facility Bio-Bildanalyse, Exzellenzcluster PoL: Physik des Lebens, Zentrum für Systembiologie Dresden (CSBD), Universität Leipzig (Author)

Deutsches Krebsforschungszentrum (DKFZ)
Universität Heidelberg
Johann Wolfgang Goethe-Universität Frankfurt am Main
Frankfurt Cancer Institute
Imperial College London
Universität Duisburg-Essen
Masaryk University
Universität Bern
SimulaMet
University of Tromsø – The Arctic University of Norway
University College London
King's College London (KCL)
Universidad de Buenos Aires
McGill University
Indiana University-Purdue University Indianapolis
University of Pennsylvania
Holon Institute of Technology
European Federation for Medical Informatics
KU Leuven
IT University of Copenhagen
Broad Institute of Harvard University and MIT
University of Oxford
National Cancer Institute (NCI)
Ciudad Autónoma de Buenos Aires
Universitat Pompeu Fabra
University of Adelaide
Fraunhofer-Institut für Digitale Medizin MEVIS
Radboud University Nijmegen
Princess Margaret Cancer Centre
University of Toronto
Vector Institute
Université de Rennes 1
INSERM - Institut national de la santé et de la recherche médicale
Max-Delbrück-Centrum für Molekulare Medizin (MDC)
Universität Potsdam
Friedrich-Alexander-Universität Erlangen-Nürnberg
Institute of Image-Guided Surgery
Alphabet Inc.
Helmholtz Artificial Intelligence Cooperation Unit
European Molecular Biology Laboratory (EMBL) Heidelberg
Stony Brook University
Vanderbilt University

Abstract

Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint—a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.

Details

Original language	English
Pages (from - to)	195-212; 13 unnumbered
Number of pages	30
Journal	Nature Methods
Volume	21
Issue number	2
Publication status	Published - 12 Feb. 2024
Peer-review status	Yes

External IDs

PubMed	38347141

Keywords

Algorithms; Image Processing, Computer-Assisted; Semantics; Machine Learning
