PitVis-2023 challenge: Workflow recognition in videos of endoscopic pituitary surgery

Publication: Contribution to journal, Short survey/Review article, Contributed, Peer-reviewed

Contributors

  • Adrito Das, University College London (Author)
  • Danyal Z. Khan, University College London (Author)
  • Dimitrios Psychogyios, University College London (Author)
  • Yitong Zhang, University College London (Author)
  • John G. Hanrahan, University College London (Author)
  • Francisco Vasconcelos, University College London (Author)
  • You Pang, CAS - Chinese Academy of Sciences (Author)
  • Zhen Chen, CAS - Chinese Academy of Sciences (Author)
  • Jinlin Wu, CAS - Chinese Academy of Sciences (Author)
  • Xiaoyang Zou, Shanghai Jiao Tong University (Author)
  • Guoyan Zheng, Shanghai Jiao Tong University (Author)
  • Abdul Qayyum, Imperial College London (Author)
  • Moona Mazher, University College London (Author)
  • Imran Razzak, University of New South Wales (Author)
  • Tianbin Li, Shanghai Artificial Intelligence Laboratory (Author)
  • Jin Ye, Shanghai Artificial Intelligence Laboratory (Author)
  • Junjun He, Shanghai Artificial Intelligence Laboratory (Author)
  • Szymon Płotka, University of Amsterdam, Amsterdam University Medical Centers (UMC), Sano Centre for Computational Medicine, Jagiellonian University in Kraków (Author)
  • Joanna Kaleta, Sano Centre for Computational Medicine (Author)
  • Amine Yamlahi, Deutsches Krebsforschungszentrum (DKFZ) (Author)
  • Antoine Jund, Deutsches Krebsforschungszentrum (DKFZ) (Author)
  • Patrick Godau, Deutsches Krebsforschungszentrum (DKFZ), Universität Heidelberg (Author)
  • Satoshi Kondo, Muroran Institute of Technology (Author)
  • Satoshi Kasai, Niigata University of Health and Welfare (Author)
  • Kousuke Hirasawa, Konica Minolta Inc (Author)
  • Dominik Rivoir, Nationales Centrum für Tumorerkrankungen Dresden, Exzellenzcluster CeTI: Zentrum für Taktiles Internet (Author)
  • Stefanie Speidel, Nationales Centrum für Tumorerkrankungen Dresden, Exzellenzcluster CeTI: Zentrum für Taktiles Internet (Author)
  • Alejandra Pérez, Universidad de los Andes Colombia (Author)
  • Santiago Rodriguez, Universidad de los Andes Colombia (Author)
  • Pablo Arbeláez, Universidad de los Andes Colombia (Author)
  • Danail Stoyanov, University College London (Author)
  • Hani J. Marcus, University College London (Author)
  • Sophia Bano, University College London (Author)

Abstract

The field of computer vision applied to videos of minimally invasive surgery is ever-growing. Workflow recognition pertains to the automated recognition of various aspects of a surgery, including which surgical steps are performed and which surgical instruments are used. This information can later be used to assist clinicians when learning the surgery or during live surgery. The Pituitary Vision (PitVis) 2023 Challenge tasks the community with step and instrument recognition in videos of endoscopic pituitary surgery. This is a particularly challenging task when compared to other minimally invasive surgeries due to the smaller working space, which limits and distorts vision, and the higher frequency of instrument and step switching, which requires more precise model predictions. Participants were provided with 25 videos, with results presented at the MICCAI-2023 conference as part of the Endoscopic Vision 2023 Challenge in Vancouver, Canada, on 8 October 2023. There were 18 submissions from 9 teams across 6 countries, using a variety of deep learning models. The top-performing model for step recognition utilised a transformer-based architecture, uniquely using an autoregressive decoder with a positional encoding input. The top-performing model for instrument recognition utilised a spatial encoder followed by a temporal encoder, which uniquely used a 2-layer temporal architecture. In both cases, these models outperformed purely spatial models, illustrating the importance of sequential and temporal information. PitVis-2023 therefore demonstrates that state-of-the-art computer vision models in minimally invasive surgery are transferable to a new dataset. Benchmark results are provided in the paper, and the dataset is publicly available at: https://doi.org/10.5522/04/26531686.
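
The abstract's point that temporal information helps beyond purely spatial, frame-by-frame prediction can be illustrated with a toy sketch. This is not any challenge team's method: it swaps in a simple sliding-window majority vote as a stand-in for a learned temporal model. The function names (`majority_smooth`, `frame_accuracy`) and the label sequences are hypothetical and made up for illustration only.

```python
from collections import Counter


def majority_smooth(labels, window=5):
    """Replace each frame label with the most common label in a centred window."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half): i + half + 1]
        # Counter.most_common breaks ties by first occurrence in the window.
        smoothed.append(Counter(neighbourhood).most_common(1)[0][0])
    return smoothed


def frame_accuracy(predicted, truth):
    """Fraction of frames whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical toy data (not from the PitVis dataset):
# ground-truth step per frame, and noisy predictions from a per-frame classifier.
truth = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
spatial_only = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

smoothed = majority_smooth(spatial_only, window=3)
print(frame_accuracy(spatial_only, truth), frame_accuracy(smoothed, truth))
```

In this toy case the isolated single-frame errors are corrected by looking at neighbouring frames. A fixed window like this would also blur genuinely short steps, which is one reason the frequent step and instrument switching described above calls for more expressive temporal models.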

Details

Original language: English
Article number: 103716
Journal: Medical Image Analysis
Volume: 106
Publication status: Published - Dec. 2025
Peer-reviewed: Yes

External IDs

ORCID /0000-0002-4590-1908/work/190134676

Keywords

  • Endoscopic vision, Instrument recognition, Step recognition, Surgical AI, Surgical vision, Workflow analysis