Smarter Self-distillation: Optimizing the Teacher for Surgical Video Applications
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Abstract
Surgical workflow analysis poses significant challenges due to complex imaging conditions, annotation ambiguities, and the large number of classes in tasks such as action recognition. Self-distillation (SD) has emerged as a promising technique to address these challenges by leveraging soft labels, but little is known about how to optimize the quality of these labels for surgical scene analysis. In this work, we thoroughly investigate this issue. First, we show that the quality of soft labels is highly sensitive to several design choices and that relying on a single top-performing teacher selected based on validation performance often leads to suboptimal results. Second, as a key technical innovation, we introduce a multi-teacher distillation strategy that ensembles checkpoints across seeds and epochs within a training phase where soft labels maintain an optimal balance, being neither underconfident nor overconfident. By ensembling at the teacher level rather than the student level, our approach reduces computational overhead during inference. Finally, we validate our approach on three benchmark datasets, where it demonstrates consistent improvements over existing SD methods. Notably, our method sets a new state of the art (SOTA) on the CholecTriplet benchmark, achieving a mean Average Precision (mAP) of 43.1% with real-time inference, thereby establishing a new standard for surgical video analysis in challenging and ambiguous environments. Code is available at https://github.com/IMSY-DKFZ/self-distilled-swin.
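The teacher-level ensembling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes a multi-label setting with per-class sigmoid outputs, uniform averaging over teacher checkpoints, and a binary cross-entropy student loss against the averaged soft labels. The function names `ensemble_soft_labels` and `soft_bce` are hypothetical, and how checkpoints are selected from the suitable training phase is not shown.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def ensemble_soft_labels(teacher_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class sigmoid probabilities over several teacher checkpoints.

    Each inner sequence holds one teacher's logits for a single sample.
    Uniform averaging is an assumption made for this sketch.
    """
    if not teacher_logits:
        raise ValueError("at least one teacher is required")
    n_classes = len(teacher_logits[0])
    sums = [0.0] * n_classes
    for logits in teacher_logits:
        if len(logits) != n_classes:
            raise ValueError("all teachers must predict the same classes")
        for c, z in enumerate(logits):
            sums[c] += sigmoid(z)
    return [s / len(teacher_logits) for s in sums]


def soft_bce(student_logits: Sequence[float], soft_targets: Sequence[float]) -> float:
    """Mean binary cross-entropy between student predictions and soft labels."""
    if len(student_logits) != len(soft_targets):
        raise ValueError("logits and targets must have the same length")
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(student_logits, soft_targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(soft_targets)


# Three hypothetical teacher checkpoints (different seeds or epochs), two classes.
teachers = [[2.0, -1.0], [1.5, -0.5], [2.5, -2.0]]
targets = ensemble_soft_labels(teachers)
loss = soft_bce([1.0, -1.0], targets)
```

Because the averaging happens once when the soft labels are generated, only the single student network is run at inference time, which is the computational advantage the abstract attributes to ensembling at the teacher level.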
Details
| Original language | English |
|---|---|
| Title of host publication | Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 |
| Editors | James C. Gee, Jaesung Hong, Carole H. Sudre, Polina Golland, Jinah Park, Daniel C. Alexander, Juan Eugenio Iglesias, Archana Venkataraman, Jong Hyo Kim |
| Publisher | Springer Science and Business Media B.V. |
| Pages | 522-531 |
| Number of pages | 10 |
| ISBN (electronic) | 978-3-032-05114-1 |
| ISBN (print) | 978-3-032-05113-4 |
| Publication status | Published - 2026 |
| Peer-reviewed | Yes |
Publication series
| Series | Lecture notes in computer science |
|---|---|
| Volume | 15968 LNCS |
| ISSN | 0302-9743 |
Conference
| Title | 28th International Conference on Medical Image Computing and Computer Assisted Intervention |
|---|---|
| Abbreviated title | MICCAI 2025 |
| Conference number | 28 |
| Duration | 23 - 27 September 2025 |
| Location | Daejeon Convention Center |
| City | Daejeon |
| Country | Korea, Republic of |
Keywords
- Self-Distillation, Soft labels optimization, Surgical Action Recognition