Seeing More with Less: Video Capsule Endoscopy with Multi-task Learning

Julia Werner; Oliver Bause; Julius Oexle; Maxime Le Floch; Franz Brinkmann; Jochen Hampe; Oliver Bringmann

doi:10.1007/978-3-032-09569-5_2

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Julia Werner, University of Tübingen (Author)
Oliver Bause, University of Tübingen (Author)
Julius Oexle, University of Tübingen (Author)
Maxime Le Floch, Department of Internal Medicine I, Division Transfusion Medicine, Else Kröner Fresenius Center for Digital Health (Author)
Franz Brinkmann, Department of Internal Medicine I, Else Kröner Fresenius Center for Digital Health (Author)
Jochen Hampe, Department of Internal Medicine I, Else Kröner Fresenius Center for Digital Health (Author)
Oliver Bringmann, University of Tübingen (Author)

Abstract

Video capsule endoscopy has become increasingly important for investigating the small intestine within the gastrointestinal tract. However, the short battery lifetime of such compact sensor edge devices remains a persistent challenge. Integrating artificial intelligence can help overcome this limitation by enabling intelligent real-time decision-making, thereby reducing energy consumption and prolonging battery life. This remains challenging, however, because data are sparse and the device's limited resources restrict the overall model size. In this work, we introduce a multi-task neural network that combines precise self-localization within the gastrointestinal tract with the detection of anomalies in the small intestine in a single model. Throughout the development process, we consistently restricted the total number of parameters so that the model remains feasible to deploy in a small capsule. We report the first multi-task results on the recently published Galar dataset, integrating established multi-task methods and Viterbi decoding for subsequent time-series analysis. The resulting model outperforms current single-task models and represents a significant advance in AI-based approaches in this field. It achieves an accuracy of 93.63% on the localization task and an accuracy of 87.48% on the anomaly detection task. The approach requires only 1 million parameters while surpassing the current baselines.
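
The abstract mentions Viterbi decoding for time-series analysis of the per-frame predictions. The following is a minimal sketch of that general technique, not the authors' implementation: it assumes three hypothetical location classes (0, 1, 2), a hand-written left-to-right transition matrix, and made-up per-frame probabilities. The function name `viterbi_decode` and all numeric values are illustrative assumptions.

```python
import math


def viterbi_decode(frame_probs, transition, initial):
    """Return the most likely label sequence for per-frame class probabilities.

    frame_probs: list of per-frame probability lists (e.g. softmax outputs)
    transition:  transition[i][j] = P(next label is j | current label is i)
    initial:     initial[i] = P(first label is i)
    """
    eps = 1e-12  # floor that avoids log(0) for forbidden transitions

    def log(p):
        return math.log(max(p, eps))

    n_states = len(initial)
    # score[s] = best log-probability of any path that ends in state s
    score = [log(initial[s]) + log(frame_probs[0][s]) for s in range(n_states)]
    backpointers = []

    for probs in frame_probs[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor state for landing in state j at this frame
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log(transition[i][j]))
            new_score.append(score[best_i] + log(transition[best_i][j]) + log(probs[j]))
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical example: the capsule can only stay in place or move forward.
transition = [[0.9, 0.1, 0.0],
              [0.0, 0.9, 0.1],
              [0.0, 0.0, 1.0]]
initial = [1.0, 0.0, 0.0]
frame_probs = [[0.8, 0.1, 0.1],
               [0.7, 0.2, 0.1],
               [0.2, 0.1, 0.7],   # isolated outlier frame favoring class 2
               [0.6, 0.3, 0.1],
               [0.2, 0.7, 0.1],
               [0.1, 0.8, 0.1]]

raw_labels = [max(range(3), key=lambda s: p[s]) for p in frame_probs]
smoothed_labels = viterbi_decode(frame_probs, transition, initial)
```

In this toy input, the frame-wise argmax jumps to class 2 at the third frame and back again, whereas the decoded sequence stays in class 0 until the evidence for class 1 accumulates. In practice the transition probabilities would be estimated or tuned rather than written by hand.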

Details

Original language	English
Title of host publication	Applications of Medical Artificial Intelligence
Editors	Shandong Wu, Behrouz Shabestari, Lei Xing
Publisher	Springer Science and Business Media B.V.
Pages	12-21
Number of pages	10
ISBN (electronic)	978-3-032-09569-5
ISBN (print)	978-3-032-09568-8
Publication status	Published - 2026
Peer-reviewed	Yes

Publication series

Series	Lecture Notes in Computer Science
Volume	16206 LNCS
ISSN	0302-9743

Workshop

Title	4th International Workshop on Applications of Medical Artificial Intelligence
Abbreviated title	AMAI 2025
Conference number	4
Duration	23 September 2025
Website	https://sites.google.com/view/amai2025/home
Location	Daejeon Convention Center
City	Daejeon
Country	Korea, Republic of

External IDs

ORCID	/0000-0002-3474-3115/work/203813538
ORCID	/0000-0002-2421-6127/work/203813562

Research Portal of the TU Dresden