Seeing More with Less: Video Capsule Endoscopy with Multi-task Learning

Julia Werner; Oliver Bause; Julius Oexle; Maxime Le Floch; Franz Brinkmann; Jochen Hampe; Oliver Bringmann

doi:10.1007/978-3-032-09569-5_2

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

Julia Werner, Eberhard Karls Universität Tübingen (Author)
Oliver Bause, Eberhard Karls Universität Tübingen (Author)
Julius Oexle, Eberhard Karls Universität Tübingen (Author)
Maxime Le Floch, Medizinische Klinik und Poliklinik I, Bereich Transfusionsmedizin, Else Kröner Fresenius Zentrum für Digitale Gesundheit (Author)
Franz Brinkmann, Medizinische Klinik und Poliklinik I, Else Kröner Fresenius Zentrum für Digitale Gesundheit (Author)
Jochen Hampe, Medizinische Klinik und Poliklinik I, Else Kröner Fresenius Zentrum für Digitale Gesundheit (Author)
Oliver Bringmann, Eberhard Karls Universität Tübingen (Author)

Abstract

Video capsule endoscopy has become increasingly important for investigating the small intestine within the gastrointestinal tract. However, the short battery lifetime of such compact sensor edge devices remains a persistent challenge. Integrating artificial intelligence can help overcome this limitation by enabling intelligent real-time decision-making, thereby reducing energy consumption and prolonging battery life. This remains difficult, however, because data are sparse and the device's limited resources restrict the overall model size. In this work, we introduce a multi-task neural network that combines precise self-localization within the gastrointestinal tract with the detection of anomalies in the small intestine in a single model. Throughout the development process, we consistently restricted the total number of parameters so that deploying such a model in a small capsule remains feasible. We report the first multi-task results on the recently published Galar dataset, integrating established multi-task methods and Viterbi decoding for subsequent time-series analysis. This approach outperforms current single-task models and represents a significant advance in AI-based approaches in this field. Our model achieves an accuracy of 93.63% on the localization task and an accuracy of 87.48% on the anomaly detection task. The approach requires only 1 million parameters while surpassing the current baselines.
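
The abstract mentions Viterbi decoding for the time-series analysis but does not give the details. The following is a minimal, generic sketch of how Viterbi decoding can smooth per-frame localization predictions; it is not the authors' implementation. The three states, the transition probabilities (which forbid moving backward through the tract), and the per-frame probabilities are all illustrative assumptions rather than values from the paper.

```python
import math


def viterbi(frame_probs, transition, initial):
    """Return the most likely state sequence for per-frame class probabilities.

    frame_probs: list of per-frame probability lists (one entry per state)
    transition:  transition[i][j] = P(state j at t+1 | state i at t)
    initial:     initial[i] = P(state i at the first frame)
    """
    def log(p):
        # Work in log space; a zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    n_states = len(initial)
    # score[i]: best log-probability of any path that ends in state i
    score = [log(initial[i]) + log(frame_probs[0][i]) for i in range(n_states)]
    backptr = []

    for probs in frame_probs[1:]:
        new_score, ptr = [], []
        for j in range(n_states):
            # Best predecessor for state j at this frame
            best_i = max(range(n_states),
                         key=lambda i: score[i] + log(transition[i][j]))
            new_score.append(score[best_i] + log(transition[best_i][j]) + log(probs[j]))
            ptr.append(best_i)
        score = new_score
        backptr.append(ptr)

    # Backtrack from the best final state
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical example: 3 ordered sections (0, 1, 2), no backward transitions.
transition = [[0.8, 0.2, 0.0],
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 1.0]]
initial = [1.0, 0.0, 0.0]
# Made-up classifier outputs; the frame at index 3 wrongly favors section 0.
frame_probs = [[0.8, 0.1, 0.1],
               [0.7, 0.2, 0.1],
               [0.2, 0.7, 0.1],
               [0.6, 0.3, 0.1],
               [0.1, 0.8, 0.1],
               [0.1, 0.1, 0.8],
               [0.1, 0.1, 0.8]]

smoothed = viterbi(frame_probs, transition, initial)
print(smoothed)  # [0, 0, 1, 1, 1, 2, 2]
```

In this toy sequence the per-frame argmax would jump back to section 0 at the fourth frame, whereas the decoded path stays in section 1, because the assumed transition matrix assigns zero probability to backward moves.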

Details

Original language	English
Title	Applications of Medical Artificial Intelligence
Editors	Shandong Wu, Behrouz Shabestari, Lei Xing
Publisher	Springer Science and Business Media B.V.
Pages	12-21
Number of pages	10
ISBN (electronic)	978-3-032-09569-5
ISBN (print)	978-3-032-09568-8
Publication status	Published - 2026
Peer-review status	Yes

Publication series

Series	Lecture Notes in Computer Science
Volume	16206 LNCS
ISSN	0302-9743

Workshop

Title	4th International Workshop on Applications of Medical Artificial Intelligence
Short title	AMAI 2025
Event number	4
Date	23 September 2025
Website	https://sites.google.com/view/amai2025/home
Venue	Daejeon Convention Center
City	Daejeon
Country	South Korea

External IDs

ORCID	/0000-0002-3474-3115/work/203813538
ORCID	/0000-0002-2421-6127/work/203813562
