Seeing More with Less: Video Capsule Endoscopy with Multi-task Learning

Research output: Contribution to book/Conference proceedings/Anthology/Report, Conference contribution, Contributed, peer-review

Abstract

Video capsule endoscopy has become increasingly important for investigating the small intestine within the gastrointestinal tract. A persistent challenge, however, is the short battery lifetime of such compact sensor edge devices. Integrating artificial intelligence can help overcome this limitation by enabling intelligent real-time decision-making, thereby reducing energy consumption and prolonging battery life. This remains difficult because data are sparse and the limited resources of the device restrict the overall model size. In this work, we introduce a multi-task neural network that combines precise self-localization within the gastrointestinal tract and anomaly detection in the small intestine in a single model. Throughout the development process, we consistently restricted the total number of parameters so that the model remains feasible to deploy in a small capsule. We report the first multi-task results on the recently published Galar dataset, integrating established multi-task methods and Viterbi decoding for subsequent time-series analysis. The resulting model outperforms current single-task models and represents a significant advance in AI-based approaches in this field. It achieves an accuracy of 93.63% on the localization task and 87.48% on the anomaly detection task while requiring only 1 million parameters.
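The abstract names Viterbi decoding as the time-series step but does not describe how it is set up. The following is a minimal, generic sketch of Viterbi decoding applied to per-frame class probabilities, not the authors' implementation. The three section indices, the forward-only transition matrix, and all probability values are hypothetical and chosen only to illustrate how decoding can correct an isolated frame that appears to move backwards along the tract.

```python
import math

def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(emissions, transitions, prior):
    """Return the most likely state sequence for per-frame probabilities.

    emissions[t][s]   probability the classifier assigns to state s at frame t
    transitions[i][j] probability of moving from state i to state j
    prior[s]          probability of starting in state s
    """
    n = len(prior)
    # best[s]: best log-score of any path that ends in state s at the current frame
    best = [safe_log(prior[s]) + safe_log(emissions[0][s]) for s in range(n)]
    backpointers = []
    for frame in emissions[1:]:
        new_best, pointers = [], []
        for j in range(n):
            # Choose the predecessor that maximises the score of reaching j.
            i_best = max(range(n), key=lambda i: best[i] + safe_log(transitions[i][j]))
            pointers.append(i_best)
            new_best.append(best[i_best] + safe_log(transitions[i_best][j]) + safe_log(frame[j]))
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda s: best[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical setup: three consecutive sections (0, 1, 2); the capsule may
# stay in a section or advance to the next one, but never move backwards.
transitions = [
    [0.7, 0.3, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]
prior = [1.0, 0.0, 0.0]

# Hypothetical per-frame classifier outputs; frame 3 looks like section 0 again.
frames = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
]

greedy = [max(range(3), key=lambda s: f[s]) for f in frames]
decoded = viterbi(frames, transitions, prior)
print(greedy)   # [0, 0, 1, 0, 1, 2]
print(decoded)  # [0, 0, 1, 1, 1, 2]
```

In this toy example the per-frame argmax jumps back to section 0 at frame 3, while the decoded sequence stays monotone because the assumed transition matrix assigns zero probability to backward moves.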

Details

Original language: English
Title of host publication: Applications of Medical Artificial Intelligence
Editors: Shandong Wu, Behrouz Shabestari, Lei Xing
Publisher: Springer Science and Business Media B.V.
Pages: 12-21
Number of pages: 10
ISBN (electronic): 978-3-032-09569-5
ISBN (print): 978-3-032-09568-8
Publication status: Published - 2026
Peer-reviewed: Yes

Publication series

Series: Lecture Notes in Computer Science
Volume: 16206 LNCS
ISSN: 0302-9743

Workshop

Title: 4th International Workshop on Applications of Medical Artificial Intelligence
Abbreviated title: AMAI 2025
Conference number: 4
Duration: 23 September 2025
Location: Daejeon Convention Center
City: Daejeon
Country: Korea, Republic of

External IDs

ORCID /0000-0002-3474-3115/work/203813538
ORCID /0000-0002-2421-6127/work/203813562

Keywords

  • Multi-Task Learning, Video Capsule Endoscopy, Viterbi decoding