An ALBERT-based Similarity Measure for Mathematical Answer Retrieval

Publication: Contribution to book/conference proceedings/anthology/report; Contribution to conference proceedings; Contributed; Peer-reviewed

Contributors

Abstract

Mathematical Language Processing (MLP) deals with the automated processing and analysis of mathematical documents and relies heavily on good representations of mathematical symbols and texts. The aim of this work is to explore the modeling capabilities of state-of-the-art unsupervised deep learning methods to create such representations. To this end, we pre-trained different instances of an ALBERT model on Mathematics StackExchange data and fine-tuned them on the task of Mathematical Answer Retrieval. Our evaluation shows that ALBERT outperforms all previous systems and is on par with current state-of-the-art systems for math retrieval, indicating strong capabilities for modeling mathematical posts. This suggests that our approach can also benefit various other tasks in MLP, such as automatic proof checking or summarization of scientific texts.

Details

Original language: English
Title: SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 1593-1597
Number of pages: 5
ISBN (electronic): 978-1-4503-8037-9
Publication status: Published - 11 July 2021
Peer-review status: Yes

Publication series

Series: IR: Research and Development in Information Retrieval

Conference

Title: 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2021
Duration: 11 - 15 July 2021
City: Virtual, Online
Country: Canada

External IDs

Scopus 85111688215
ORCID /0000-0001-8107-2775/work/142253439

Keywords

  • information retrieval, mathematical language processing