TU_DBS in the ARQMath Lab 2021, CLEF
Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed
Contributors
Abstract
Mathematical Information Retrieval (MIR) deals with the task of finding relevant documents that contain text and mathematical formulas. Retrieval systems should therefore be able to process not only natural language but also mathematical and scientific notation. The goal of this work is to review the participation of our team in the ARQMath 2021 Lab, where two different approaches based on ALBERT and ColBERT were applied to a Question Answer Retrieval task and a Formula Similarity task. The ALBERT-based classification approach achieved competitive results for the first task. We found that by pre-training on data separated into chunks of text and formulas, the model performed better on formula data. This way of pre-training could also be beneficial for the Formula Search task.
Details
Original language | English |
---|---|
Title | Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 21st - to - 24th, 2021 |
Pages | 107-124 |
Number of pages | 18 |
Volume | 2936 |
Publication status | Published - 2021 |
Peer-review status | Yes |
Publication series
Series | CEUR Workshop Proceedings |
---|---|
Volume | 2936 |
ISSN | 1613-0073 |
Conference
Title | 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021 |
---|---|
Duration | 21 - 24 September 2021 |
City | Virtual, Bucharest |
Country | Romania |
External IDs
Scopus | 85113430502 |
---|---|
ORCID | /0000-0001-8107-2775/work/142253437 |
Keywords
ASJC Scopus subject areas
Keywords
- BERT-based models, Information retrieval, Mathematical language processing