TU_DBS in the ARQMath Lab 2021, CLEF
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Abstract
Mathematical Information Retrieval (MIR) deals with the task of finding relevant documents that contain text and mathematical formulas. Retrieval systems should therefore be able to process not only natural language but also mathematical and scientific notation. This work reviews our team's participation in the ARQMath 2021 Lab, where two approaches, based on ALBERT and ColBERT, were applied to a Question Answer Retrieval task and a Formula Similarity task. The ALBERT-based classification approach achieved competitive results on the first task. We found that pre-training on data separated into chunks of text and formulas improved the model's performance on formula data. This way of pre-training could also be beneficial for the Formula Search task.
Details
Original language | English |
---|---|
Title of host publication | Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 21st - to - 24th, 2021 |
Pages | 107-124 |
Number of pages | 18 |
Volume | 2936 |
Publication status | Published - 2021 |
Peer-reviewed | Yes |
Publication series
Series | CEUR Workshop Proceedings |
---|---|
Volume | 2936 |
ISSN | 1613-0073 |
Conference
Title | 2021 Working Notes of CLEF - Conference and Labs of the Evaluation Forum, CLEF-WN 2021 |
---|---|
Duration | 21 - 24 September 2021 |
City | Virtual, Bucharest |
Country | Romania |
External IDs
Scopus | 85113430502 |
---|---|
ORCID | /0000-0001-8107-2775/work/142253437 |
Keywords
- BERT-based models, Information retrieval, Mathematical language processing