TU_DBS in the ARQMath Lab 2021, CLEF
Publication: Contribution to book/conference report/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed
Contributors
Abstract
Mathematical Information Retrieval (MIR) deals with the task of finding relevant documents that contain text and mathematical formulas. Retrieval systems therefore need to process not only natural language but also mathematical and scientific notation. This work reviews our team's participation in the ARQMath 2021 Lab, where two approaches, based on ALBERT and ColBERT, were applied to a Question Answer Retrieval task and a Formula Similarity task. The ALBERT-based classification approach achieved competitive results on the first task. We found that pre-training on data separated into chunks of text and formulas made the model perform better on formula data. This way of pre-training could also be beneficial for the Formula Search task.
Details
| Original language | English |
|---|---|
| Title | Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 21st - to - 24th, 2021 |
| Pages | 107-124 |
| Number of pages | 18 |
| Publication status | Published - 2021 |
| Peer-review status | Yes |
Publication series
| Series | CEUR Workshop Proceedings |
|---|---|
| Volume | 2936 |
| ISSN | 1613-0073 |
Conference
| Title | 12th Conference and Labs of the Evaluation Forum |
|---|---|
| Short title | CLEF 2021 |
| Event number | 12 |
| Duration | 21 - 24 September 2021 |
| Location | Online |
| City | Bucharest |
| Country | Romania |
External IDs
| Scopus | 85113430502 |
|---|---|
| ORCID | /0000-0001-8107-2775/work/142253437 |
| gvk | 181323857X |
Subject headings
ASJC Scopus subject areas
Keywords
- BERT-based models, Information retrieval, Mathematical language processing