Investigating the Usage of Formulae in Mathematical Answer Retrieval
Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed
Contributors
Abstract
This work focuses on the task of Mathematical Answer Retrieval and studies the factors that a recent Transformer-Encoder-based Language Model (LM) uses to assess the relevance of an answer to a given mathematical question. We investigate three main factors: (1) the general influence of mathematical formulae, (2) the usage of structural information in those formulae, and (3) the overlap of variable names between answers and questions. The findings indicate that the LM for Mathematical Answer Retrieval relies mainly on shallow features such as the overlap of variables between question and answers. Furthermore, we identified a malicious shortcut in the training data that hinders the usage of structural information; removing this shortcut improved the overall accuracy. We want to foster future research on how LMs are trained for Mathematical Answer Retrieval and provide a basic evaluation setup for existing models (repository: https://github.com/AnReu/math_analysis).
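To make the notion of a shallow "variable overlap" feature concrete, the following is a minimal sketch of how such an overlap between a question and an answer could be measured. It is not the authors' evaluation code: the function names `variables` and `variable_overlap` are hypothetical, and the sketch assumes that formulae are delimited by single `$` signs, that variables are single Latin letters, and that overlap is measured as Jaccard similarity.

```python
import re

# Assumption: inline formulae are written as $...$ (no $$...$$ display math).
FORMULA = re.compile(r"\$(.+?)\$")
# LaTeX commands such as \sqrt or \frac are removed so their letters
# are not mistaken for variable names.
COMMAND = re.compile(r"\\[A-Za-z]+")
# Assumption: a variable is a single Latin letter.
LETTER = re.compile(r"[A-Za-z]")


def variables(text):
    """Collect single-letter variable names from all $...$ formulae in text."""
    names = set()
    for formula in FORMULA.findall(text):
        stripped = COMMAND.sub(" ", formula)
        names.update(LETTER.findall(stripped))
    return names


def variable_overlap(question, answer):
    """Jaccard overlap of the variable sets; 0.0 if either text has none."""
    q, a = variables(question), variables(answer)
    if not q or not a:
        return 0.0
    return len(q & a) / len(q | a)


question = r"Solve $x^2 + y = 0$ for $x$."
same_names = r"Then $x = \sqrt{-y}$."
renamed = r"Then $a = \sqrt{-b}$."

print(variable_overlap(question, same_names))
print(variable_overlap(question, renamed))
```

In this sketch the two answers have the same formula structure and differ only in variable names, yet they receive very different overlap scores. A model that leans on this kind of feature would treat them differently, which is the sort of shallow behaviour the abstract describes.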
Details
Original language | English |
---|---|
Title | Advances in Information Retrieval - 46th European Conference on Information Retrieval, ECIR 2024, Proceedings |
Editors | Nazli Goharian, Nicola Tonellotto, Yulan He, Aldo Lipani, Graham McDonald, Craig Macdonald, Iadh Ounis |
Publisher | Springer Science and Business Media B.V. |
Pages | 247-261 |
Number of pages | 15 |
ISBN (electronic) | 978-3-031-56027-9 |
ISBN (print) | 978-3-031-56026-2 |
Publication status | Published - 2024 |
Peer-review status | Yes |
Publication series
Series | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 14608 LNCS |
ISSN | 0302-9743 |
Conference
Title | 46th European Conference on Information Retrieval |
---|---|
Short title | ECIR 2024 |
Event number | 46 |
Duration | 24 - 28 March 2024 |
Venue | Radisson Blu Hotel |
City | Glasgow |
Country | United Kingdom |
Keywords
- Mathematical Information Retrieval, Transformer-Encoders