Investigating the Usage of Formulae in Mathematical Answer Retrieval

Publication: Contribution to book/conference proceedings/anthology/report; Contribution to conference proceedings; Contributed; Peer-reviewed

Contributors

Abstract

This work focuses on the task of Mathematical Answer Retrieval and studies the factors a recent Transformer-Encoder-based Language Model (LM) uses to assess the relevance of an answer to a given mathematical question. We investigate three factors: (1) the general influence of mathematical formulae, (2) the usage of structural information of those formulae, and (3) the overlap of variable names in answers and questions. The findings indicate that the LM for Mathematical Answer Retrieval mainly relies on shallow features such as the overlap of variables between question and answers. Furthermore, we identified a malicious shortcut in the training data that hinders the usage of structural information and, by removing this shortcut, improved the overall accuracy. We want to foster future research on how LMs are trained for Mathematical Answer Retrieval and provide a basic evaluation setup (link to repository: https://github.com/AnReu/math_analysis) for existing models.

Details

Original language: English
Title: Advances in Information Retrieval - 46th European Conference on Information Retrieval, ECIR 2024, Proceedings
Editors: Nazli Goharian, Nicola Tonellotto, Yulan He, Aldo Lipani, Graham McDonald, Craig Macdonald, Iadh Ounis
Publisher: Springer Science and Business Media B.V.
Pages: 247-261
Number of pages: 15
ISBN (electronic): 978-3-031-56027-9
ISBN (print): 978-3-031-56026-2
Publication status: Published - 2024
Peer-review status: Yes

Publication series

Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14608 LNCS
ISSN: 0302-9743

Conference

Title: 46th European Conference on Information Retrieval
Short title: ECIR 2024
Event number: 46
Duration: 24 - 28 March 2024
Location: Radisson Blu Hotel
City: Glasgow
Country: United Kingdom

Keywords

  • Mathematical Information Retrieval, Transformer-Encoders