Transformer-Encoder and Decoder Models for Questions on Math

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › Peer-reviewed

Abstract

This work summarizes our submission to ARQMath-3. We pre-trained Transformer-encoder-based language models for mathematical answer retrieval and employed a Transformer-decoder model to generate answers to questions from the mathematical domain. Compared with our submission to ARQMath-2, we improved the performance of our models on all three metrics nDCG’, mAP’ and p’@10 through refined pre-training and enlarged fine-tuning data. We improved our p’@10 results even further by additionally fine-tuning on annotated test data from ARQMath-2. In summary, our findings confirm that Transformer-based models benefit from domain-adaptive pre-training in the mathematical domain.
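The abstract describes the approach only at a high level. As a rough illustration of the two-stage recipe it mentions (domain-adaptive masked-language-model pre-training followed by fine-tuning for answer retrieval), the sketch below shows what such a pipeline could look like with the Hugging Face transformers library. The base checkpoint, the toy corpus, the cross-encoder relevance formulation and all hyperparameters are illustrative assumptions, not the authors' actual setup.

```python
# Rough sketch (not the authors' code): domain-adaptive masked-language-model
# pre-training on mathematical text, then fine-tuning for answer retrieval.
# Model name, corpus, and hyperparameters are illustrative assumptions.
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    DataCollatorForLanguageModeling,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# --- Stage 1: domain-adaptive pre-training via masked language modelling ---
mlm_model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
math_corpus = [
    "Let $x^2 + 1 = 0$; then $x = \\pm i$.",        # placeholder math posts
    "The derivative of $\\sin x$ is $\\cos x$.",
]
batch = collator([tokenizer(text, truncation=True) for text in math_corpus])
mlm_loss = mlm_model(**batch).loss  # minimize this loss over the math corpus

# --- Stage 2: fine-tuning a question/answer relevance model for retrieval ---
# In practice one would start from the checkpoint saved after Stage 1.
ranker = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=2
)
question = "How do I solve $x^2 + 1 = 0$?"
answer = "Rewrite it as $x^2 = -1$, so $x = \\pm i$."
encoded = tokenizer(question, answer, truncation=True, return_tensors="pt")
labels = torch.tensor([1])  # 1 = relevant answer, 0 = irrelevant
ranking_loss = ranker(**encoded, labels=labels).loss  # minimize over Q/A pairs

# At retrieval time, candidate answers are ranked by the model's relevance
# score and evaluated with the ARQMath metrics nDCG', mAP' and p'@10.
```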

Details

Original language: English
Title of host publication: CLEF 2022 Working Notes
Editors: Guglielmo Faggioli, Nicola Ferro, Allan Hanbury, Martin Potthast
Pages: 119-137
Number of pages: 19
Publication status: Published - 2022
Peer-reviewed: Yes

Publication series

Series: CEUR Workshop Proceedings
Volume: 3180
ISSN: 1613-0073

Conference

Title: 13th Conference and Labs of the Evaluation Forum
Subtitle: Information Access Evaluation meets Multilinguality, Multimodality, and Visualization
Abbreviated title: CLEF 2022
Conference number: 13
Duration: 5 - 8 September 2022
Location: Università di Bologna
City: Bologna
Country: Italy

External IDs

ORCID /0000-0001-8107-2775/work/194824074

Keywords

  • Information Retrieval
  • Mathematical Language Processing
  • Transformer-based Models