Comparing and Improving Active Learning Uncertainty Measures for Transformer Models

Publication: Contribution to book/conference proceedings/anthology/report, Contribution to conference proceedings, Contributed, Peer-reviewed

Contributors

Abstract

Despite achieving state-of-the-art results in nearly all Natural Language Processing applications, fine-tuning Transformer-encoder-based language models still requires a significant amount of labeled data to achieve satisfactory results. A well-known technique for reducing the human effort of acquiring a labeled dataset is Active Learning (AL): an iterative process in which only a minimal number of samples is labeled. AL strategies require access to a quantified confidence measure of the model's predictions. A common choice is the softmax activation function of the final neural network layer. In this paper we compare eight alternatives on seven datasets and show that the softmax function provides misleading probabilities. Our finding is that most of the methods primarily identify hard-to-learn-from samples (outliers) rather than samples that reduce the uncertainty of the learned language model, which results in worse-than-random performance. As a solution, this paper proposes a heuristic to systematically exclude samples, which improves various methods relative to the softmax function.
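To make the role of the softmax output in AL concrete, the following is a minimal sketch of three widely used softmax-based uncertainty scores (least confidence, margin, entropy) and a top-k query step. It is an illustration under stated assumptions, not the paper's implementation: the function names, the toy logits, and the choice of these three scores are hypothetical and are not claimed to match the eight alternatives evaluated in the paper.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def least_confidence(probs):
    # 1 minus the probability of the predicted class; higher = more uncertain.
    return 1.0 - probs.max(axis=1)

def margin(probs):
    # Negative gap between the two most probable classes; higher = more uncertain.
    top2 = np.sort(probs, axis=1)[:, -2:]
    return -(top2[:, 1] - top2[:, 0])

def entropy(probs):
    # Shannon entropy of the predicted distribution; higher = more uncertain.
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def select_batch(scores, batch_size):
    # Indices of the unlabeled samples with the highest uncertainty scores.
    return np.argsort(-scores, kind="stable")[:batch_size]

# Toy logits for three unlabeled samples and three classes (illustrative only).
logits = np.array([
    [4.0, 0.0, 0.0],  # one class clearly dominates
    [1.0, 0.9, 0.0],  # two classes nearly tied
    [0.2, 0.1, 0.0],  # close to uniform
])
probs = softmax(logits)
picked = select_batch(entropy(probs), batch_size=1)
print(picked)
```

All three scores rank samples purely by the shape of the softmax distribution, so a sample can be queried simply because it is an outlier. That is the failure mode the abstract describes.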

Details

Original language: English
Title: Advances in Databases and Information Systems - 27th European Conference, ADBIS 2023, Proceedings
Editors: Alberto Abelló, Oscar Romero, Panos Vassiliadis, Robert Wrembel
Publisher: Springer Science and Business Media B.V.
Pages: 119-132
Number of pages: 14
ISBN (Print): 9783031429132
Publication status: Published - 2023
Peer-review status: Yes

Publication series

Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13985 LNCS
ISSN: 0302-9743

Conference

Title: 27th European Conference on Advances in Databases and Information Systems, ADBIS 2023
Duration: 4 - 7 September 2023
City: Barcelona
Country: Spain

Keywords

  • Active Learning, Calibration, Deep Neural Networks, Softmax, Transformer, Uncertainty