Comparing and Improving Active Learning Uncertainty Measures for Transformer Models by Discarding Outliers

Research output: Contribution to journal › Research article › Contributed › peer-review

Abstract

Despite achieving state-of-the-art results in nearly all Natural Language Processing applications, fine-tuning Transformer-encoder-based language models still requires a significant amount of labeled data to achieve satisfying results. A well-known technique to reduce the human effort in acquiring a labeled dataset is Active Learning (AL): an iterative process in which only a minimal number of samples is labeled. AL strategies require access to a quantified confidence measure of the model's predictions, and a common choice is the softmax activation function of the final Neural Network layer. In this paper, we compare eight alternatives on seven datasets and show that the softmax function provides misleading probabilities. Our finding is that most of the methods primarily identify hard-to-learn-from samples (commonly called outliers), resulting in worse-than-random performance, instead of samples that actually reduce the uncertainty of the learned language model. As a solution, this paper proposes Uncertainty-Clipping, a heuristic that systematically excludes such outlier samples, which results in improvements for most methods compared to the softmax function.
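The abstract describes Uncertainty-Clipping only at a high level. The following minimal Python sketch illustrates the general idea of discarding the most uncertain candidates (presumed outliers) before querying labels; the least-confidence measure, the function names, and the clip_fraction parameter are illustrative assumptions, not the authors' exact formulation.

# Hypothetical sketch of the Uncertainty-Clipping idea from the abstract:
# drop the most uncertain candidates as presumed outliers, then query
# the next-most-uncertain samples for labeling.
import numpy as np

def least_confidence(probs: np.ndarray) -> np.ndarray:
    """Uncertainty as 1 - max softmax probability per sample (an assumed measure)."""
    return 1.0 - probs.max(axis=1)

def select_with_uncertainty_clipping(probs, batch_size, clip_fraction=0.05):
    """Select the most uncertain samples after removing the top
    clip_fraction most uncertain ones (presumed outliers)."""
    uncertainty = least_confidence(probs)
    order = np.argsort(uncertainty)[::-1]   # most uncertain first
    n_clip = int(len(order) * clip_fraction)
    kept = order[n_clip:]                   # discard presumed outliers
    return kept[:batch_size]                # query the next-most-uncertain

# Example: softmax outputs for 6 unlabeled samples, 3 classes
probs = np.array([
    [0.34, 0.33, 0.33],  # near-uniform, extremely uncertain -> likely clipped
    [0.40, 0.35, 0.25],
    [0.50, 0.30, 0.20],
    [0.70, 0.20, 0.10],
    [0.90, 0.05, 0.05],
    [0.98, 0.01, 0.01],
])
print(select_with_uncertainty_clipping(probs, batch_size=2, clip_fraction=0.2))
# -> indices 1 and 2: sample 0 is clipped as a presumed outlier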

Details

Original language: English
Journal: Information Systems Frontiers
Publication status: E-pub ahead of print - 26 Jun 2024
Peer-reviewed: Yes

External IDs

ORCID /0000-0001-8107-2775/work/174431839
ORCID /0000-0002-5985-4348/work/174432432

Keywords

  • Active Learning, Calibration, Deep Neural Networks, Softmax, Transformer, Uncertainty