Transfer Learning for Domain-Specific Named Entity Recognition in German

Sunna Torge; Waldemar Hahn; René Jäkel

doi:10.1109/CiSt49399.2021.9357262

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Contributors

Sunna Torge, Center for Information Services and High Performance Computing (ZIH) (Author)
Waldemar Hahn, Center for Information Services and High Performance Computing (ZIH) (Author)
René Jäkel, Center for Information Services and High Performance Computing (ZIH) (Author)

Abstract

Automated text analysis, such as named entity recognition (NER), relies heavily on large amounts of high-quality training data. Transfer learning approaches aim to overcome the lack of domain-specific training data. In this paper, we investigate different transfer learning approaches for recognizing unknown domain-specific entities, including the influence of varying training data size. The experiments are based on the revised German SmartData Corpus and a baseline model trained on this corpus.

Details

Original language	English
Title of host publication	2020 6th IEEE Congress on Information Science and Technology (CiSt)
Publisher	Wiley-IEEE Press
Pages	321-327
Number of pages	7
ISBN (electronic)	9781728166469
ISBN (print)	978-1-7281-6647-6
Publication status	Published - 12 Jun 2021
Peer-reviewed	Yes

Conference

Title	2020 6th IEEE Congress on Information Science and Technology (CiSt)
Duration	5 - 12 June 2021
Location	Agadir - Essaouira, Morocco

External IDs

Scopus	85103860847
IEEE	10.1109/CiSt49399.2021.9357262
ORCID	/0000-0001-9756-6390/work/154741803

Keywords

Annotations, Data models, Text recognition, Training, Training data, Transfer learning, Vocabulary, NER, Named Entity Recognition System

Research Portal of the TU Dresden
