Embedded Named Entity Recognition using Probing Classifiers

Research output: Contribution to book/Conference proceedings/Anthology/Report > Conference contribution > Contributed > peer-review

Abstract

Streaming text generation has become a common way of increasing the responsiveness of language-model-powered applications such as chat assistants. At the same time, extracting semantic information from generated text is a useful tool for applications such as automated fact checking or retrieval-augmented generation. Currently, this requires either separate models during inference, which increases computational cost, or destructive fine-tuning of the language model. Instead, we propose an approach called EMBER, which enables streaming named entity recognition in decoder-only language models without fine-tuning them and while incurring minimal additional computational cost at inference time. Specifically, our experiments show that EMBER maintains high token generation rates, with only a negligible decrease in speed of around 1%, compared to a 43.64% slowdown measured for a baseline. We make our code and data available online, including a toolkit for training, testing, and deploying efficient token classification models optimized for streaming text generation.
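As a rough illustration of the general idea of a probing classifier, the sketch below attaches a small linear classifier to the per-token hidden states of a frozen generator and emits a label as each token is streamed. It is a toy under stated assumptions, not EMBER's actual architecture or toolkit: the two-label tag set, the `toy_generate` stand-in for a decoder, the `LinearProbe` class, and the hand-picked weights are all hypothetical.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple

# Hypothetical two-label tag set: "O" (outside) and "ENT" (entity).
LABELS = ["O", "ENT"]


def toy_generate() -> Iterator[Tuple[str, List[float]]]:
    """Stand-in for a frozen decoder: yields (token, hidden_state) per step.

    A real model would produce high-dimensional hidden states; these
    two-dimensional vectors are made up for illustration.
    """
    steps = [
        ("Alice", [0.9, 0.1]),
        ("visited", [0.1, 0.8]),
        ("Paris", [0.8, 0.2]),
    ]
    for token, hidden in steps:
        yield token, hidden


class LinearProbe:
    """Linear classifier that reads hidden states; the generator is untouched."""

    def __init__(self, weights: Sequence[Sequence[float]], bias: Sequence[float]):
        self.weights = weights  # one weight row per label
        self.bias = bias

    def predict(self, hidden: Sequence[float]) -> str:
        # Score each label as a dot product plus bias, then take the argmax.
        scores = [
            sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return LABELS[scores.index(max(scores))]


def stream_with_entities(
    generator: Iterable[Tuple[str, List[float]]], probe: LinearProbe
) -> Iterator[Tuple[str, str]]:
    """Yield (token, label) pairs as tokens arrive, one probe call per token."""
    for token, hidden in generator:
        yield token, probe.predict(hidden)


# Hand-picked weights for the toy data; a real probe would be trained.
probe = LinearProbe(weights=[[-1.0, 1.0], [1.0, -1.0]], bias=[0.0, 0.0])
tagged = list(stream_with_entities(toy_generate(), probe))
print(tagged)
```

Because the probe only reads hidden states that the generator already computes, the added cost per token in this sketch is one small matrix-vector product, and the generator's parameters are never updated.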

Details

Original language: English
Title of host publication: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Place of Publication: Miami, Florida, USA
Publisher: Association for Computational Linguistics (ACL)
Pages: 17830-17850
Number of pages: 21
ISBN (electronic): 9798891761643
Publication status: Published - Nov 2024
Peer-reviewed: Yes

Conference

Title: 2024 Conference on Empirical Methods in Natural Language Processing
Abbreviated title: EMNLP 2024
Duration: 12 - 16 November 2024
Degree of recognition: International event
Location: Hyatt Regency Miami Hotel & Online
City: Miami
Country: United States of America

External IDs

Scopus: 85217785401