Embedded Named Entity Recognition using Probing Classifiers
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Abstract
Streaming text generation has become a common way of increasing the responsiveness of language-model-powered applications such as chat assistants. At the same time, extracting semantic information from generated text is a useful tool for applications such as automated fact checking or retrieval-augmented generation. Currently, this requires either separate models during inference, which increases computational cost, or destructive fine-tuning of the language model. Instead, we propose an approach called EMBER, which enables streaming named entity recognition in decoder-only language models without fine-tuning them and while incurring minimal additional computational cost at inference time. Specifically, our experiments show that EMBER maintains high token generation rates, with only a negligible decrease in speed of around 1% compared to a 43.64% slowdown measured for a baseline. We make our code and data available online, including a toolkit for training, testing, and deploying efficient token classification models optimized for streaming text generation.
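The abstract does not spell out the mechanism, but the title points to probing classifiers: small classifiers trained on the internal representations of a frozen model. The following is a minimal sketch of that general idea under stated assumptions, not the EMBER implementation. The toy generator, the two-label scheme, the hand-set weights, and all function and class names are hypothetical stand-ins chosen for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical label set: "ENT" marks an entity token, "O" everything else.
LABELS = ["O", "ENT"]


def toy_generate() -> Iterator[Tuple[str, List[float]]]:
    """Stand-in for a frozen decoder-only LM.

    Yields one (token, hidden_state) pair per generation step. The hidden
    states are hand-made 2-d vectors, not real model activations.
    """
    steps = [
        ("Alice", [0.9, 0.1]),
        ("visited", [0.1, 0.8]),
        ("Paris", [0.8, 0.2]),
        (".", [0.0, 0.9]),
    ]
    for token, hidden in steps:
        yield token, hidden


class LinearProbe:
    """A linear classifier applied to a single hidden state.

    In practice the weights would be learned from labeled data while the
    language model stays frozen; here they are set by hand (assumption).
    """

    def __init__(self, weights: List[List[float]], bias: List[float]) -> None:
        self.weights = weights  # one weight row per label
        self.bias = bias        # one bias term per label

    def predict(self, hidden: List[float]) -> str:
        scores = [
            sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return LABELS[scores.index(max(scores))]


def stream_with_labels(
    steps: Iterable[Tuple[str, List[float]]], probe: LinearProbe
) -> Iterator[Tuple[str, str]]:
    """Label each token as soon as it is generated, reusing its hidden state."""
    for token, hidden in steps:
        yield token, probe.predict(hidden)


probe = LinearProbe(weights=[[-1.0, 1.0], [1.0, -1.0]], bias=[0.0, 0.0])
tagged = list(stream_with_labels(toy_generate(), probe))
for token, label in tagged:
    print(token, label)
```

The point of the sketch is that the probe reuses a hidden state the model has already computed for each generated token, so the extra work per token is a single small matrix-vector product rather than a second model pass.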
Details
Original language | English |
---|---|
Title of host publication | Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing |
Editors | Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen |
Place of Publication | Miami, Florida, USA |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 17830-17850 |
Number of pages | 21 |
ISBN (electronic) | 9798891761643 |
Publication status | Published - Nov 2024 |
Peer-reviewed | Yes |
Conference
Title | 2024 Conference on Empirical Methods in Natural Language Processing |
---|---|
Abbreviated title | EMNLP 2024 |
Duration | 12 - 16 November 2024 |
Degree of recognition | International event |
Location | Hyatt Regency Miami Hotel & Online |
City | Miami |
Country | United States of America |
External IDs
Scopus | 85217785401 |
---|---|