GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism

Shuzhou Yuan; Michael Färber

doi:10.18653/v1/2024.findings-naacl.58

Publication: Contribution to book/conference proceedings/anthology/report › Contribution to conference proceedings › Contributed › Peer-reviewed

Contributors

Shuzhou Yuan, Chair of Data Science (ScaDS.AI Dresden/Leipzig), Chair of Scalable Software Architectures for Data Analytics (ScaDS.AI Dresden/Leipzig), Karlsruhe Institute of Technology (author)
Michael Färber, Chair of Scalable Software Architectures for Data Analytics (ScaDS.AI Dresden/Leipzig), Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI Dresden), Karlsruhe Institute of Technology (author)

Abstract

Pretrained Language Models (PLMs) benefit from external knowledge stored in graph structures for various downstream tasks. However, bridging the modality gap between graph structures and text remains a significant challenge. Traditional methods like linearizing graphs for PLMs lose vital graph connectivity, whereas Graph Neural Networks (GNNs) require cumbersome processes for integration into PLMs. In this work, we propose a novel graph-guided self-attention mechanism, GraSAME. GraSAME seamlessly incorporates token-level structural information into PLMs without necessitating additional alignment or concatenation efforts. As an end-to-end, lightweight multimodal module, GraSAME follows a multi-task learning strategy and effectively bridges the gap between graph and textual modalities, facilitating dynamic interactions between GNNs and PLMs. Our experiments on the graph-to-text generation task demonstrate that GraSAME outperforms baseline models and achieves results comparable to state-of-the-art (SOTA) models on WebNLG datasets. Furthermore, compared to SOTA models, GraSAME eliminates the need for extra pre-training tasks to adjust graph inputs and reduces the number of trainable parameters by over 100 million.
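
The abstract does not spell out how the graph signal enters self-attention, so the following is only a minimal sketch of one possible reading, not the authors' architecture. It assumes that each token is a graph node, that a single round of mean neighbor aggregation stands in for a GNN layer, and that the aggregated vectors act as attention queries over the original token states. The function names `mean_aggregate` and `graph_guided_attention`, the toy hidden states, and the adjacency matrix are all hypothetical.

```python
import math


def mean_aggregate(hidden, adjacency):
    """Average each token's vector with its graph neighbors (stand-in for a GNN layer)."""
    aggregated = []
    for i, row in enumerate(adjacency):
        # Include the token itself so isolated nodes keep their own vector.
        neighbors = [j for j, connected in enumerate(row) if connected or j == i]
        dim = len(hidden[i])
        aggregated.append(
            [sum(hidden[j][d] for j in neighbors) / len(neighbors) for d in range(dim)]
        )
    return aggregated


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def graph_guided_attention(hidden, adjacency):
    """Scaled dot-product attention whose queries come from the graph-aggregated states.

    Assumption: keys and values are the unmodified token states; no learned
    projections are used, to keep the sketch short.
    """
    queries = mean_aggregate(hidden, adjacency)
    scale = math.sqrt(len(hidden[0]))
    outputs, weights = [], []
    for query in queries:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in hidden]
        row_weights = softmax(scores)
        weights.append(row_weights)
        outputs.append(
            [
                sum(row_weights[j] * hidden[j][d] for j in range(len(hidden)))
                for d in range(len(query))
            ]
        )
    return outputs, weights


# Toy input: three tokens with 2-dimensional states, connected as a chain 0-1-2.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
output, weights = graph_guided_attention(hidden, adjacency)
```

In this sketch the graph affects only the queries, so the token sequence needs no extra alignment or concatenation step, which is in the spirit of the abstract's description. A trained model would add learned projections and a real GNN layer.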

Details

Original language	English
Title	Findings of the Association for Computational Linguistics
Editors	Kevin Duh, Helena Gomez, Steven Bethard
Place of publication	Mexico City, Mexico
Publisher	Association for Computational Linguistics (ACL)
Pages	920–933
Number of pages	14
ISBN (electronic)	979-889176119-3
Publication status	Published - June 2024
Peer-review status	Yes

Conference

Title	2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Short title	NAACL 2024
Duration	16 - 21 June 2024
Website	https://2024.naacl.org/
Venue	Hilton Reforma Mexico City
City	Mexico City
Country	Mexico

External IDs

Scopus	85197895877

Research portal of TU Dresden
