GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Abstract

Pretrained Language Models (PLMs) benefit from external knowledge stored in graph structures for various downstream tasks. However, bridging the modality gap between graph structures and text remains a significant challenge. Traditional methods like linearizing graphs for PLMs lose vital graph connectivity, whereas Graph Neural Networks (GNNs) require cumbersome processes for integration into PLMs. In this work, we propose a novel graph-guided self-attention mechanism, GraSAME. GraSAME seamlessly incorporates token-level structural information into PLMs without necessitating additional alignment or concatenation efforts. As an end-to-end, lightweight multimodal module, GraSAME follows a multi-task learning strategy and effectively bridges the gap between graph and textual modalities, facilitating dynamic interactions between GNNs and PLMs. Our experiments on the graph-to-text generation task demonstrate that GraSAME outperforms baseline models and achieves results comparable to state-of-the-art (SOTA) models on WebNLG datasets. Furthermore, compared to SOTA models, GraSAME eliminates the need for extra pre-training tasks to adjust graph inputs and reduces the number of trainable parameters by over 100 million.
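The abstract describes, at a high level, injecting token-level graph structure directly into the self-attention computation. The paper itself specifies the exact formulation; purely as an illustration, the sketch below shows one common way to condition self-attention on graph structure: masking the attention scores with a token-level adjacency mask derived from the graph. The function name, tensor shapes, and masking strategy here are assumptions for illustration, not GraSAME's actual implementation.

    # Illustrative sketch only (PyTorch): graph-masked self-attention.
    # None of these names come from the paper.
    import torch
    import torch.nn.functional as F

    def graph_guided_attention(q, k, v, adj_mask):
        # q, k, v:   (batch, heads, seq_len, head_dim)
        # adj_mask:  (batch, seq_len, seq_len) boolean; True where token i
        #            is allowed to attend to token j according to the graph.
        #            Each row needs at least one True (e.g., self-loops),
        #            otherwise the softmax yields NaNs.
        d = q.size(-1)
        scores = torch.matmul(q, k.transpose(-2, -1)) / d ** 0.5
        # Disallow attention between token pairs not connected in the graph.
        scores = scores.masked_fill(~adj_mask.unsqueeze(1), float("-inf"))
        weights = F.softmax(scores, dim=-1)
        return torch.matmul(weights, v)

    # Toy usage: 1 sequence, 2 heads, 4 tokens, self-loops only.
    b, h, n, d = 1, 2, 4, 8
    q, k, v = (torch.randn(b, h, n, d) for _ in range(3))
    adj = torch.eye(n, dtype=torch.bool).unsqueeze(0).expand(b, n, n)
    out = graph_guided_attention(q, k, v, adj)  # shape: (1, 2, 4, 8)

Masking (or additively biasing) attention scores this way keeps the module lightweight, since it adds no separate alignment or concatenation step between graph and text encodings, which is in the spirit of the design goals stated in the abstract.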

Details

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics: NAACL 2024
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Place of publication: Mexico City, Mexico
Publisher: Association for Computational Linguistics (ACL)
Pages: 920–933
Number of pages: 14
ISBN (electronic): 979-8-89176-119-3
Publication status: Published - Jun 2024
Peer-reviewed: Yes

Conference

Title: 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Abbreviated title: NAACL 2024
Duration: 16–21 June 2024
Location: Hilton Reforma Mexico City
City: Mexico City
Country: Mexico

External IDs

Scopus: 85197895877

Keywords

Research priority areas of TU Dresden