Paths to Causality: Finding Informative Subgraphs Within Knowledge Graphs for Knowledge-Based Causal Discovery

Publication: Contribution to book/conference proceedings/anthology/report > Contribution to conference proceedings > Contributed > Peer-reviewed

Abstract

Inferring causal relationships between variable pairs is crucial for understanding multivariate interactions in complex systems. Knowledge-based causal discovery, which involves inferring causal relationships by reasoning over the metadata of variables (e.g., names or textual context), offers a compelling alternative to traditional methods that rely on observational data. However, existing methods using Large Language Models (LLMs) often produce unstable and inconsistent results, compromising their reliability for causal inference. To address this, we introduce a novel approach that integrates Knowledge Graphs (KGs) with LLMs to enhance knowledge-based causal discovery. Our approach identifies informative metapath-based subgraphs within KGs and further refines the selection of these subgraphs using Learning-to-Rank-based models. The top-ranked subgraphs are then incorporated into zero-shot prompts, improving the effectiveness of LLMs in inferring the causal relationship. Extensive experiments on biomedical and open-domain datasets demonstrate that our method outperforms most baselines by up to 44.4 points in F1 score, evaluated across diverse LLMs and KGs. Our code and datasets are available on GitHub: https://github.com/susantiyuni/path-to-causality
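
As a rough illustration of the pipeline the abstract describes (find subgraphs linking a variable pair, rank them, place the top-ranked ones in a zero-shot prompt), the following minimal Python sketch may help. It is not the authors' implementation; see their repository for that. Everything here is an assumption made for illustration: the toy triples, the function names, plain bounded-length path enumeration in place of metapath-based subgraph extraction, a shortest-path heuristic in place of a trained Learning-to-Rank model, and the prompt wording.

```python
from collections import deque

# Toy knowledge graph (hypothetical data): (head, relation, tail) triples.
TRIPLES = [
    ("smoking", "increases", "tar_exposure"),
    ("tar_exposure", "damages", "lung_tissue"),
    ("lung_tissue", "site_of", "lung_cancer"),
    ("smoking", "associated_with", "lung_cancer"),
    ("smoking", "co_occurs_with", "coffee"),
]


def find_paths(triples, source, target, max_hops=3):
    """Enumerate simple paths from source to target with at most max_hops edges.

    Simplified stand-in for metapath-based subgraph extraction: node and
    relation types are ignored and only path length is bounded.
    """
    adjacency = {}
    for head, rel, tail in triples:
        adjacency.setdefault(head, []).append((rel, tail))

    paths = []
    queue = deque([(source, [], {source})])
    while queue:
        node, path, seen = queue.popleft()
        if node == target and path:
            paths.append(path)
            continue
        if len(path) == max_hops:
            continue
        for rel, nxt in adjacency.get(node, []):
            if nxt not in seen:
                queue.append((nxt, path + [(node, rel, nxt)], seen | {nxt}))
    return paths


def score(path):
    """Placeholder heuristic that prefers shorter paths.

    A trained Learning-to-Rank model would replace this function.
    """
    return 1.0 / len(path)


def build_prompt(source, target, paths, top_k=2):
    """Put the top_k highest-scoring paths into a zero-shot prompt string."""
    ranked = sorted(paths, key=score, reverse=True)[:top_k]
    lines = [" -> ".join(f"{h} [{r}] {t}" for h, r, t in p) for p in ranked]
    context = "\n".join(f"- {line}" for line in lines)
    return (
        f"Knowledge graph context:\n{context}\n\n"
        f"Question: Does '{source}' cause '{target}'? Answer yes or no."
    )


if __name__ == "__main__":
    found = find_paths(TRIPLES, "smoking", "lung_cancer")
    print(build_prompt("smoking", "lung_cancer", found))
```

The resulting string would then be sent to an LLM of choice; that call is left out because it depends on the model and client library used.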

Details

Original language: English
Title: KDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 2778-2789
Number of pages: 12
ISBN (electronic): 979-8-4007-1454-2
Publication status: Published - 3 Aug 2025
Peer-review status: Yes

Conference

Title: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Short title: KDD 2025
Event number: 31
Duration: 3 - 7 August 2025
Venue: Toronto Convention Centre
City: Toronto
Country: Canada

External IDs

ORCID /0000-0001-5458-8645/work/193180546

Keywords

  • causal discovery, knowledge graphs, large language models