Paths to Causality: Finding Informative Subgraphs Within Knowledge Graphs for Knowledge-Based Causal Discovery

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Abstract

Inferring causal relationships between variable pairs is crucial for understanding multivariate interactions in complex systems. Knowledge-based causal discovery, which involves inferring causal relationships by reasoning over the metadata of variables (e.g., names or textual context), offers a compelling alternative to traditional methods that rely on observational data. However, existing methods using Large Language Models (LLMs) often produce unstable and inconsistent results, compromising their reliability for causal inference. To address this, we introduce a novel approach that integrates Knowledge Graphs (KGs) with LLMs to enhance knowledge-based causal discovery. Our approach identifies informative metapath-based subgraphs within KGs and further refines the selection of these subgraphs using Learning-to-Rank-based models. The top-ranked subgraphs are then incorporated into zero-shot prompts, improving the effectiveness of LLMs in inferring the causal relationship. Extensive experiments on biomedical and open-domain datasets demonstrate that our method outperforms most baselines by up to 44.4 points in F1 scores, evaluated across diverse LLMs and KGs. Our code and datasets are available on GitHub at https://github.com/susantiyuni/path-to-causality.
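The pipeline described in the abstract (extract candidate subgraphs, rank them, place the top-ranked ones in a zero-shot prompt) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy triples, the function names, and the `INFORMATIVE_RELATIONS` set are all hypothetical. Plain path enumeration stands in for metapath-based subgraph extraction, a hand-written scorer stands in for the Learning-to-Rank model, and no LLM is called.

```python
from collections import deque

# Toy knowledge graph as (head, relation, tail) triples.
# All entities and relations are invented for illustration only.
TRIPLES = [
    ("smoking", "increases", "tar_exposure"),
    ("tar_exposure", "damages", "lung_tissue"),
    ("lung_tissue", "site_of", "lung_cancer"),
    ("smoking", "associated_with", "coffee"),
    ("coffee", "studied_in", "lung_cancer"),
]

# Hypothetical stand-in for a learned ranker: relations assumed to be informative.
INFORMATIVE_RELATIONS = {"increases", "damages", "site_of"}


def find_paths(triples, source, target, max_hops=3):
    """Enumerate simple directed paths from source to target (breadth-first)."""
    adjacency = {}
    for head, rel, tail in triples:
        adjacency.setdefault(head, []).append((rel, tail))
    paths = []
    queue = deque([(source, [], {source})])
    while queue:
        node, path, visited = queue.popleft()
        if node == target and path:
            paths.append(path)
            continue
        if len(path) >= max_hops:
            continue
        for rel, nxt in adjacency.get(node, []):
            if nxt not in visited:
                queue.append((nxt, path + [(node, rel, nxt)], visited | {nxt}))
    return paths


def score_path(path):
    """Placeholder score: fraction of edges whose relation is in the informative set."""
    hits = sum(1 for _, rel, _ in path if rel in INFORMATIVE_RELATIONS)
    return hits / len(path)


def render_path(path):
    """Turn a list of triples into one readable line of text."""
    text = path[0][0]
    for _, rel, tail in path:
        text += f" -[{rel}]-> {tail}"
    return text


def build_prompt(var_a, var_b, ranked_paths):
    """Assemble a zero-shot prompt that includes the top-ranked paths as context."""
    context = "\n".join(render_path(p) for p in ranked_paths)
    return (
        f"Knowledge graph context:\n{context}\n\n"
        f"Question: Does '{var_a}' cause '{var_b}'? Answer yes or no."
    )


paths = find_paths(TRIPLES, "smoking", "lung_cancer")
top_paths = sorted(paths, key=score_path, reverse=True)[:1]
prompt = build_prompt("smoking", "lung_cancer", top_paths)
print(prompt)
```

In a real system the scorer would be a trained ranking model and the prompt would be sent to an LLM; the sketch only shows how the three stages connect.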

Details

Original language: English
Title of host publication: KDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 2778-2789
Number of pages: 12
ISBN (electronic): 979-8-4007-1454-2
Publication status: Published - 3 Aug 2025
Peer-reviewed: Yes

Conference

Title: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Abbreviated title: KDD 2025
Conference number: 31
Duration: 3 - 7 August 2025
Location: Toronto Convention Centre
City: Toronto
Country: Canada

External IDs

ORCID /0000-0001-5458-8645/work/193180546

Keywords

  • causal discovery, knowledge graphs, large language models