Combining Self-Retrieval-Augmented Generation with Divide-and-Conquer for Language Model-based Knowledge Base Construction
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review
Abstract
Constructing knowledge bases from language models (LMs) without external retrieval presents unique challenges. We present a hybrid, LM-only system for the LM-KBC 2025 challenge [1], which requires constructing knowledge bases with a fixed model (Qwen3-8B), without fine-tuning or external retrieval. Our method combines Self-RAG for general relations with a divide-and-conquer module specialized for awardWonBy. Self-RAG follows a describe-then-extract design with strict output specifications (names-only or one-number-only) to reduce reliance on brittle post-hoc cleaning; numeric answers are normalized to a canonical digit form. The divide-and-conquer module aggregates candidates from constrained, names-only subqueries and filters them with a strict name validator. Evaluation uses the organizers’ official string-matching metric. On the hidden test leaderboard, our system ranks 2nd of 5 participants and improves macro-F1 from 0.212 (baseline) to 0.405 (+0.194 absolute; approximately +91.5% relative), with large gains on companyTradesAtStockExchange (+0.339), personHasCityOfDeath (+0.330), and countryLandBordersCountry (+0.162).
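The abstract does not give implementation details, so the following is only a minimal Python sketch of how the post-processing steps it names could look: normalizing a numeric answer to a canonical digit form, validating candidate names strictly, and aggregating candidates across names-only subqueries. The function names (`normalize_number`, `is_valid_name`, `aggregate_candidates`), the particle list, and every validation rule are assumptions made for illustration, not the authors' code.

```python
import re
from typing import Iterable, List, Optional

# Assumption: numbers appear as plain digits or with comma thousands separators.
_NUMBER = re.compile(r"\d{1,3}(?:,\d{3})+|\d+")

# Assumption: a name is made of letters joined by spaces, periods,
# apostrophes, or hyphens, with an optional trailing period.
_NAME = re.compile(r"[^\W\d_]+(?:[ .'\-]+[^\W\d_]+)*\.?")

# Hypothetical list of lowercase particles allowed inside names.
PARTICLES = {"van", "von", "de", "da", "del", "der", "di", "la", "le", "bin", "al"}


def normalize_number(text: str) -> Optional[str]:
    """Return the first number in `text` as canonical digits, or None."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    # Drop thousands separators and leading zeros.
    return str(int(match.group(0).replace(",", "")))


def is_valid_name(text: str, max_tokens: int = 6) -> bool:
    """Strict check: reject anything that does not look like a bare name."""
    candidate = text.strip()
    if not candidate or not _NAME.fullmatch(candidate):
        return False
    tokens = candidate.split()
    if len(tokens) > max_tokens:
        return False
    # Every token must be capitalized or be a known particle.
    return all(tok[0].isupper() or tok.lower() in PARTICLES for tok in tokens)


def aggregate_candidates(subquery_outputs: Iterable[str]) -> List[str]:
    """Merge one-name-per-line outputs, keeping valid, unseen names in order."""
    seen = set()
    merged: List[str] = []
    for output in subquery_outputs:
        for line in output.splitlines():
            name = line.strip().lstrip("-* ").strip()
            key = name.casefold()
            if is_valid_name(name) and key not in seen:
                seen.add(key)
                merged.append(name)
    return merged


if __name__ == "__main__":
    outputs = [
        "Marie Curie\nPierre Curie",
        "MARIE CURIE\nI don't know\nAlbert Einstein",
    ]
    print(aggregate_candidates(outputs))
    print(normalize_number("about 1,234 members"))
```

In this sketch the validator rejects free-text replies such as refusals, which is one way to avoid cleaning them up after the fact; the real system may use different rules.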
Details
| Original language | English |
|---|---|
| Title of host publication | KBC-LM Workshop and LM-KBC Challenge at ISWC 2025 |
| Editors | Simon Razniewski, Jan-Christoph Kalo, Duygu Islakoğlu, Tuan-Phong Nguyen, Bohui Zhang |
| Number of pages | 20 |
| Publication status | Published - 2025 |
| Peer-reviewed | Yes |
Publication series
| Series | CEUR Workshop Proceedings |
|---|---|
| Volume | 4041 |
| ISSN | 1613-0073 |
Other
| Title | 4th challenge on Knowledge Base Construction from Pre-trained Language Models |
|---|---|
| Abbreviated title | LM-KBC 2025 |
| Conference number | 4 |
| Description | co-located with the 24th International Semantic Web Conference (ISWC 2025) |
| Duration | 2 November 2025 |
| Location | Nara Prefectural Convention Center |
| City | Nara |
| Country | Japan |
External IDs
| ORCID | /0000-0002-5410-218X/work/194826582 |
|---|---|
Keywords
- Divide-and-Conquer, Knowledge base construction, Language models, LM-KBC, Self-RAG