Combining Self-Retrieval-Augmented Generation with Divide-and-Conquer for Language Model-based Knowledge Base Construction
Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › Peer-reviewed
Contributors
Abstract
Constructing knowledge bases from language models (LMs) without external retrieval presents unique challenges. We present a hybrid, LM-only system for the LM-KBC 2025 challenge [1], which requires constructing knowledge bases using a fixed model (Qwen3-8B) without fine-tuning or external retrieval. Our method combines Self-RAG for general relations with a divide-and-conquer module specialized for awardWonBy. Self-RAG follows a two-stage design, description first and extraction second, with strict output specifications (names-only or one-number-only) to reduce reliance on brittle post-hoc cleaning; numeric answers are normalized to a canonical digit form. The divide-and-conquer module aggregates candidates from constrained, names-only subqueries and filters them with a strict name validator. Evaluation uses the organizers’ official string-matching metric. On the hidden test leaderboard, our system places 2nd out of 5 participants and improves macro-F1 from 0.212 (baseline) to 0.405 (+0.194 absolute; approximately +91.5% relative), with large gains on companyTradesAtStockExchange (+0.339), personHasCityOfDeath (+0.330), and countryLandBordersCountry (+0.162).
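The post-processing steps named in the abstract (canonical digit normalization, aggregation over names-only subqueries, and a strict name validator) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the validator rules, the rejected placeholder strings, and the `ask` callable standing in for a Qwen3-8B call are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from typing import Callable, Iterable, Optional

# Hypothetical placeholder answers to reject; the real validator rules are not given.
REJECTED = {"none", "unknown", "n/a"}


def normalize_number(text: str) -> Optional[str]:
    """Return the first integer in `text` as plain digits, e.g. '1,234' -> '1234'."""
    match = re.search(r"-?\d[\d,]*", text)
    if match is None:
        return None
    return match.group(0).replace(",", "")


def is_valid_name(text: str) -> bool:
    """Assumed 'strict' validator: short, contains a letter, no sentence-like punctuation."""
    text = text.strip()
    if not text or len(text) > 80:
        return False
    if any(ch in text for ch in ":;?!"):
        return False
    if text.lower() in REJECTED:
        return False
    return any(ch.isalpha() for ch in text)


def divide_and_conquer(subqueries: Iterable[str], ask: Callable[[str], str]) -> list:
    """Ask each subquery, keep validated names, and deduplicate case-insensitively."""
    seen = {}
    for query in subqueries:
        for line in ask(query).splitlines():
            candidate = line.strip(" \t-*")  # drop list bullets and padding
            if is_valid_name(candidate):
                # Keep the first spelling seen for each case-folded name.
                seen.setdefault(candidate.casefold(), candidate)
    return list(seen.values())


if __name__ == "__main__":
    # Canned responses stand in for the language model (invented example data).
    canned = {
        "winners 1990-1999": "- Alice Example\n- Bob Sample\nNone",
        "winners 2000-2009": "Bob Sample\nCarol Placeholder\nWinners: see list",
    }
    print(divide_and_conquer(canned, canned.__getitem__))
    print(normalize_number("about 1,234 people"))
```

Splitting a large answer set such as awardWonBy into smaller constrained subqueries keeps each response short, and the validator then discards lines that look like commentary rather than names before the candidates are merged.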
Details
| Original language | English |
|---|---|
| Title | KBC-LM Workshop and LM-KBC Challenge at ISWC 2025 |
| Editors | Simon Razniewski, Jan-Christoph Kalo, Duygu Islakoğlu, Tuan-Phong Nguyen, Bohui Zhang |
| Number of pages | 20 |
| Publication status | Published - 2025 |
| Peer-reviewed | Yes |
Publication series
| Series | CEUR Workshop Proceedings |
|---|---|
| Volume | 4041 |
| ISSN | 1613-0073 |
Other
| Title | 4th challenge on Knowledge Base Construction from Pre-trained Language Models |
|---|---|
| Abbreviated title | LM-KBC 2025 |
| Conference number | 4 |
| Description | co-located with the 24th International Semantic Web Conference (ISWC 2025) |
| Duration | 2 November 2025 |
| Website | |
| Location | Nara Prefectural Convention Center |
| City | Nara |
| Country | Japan |
External IDs
| ORCID | /0000-0002-5410-218X/work/194826582 |
|---|---|
Keywords
ASJC Scopus subject areas
Keywords
- Divide-and-Conquer, Knowledge base construction, Language models, LM-KBC, Self-RAG