From web tables to concepts: A semantic normalization approach

Publication: Contribution to book/conference proceedings/anthology/report; Contribution to conference proceedings; Contributed; Peer-reviewed

Abstract

Relational Web tables, embedded in HTML or published on data platforms, have become an important resource for many applications, such as question answering and entity augmentation. To utilize the data, we require some understanding of what the tables are about. Previous research on recovering Web table semantics has largely focused on simple tables that describe only a single semantic concept. However, there is also a significant number of de-normalized multi-concept tables on the Web. Treating these as single-concept tables results in many incorrect relations being extracted. In this paper, we propose a normalization approach to decompose multi-concept tables into smaller single-concept tables. First, we identify columns that represent keys or identifiers of entities. Then, we utilize the table schema as well as intrinsic data correlations to identify concept boundaries and split the tables accordingly. Experimental results on real Web tables show that our approach is feasible and effectively identifies semantic concepts.
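
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea (find a key column, then use data correlations to split off a second concept). It is not the authors' method. It assumes that a key is simply a column with all-distinct values and that "intrinsic data correlations" can be approximated by single-column functional dependencies; the helper names `holds_fd`, `is_unique`, and `decompose`, as well as the sample table, are hypothetical.

```python
def holds_fd(rows, lhs, rhs):
    """Return True if each value of column lhs maps to exactly one value of rhs."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


def is_unique(rows, col):
    """Return True if no value of column col repeats (a naive key test)."""
    values = [row[col] for row in rows]
    return len(set(values)) == len(values)


def decompose(rows, columns):
    """Split a table into a main table plus one table per detected sub-concept.

    Assumption: the first all-distinct column is the key of the main concept.
    A repeating non-key column that functionally determines other columns is
    treated as the key of a separate concept.
    """
    key = next((c for c in columns if is_unique(rows, c)), None)
    if key is None:
        raise ValueError("no column with all-distinct values found")
    remaining = [c for c in columns if c != key]
    moved, sub_tables = set(), []
    for c in remaining:
        if c in moved or is_unique(rows, c):
            continue  # unique columns trivially determine everything; skip them
        deps = [d for d in remaining
                if d != c and d not in moved and holds_fd(rows, c, d)]
        if deps:
            moved.update(deps)
            cols = [c] + deps
            # Project onto the concept's columns and drop duplicate rows.
            projected = sorted({tuple(row[x] for x in cols) for row in rows})
            sub_tables.append((cols, projected))
    main_cols = [key] + [c for c in remaining if c not in moved]
    main = [tuple(row[x] for x in main_cols) for row in rows]
    return [(main_cols, main)] + sub_tables


# Hypothetical de-normalized table mixing a "city" and a "country" concept.
columns = ["city", "population", "country", "continent"]
rows = [
    {"city": "Berlin", "population": 3.6, "country": "Germany", "continent": "Europe"},
    {"city": "Munich", "population": 1.5, "country": "Germany", "continent": "Europe"},
    {"city": "Paris", "population": 2.1, "country": "France", "continent": "Europe"},
    {"city": "Osaka", "population": 2.7, "country": "Japan", "continent": "Asia"},
]
tables = decompose(rows, columns)
for cols, data in tables:
    print(cols, data)
```

On this sample, the sketch keeps `city`, `population`, and `country` in the main table and moves `continent` into a separate `country` table. Real Web tables are noisy, so exact functional dependencies would likely need to be relaxed to approximate ones in practice.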

Details

Original language: English
Title: Conceptual Modeling
Editors: Óscar Pastor López, Mong Li Lee, Stephen W. Liddle, Paul Johannesson, Andreas L. Opdahl
Publisher: Springer-Verlag
Pages: 247-260
Number of pages: 14
ISBN (electronic): 978-3-319-25264-3
ISBN (print): 978-3-319-25263-6
Publication status: Published - 2015
Peer-review status: Yes

Publication series

Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9381
ISSN: 0302-9743

Conference

Title: 34th International Conference on Conceptual Modeling, ER 2015
Duration: 19 - 22 October 2015
City: Stockholm
Country: Sweden

External IDs

ORCID /0000-0001-8107-2775/work/199215561

Keywords

  • Conceptualization, Normalization, Semantics, Web tables