Putting Web tables into context

Research output: Contribution to book/conference proceedings/anthology/report › Conference contribution › Contributed › peer-review

Contributors

  • Katrin Braunschweig, TUD Dresden University of Technology (Author)
  • Maik Thiele, TUD Dresden University of Technology (Author)
  • Elvis Koci, TUD Dresden University of Technology (Author)
  • Wolfgang Lehner, Chair of Databases (Author)

Abstract

Web tables are a valuable source of information used in many application areas. However, to exploit Web tables it is necessary to understand their content and intention, which is impeded by their ambiguous semantics and inconsistencies. Therefore, additional context information, e.g., the text in which the tables are embedded, is needed to support the table understanding process. In this paper, we propose a novel contextualization approach that 1) splits the table context into topically coherent paragraphs, 2) provides a similarity measure that is able to match each paragraph to the table in question, and 3) ranks these paragraphs according to their relevance. Each step is accompanied by an experimental evaluation on real-world data showing that our approach is feasible and effectively identifies the most relevant context for a given Web table.
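The three steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the blank-line split stands in for a real topical segmenter such as text tiling, the bag-of-words cosine stands in for the paper's similarity measure, and the names `tokenize`, `cosine`, and `rank_context` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_context(context, table_cells):
    """Return (score, paragraph) pairs, most similar to the table first."""
    # Step 1 (assumption): split on blank lines instead of topical segmentation.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", context) if p.strip()]
    # Step 2 (assumption): compare term frequencies of paragraph and table text.
    table_vec = Counter(tokenize(" ".join(table_cells)))
    scored = [(cosine(Counter(tokenize(p)), table_vec), p) for p in paragraphs]
    # Step 3: rank paragraphs by descending similarity.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank_context(page_text, cell_strings)` with the text surrounding a table and a flat list of its cell strings returns the paragraphs ordered by their overlap with the table's vocabulary.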

Details

Original language: English
Title of host publication: KDIR 2016 - 8th International Conference on Knowledge Discovery and Information Retrieval
Editors: Ana Fred, Jan Dietz, David Aveiro, Kecheng Liu, Jorge Bernardino, Joaquim Filipe
Publisher: SCITEPRESS - Science and Technology Publications
Pages: 158-165
Number of pages: 8
ISBN (electronic): 9789897582035
Publication status: Published - 2016
Peer-reviewed: Yes

Conference

Title: 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2016
Duration: 9 - 11 November 2016
City: Porto
Country: Portugal

External IDs

ORCID /0000-0001-8107-2775/work/142253534

Keywords

  • Information Extraction, Similarity Measures, Text Tiling, Web Tables