Representing data quality for streaming and static data

Research output: Contribution to book/Conference proceedings/Anthology/Report, Conference contribution, Contributed, peer-review

Contributors

  • Anja Klein, SAP Research (Author)
  • Hong Hai Do, SAP Research (Author)
  • Gregor Hackenbroich, SAP Research (Author)
  • Marcel Karnstedt, Ilmenau University of Technology (Author)
  • Wolfgang Lehner, Chair of Databases (Author)

Abstract

In smart item environments, a multitude of sensors are applied to capture data about product conditions and usage to guide business decisions as well as production automation processes. A major issue in this application area is the restricted quality of sensor data due to limited sensor precision as well as sensor failures and malfunctions. Decisions derived from incorrect or misleading sensor data are likely to be faulty. The issue of how to efficiently provide applications with information about data quality (DQ) is still an open research problem. In this paper, we present a flexible model for the efficient transfer and management of data quality for streaming as well as static data. We propose a data stream metamodel to allow for the propagation of data quality from the sensors up to the respective business application without significant data overhead. Furthermore, we present an extension of the traditional RDBMS metamodel to permit the persistent storage of data quality information in a relational database. Finally, we demonstrate a data quality metadata mapping to close the gap between the streaming environment and the target database. Our solution maintains a flexible number of DQ dimensions and supports applications that directly consume streaming data or process data stored in a persistent database.

Details

Original language: English
Title of host publication: Workshops in Conjunction with the International Conference on Data Engineering - ICDE' 07
Pages: 3-10
Number of pages: 8
Publication status: Published - 2007
Peer-reviewed: Yes

Publication series

Series: Proceedings - International Conference on Data Engineering
ISSN: 1084-4627

Conference

Title: Workshops in Conjunction with the 23rd International Conference on Data Engineering
Abbreviated title: ICDE 2007
Duration: 15 - 20 April 2007
City: Istanbul
Country: Turkey

External IDs

ORCID /0000-0001-8107-2775/work/200630403
