On the Consistency of Spatial Semantic Integrity Constraints
Research output: Types of thesis › Doctoral thesis
Contributors
Abstract
Geographical data are the core of any Geographical Information System (GIS) and any Geographic Information (GI) application. Because of the increasing use of decentrally held data
and networked services, detailed knowledge about the existing data (i.e., its origin, structure,
formats, quality, availability and reference applications) becomes more and more important.
The availability of such metadata and the evaluation of the fitness for use based on these
metadata are vital.
With this thesis the author intends to contribute to the development of meaningful and
machine-interpretable quality descriptions of GI. The work focuses on semantic integrity
constraints (SICs). In general, integrity constraints define basic assumptions about the part of
the real world that is represented by the data. They make it possible to detect inconsistencies, that is,
unacceptable differences between the data and the data model. SICs are defined as specific
integrity constraints whose restrictions are based on the semantics of the modelled
entities. They reflect business, legal and other required rules and regulations in the database.
For spatial data, many SICs are based on spatial properties like topological or metric relations.
Reasoning on such spatial relations and the corresponding derivation of implicit knowledge
allow for many interesting applications.
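
As an illustration of a SIC based on a topological relation, the following minimal sketch checks a hypothetical rule "no building may overlap a lake". The rule, the class names and the use of axis-aligned bounding boxes instead of real geometries are assumptions made for this example; they are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box standing in for a real geometry (assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def interiors_overlap(a: Box, b: Box) -> bool:
    """True if the interiors of the two boxes share area; touching edges do not count."""
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def violations(buildings: dict, lakes: dict) -> list:
    """Return (building, lake) name pairs that violate the hypothetical SIC."""
    return [(b_name, l_name)
            for b_name, b_box in buildings.items()
            for l_name, l_box in lakes.items()
            if interiors_overlap(b_box, l_box)]


buildings = {"b1": Box(0, 0, 2, 2), "b2": Box(5, 5, 6, 6)}
lakes = {"lake": Box(1, 1, 4, 4)}
found = violations(buildings, lakes)  # only b1 intersects the lake
```

A real system would evaluate such rules on exact geometries with a full set of topological predicates; the bounding boxes only keep the sketch self-contained.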
Currently the potential of SICs is far from being exploited, and SICs are hardly supported
by available GISs or spatial database systems. Their effective use mainly requires a formal
description of the constraints that makes it possible to transfer and compare the sets of SICs of different
data sources. This thesis contributes to the second requirement. Currently, there is no solution
for the comparison of SIC pairs and the detection of conflicts or redundancies in sets
of SICs. Such a comparison also requires the inference of implicit restrictions defined by the SICs. In
consequence, the quality assurance of a data set is possibly more extensive than necessary,
because sets of SICs might define redundant restrictions, the integration of SIC sets from
multiple data sources is impossible and the assessment of the fitness for use based on the
SICs cannot be supported. These are significant shortcomings for quality assurance and
knowledge sharing within the frame of spatial data infrastructures.
Three major contributions are elaborated in the thesis: (i) a detailed categorisation of SICs,
(ii) a framework for the formal definition of SICs and (iii) a reasoning methodology for the
detection of conflicting and redundant SICs.
(i) The classification distinguishes the SICs according to the involved types of spatial and
non-spatial relations and differentiates in detail the properties and aspects restricted
by spatio-temporal SICs.
(ii) The framework for the formal definition of SICs is based on a set of 17 class-level relations.
Such a qualitative description of cardinality restrictions is novel. The definitions and
reasoning rules of the class relations are described independently of concrete spatial or
non-spatial relations, which makes them applicable to many types of SICs.
(iii) The introduced reasoning methodology enables the detection of conflicts and redundancies in sets of SICs, which has hardly been a research topic before. The overall
reasoning algorithm is based on the symmetry, composition and conceptual neighbourhood of class relations.
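
The idea of comparing pairs of class-level constraints can be sketched as follows. This is a deliberately simplified illustration: the three quantifiers, the implication and contradiction tables and all names are assumptions for this example and do not reproduce the 17 class relations or the reasoning rules of the thesis. The tables are only valid if both classes have at least one instance.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical, simplified quantifier vocabulary (NOT the thesis's 17 class relations).
ALL_ALL = "all-all"    # every A instance is related to every B instance
ALL_SOME = "all-some"  # every A instance is related to at least one B instance
NONE = "none"          # no A instance is related to any B instance

# Assumed implications and contradictions, valid for non-empty classes only.
IMPLIES = {(ALL_ALL, ALL_SOME)}
CONTRADICTS = {frozenset({ALL_ALL, NONE}), frozenset({ALL_SOME, NONE})}


@dataclass(frozen=True)
class SIC:
    subject: str     # class A
    relation: str    # instance-level relation, e.g. "overlaps"
    target: str      # class B
    quantifier: str  # one of the quantifiers above


def compare(a: SIC, b: SIC) -> str:
    """Classify a pair of SICs as 'conflict', 'redundant' or 'independent'."""
    if (a.subject, a.relation, a.target) != (b.subject, b.relation, b.target):
        return "independent"
    if a.quantifier == b.quantifier:
        return "redundant"
    if frozenset({a.quantifier, b.quantifier}) in CONTRADICTS:
        return "conflict"
    if (a.quantifier, b.quantifier) in IMPLIES or (b.quantifier, a.quantifier) in IMPLIES:
        return "redundant"
    return "independent"


def check_set(sics: list) -> list:
    """Compare every pair in a set of SICs and report the non-independent ones."""
    return [(a, b, result)
            for a, b in combinations(sics, 2)
            if (result := compare(a, b)) != "independent"]


rules = [
    SIC("Building", "overlaps", "Lake", NONE),
    SIC("Building", "overlaps", "Lake", ALL_SOME),
    SIC("Parcel", "contains", "Building", ALL_ALL),
]
report = check_set(rules)  # one conflicting pair among the Building/Lake rules
```

The sketch only compares constraints that name the same classes and the same relation; deriving implicit restrictions through symmetry or composition, as described above, is outside its scope.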
The feasibility of the proposed algorithm has been verified through a prototypical implementation as a plug-in extension of the ontology modelling and knowledge acquisition platform
Protégé. Possible application areas are quality assurance of geodata, geodata integration and
harmonisation, data modelling and ontology engineering, semantic similarity measurements
and usability evaluation.
Details
Original language | English |
---|---|
Qualification level | Dr.-Ing. |
Awarding Institution |
|
Supervisors/Advisors |
|
Defense Date (Date of certificate) | 22 Dec 2009 |
Place of Publication | Heidelberg, Germany |
Publisher |
|
Print ISBNs | 978-3-89838-645-6 |
Publication status | Published - 2010 |
Externally published | Yes |
External IDs
ORCID | /0000-0002-9016-1996/work/155292042 |
---|