Generic schema descriptions for comma-separated values files of environmental data
Research output: Contribution to conferences › Paper › Contributed › peer-review
Contributors
Abstract
Comma-Separated Values (CSV) files are commonly used to publish data about environmental phenomena and environmental sensor
measurements. The simplicity of this format gives it many advantages. However, there is no official standard for CSV and no
way to specify schema constraints or other metadata. As a result, CSV files come in many variations and often with no metadata that
would support interpretation or further processing, analysis and visualization. In this paper, we propose a framework for the specification of
schema descriptions for CSV files as they are used in the environmental sciences. It allows constraining the structure and content of a CSV file
and specifying relations between files, for example when they are published in one data package. The framework is extensible, including to
other spatial data formats such as GeoTIFF. The schema descriptions are encoded in JSON or XML and published on the Web as a supplement
to the data. The framework is a lightweight solution that provides the metadata required to publish OGC-compliant services from CSV files. It helps to
overcome the heterogeneity among data providers when exchanging environmental measurement data on the Web.
Keywords: tabular data, generic schema language, CSV, comma separated values, metadata.
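To illustrate the idea of a JSON schema description that constrains the structure and content of a CSV file, the following is a minimal sketch in Python. It is not the schema language proposed in the paper: the schema keys (`delimiter`, `columns`, `name`, `type`, `min`, `max`), the column names and the `validate_csv` function are all hypothetical, chosen only to show how such a supplement could be checked against the data.

```python
import csv
import io
import json

# Hypothetical schema description; the key names are assumptions for
# illustration and do not reproduce the framework described in the paper.
SCHEMA_JSON = """
{
  "delimiter": ",",
  "columns": [
    {"name": "station_id", "type": "string"},
    {"name": "timestamp", "type": "string"},
    {"name": "temperature_c", "type": "number", "min": -90.0, "max": 60.0}
  ]
}
"""


def validate_csv(text, schema):
    """Return a list of violations; the list is empty if the CSV conforms."""
    rows = list(csv.reader(io.StringIO(text), delimiter=schema["delimiter"]))
    if not rows:
        return ["file is empty"]

    # Structural constraint: the header must match the declared columns.
    expected = [col["name"] for col in schema["columns"]]
    if rows[0] != expected:
        return [f"header {rows[0]} does not match expected {expected}"]

    errors = []
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected):
            errors.append(
                f"line {line_no}: expected {len(expected)} fields, got {len(row)}"
            )
            continue
        # Content constraints: numeric columns must parse and stay in range.
        for col, value in zip(schema["columns"], row):
            if col["type"] != "number":
                continue
            try:
                number = float(value)
            except ValueError:
                errors.append(
                    f"line {line_no}: {col['name']} is not a number: {value!r}"
                )
                continue
            if "min" in col and number < col["min"]:
                errors.append(f"line {line_no}: {col['name']} below {col['min']}")
            if "max" in col and number > col["max"]:
                errors.append(f"line {line_no}: {col['name']} above {col['max']}")
    return errors


schema = json.loads(SCHEMA_JSON)
sample = "station_id,timestamp,temperature_c\nS1,2018-06-12T10:00:00Z,21.5\n"
print(validate_csv(sample, schema))
```

Keeping the schema in a separate JSON document, as in this sketch, mirrors the abstract's point that the description is published as a supplement to the data rather than embedded in the CSV file itself.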
Details
Original language | English |
---|---|
Number of pages | 5 |
Publication status | Published - 12 Jun 2018 |
Peer-reviewed | Yes |
Conference
Title | 21st AGILE Conference on Geographic Information Science |
---|---|
Subtitle | Geospatial Technologies for All |
Abbreviated title | AGILE 2018 |
Conference number | 21 |
Duration | 12 - 15 June 2018 |
City | Lund |
Country | Sweden |
External IDs
ORCID | /0000-0002-9016-1996/work/153654756 |
---|---|
ORCID | /0000-0002-3085-7457/work/154192848 |