Generic schema descriptions for comma-separated values files of environmental data

Research output: Contribution to conferences > Paper > Contributed > peer-review

Contributors

Abstract

Comma-Separated Values (CSV) files are commonly used to publish data about environmental phenomena and environmental sensor measurements. The format's simplicity gives it many advantages. However, there is no official standard for CSV and no way to specify schematic constraints or other metadata. As a result, CSV files come in many variations, often without metadata that would support interpretation or further processing, analysis and visualization. In this paper, we propose a framework for specifying schema descriptions for CSV files as they are used in the environmental sciences. It makes it possible to constrain the structure and content of a CSV file and to specify relations between files, for example when they are published in one data package. The framework is extensible to other spatial data formats such as GeoTIFF. The schema descriptions are encoded in JSON or XML and published on the Web as a supplement to the data. The framework is a lightweight solution that provides the metadata required to publish OGC-compliant services from CSV files. It helps to overcome heterogeneities among data providers when exchanging environmental measurement data on the Web.
Keywords: tabular data, generic schema language, CSV, comma separated values, metadata.

Details

Original language: English
Number of pages: 5
Publication status: Published - 12 Jun 2018
Peer-reviewed: Yes

Conference

Title: 21st AGILE Conference on Geographic Information Science
Subtitle: Geospatial Technologies for All
Abbreviated title: AGILE 2018
Conference number: 21
Duration: 12 - 15 June 2018
City: Lund
Country: Sweden

External IDs

ORCID /0000-0002-9016-1996/work/153654756
ORCID /0000-0002-3085-7457/work/154192848