SalaciaML: A Deep Learning Approach for Supporting Ocean Data Quality Control

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

  • Sebastian Mieruch, Alfred Wegener Institute - Helmholtz Centre for Polar and Marine Research (Author)
  • Serdar Demirel, Alfred Wegener Institute - Helmholtz Centre for Polar and Marine Research, Wageningen University & Research (WUR) (Author)
  • Simona Simoncelli, Istituto Nazionale Di Geofisica E Vulcanologia (Author)
  • Reiner Schlitzer, Alfred Wegener Institute - Helmholtz Centre for Polar and Marine Research (Author)
  • Steffen Seitz, Chair of Fundamentals of Electrical Engineering, TUD Dresden University of Technology (Author)

Abstract

We present a skillful deep learning algorithm for supporting quality control of ocean temperature measurements, which we name SalaciaML after Salacia, the Roman goddess of sea waters. Classical attempts to algorithmically support and partly automate the quality control of ocean data profiles are especially helpful for gross errors in the data. Range filters, spike detection, and data distribution checks reliably remove outliers and errors, yet wrong classifications still occur. Various automated quality control procedures have been successfully implemented within the main international and EU marine data infrastructures (WOD, CMEMS, IQuOD, SDN), but their resulting data products still contain data anomalies, with bad data flagged as good and vice versa. These procedures also include visual inspection of suspicious measurements, which is a time-consuming activity, especially if the number of suspicious data points detected is large. A deep learning approach could greatly improve our capability to assess the quality of big data collections while at the same time reducing the human effort. Our algorithm SalaciaML is meant to complement classical automated quality control procedures by supporting the time-consuming visual inspection of data anomalies by quality control experts. As a first approach, we applied the algorithm to a large dataset from the Mediterranean Sea. SalaciaML correctly detected more than 90% of all good and/or bad data in 11 out of 16 Mediterranean regions.

Details

Original language: English
Article number: 611742
Journal: Frontiers in Marine Science
Volume: 8
Publication status: Published - 28 Apr 2021
Peer-reviewed: Yes

External IDs

ORCID /0000-0002-8389-8869/work/154738711

Keywords

  • deep learning, Keras, ocean data view, ocean temperature profiles, quality control, SeaDataNet