Device-Driven Metadata Management Solutions for Scientific Big Data Use Cases.

Research output: Contribution to book/conference proceedings/anthology/report › Conference contribution › Contributed › Peer-reviewed

Contributors

Abstract

Big Data applications in science produce huge amounts of data, which require advanced processing, handling, and analysis capabilities. To organize large-scale data sets, it is essential to annotate them with metadata, index them, and make them easily findable. In this paper we investigate two scientific use cases from biology and photon science, which pose complex demands with regard to data volume, data rates, and analysis requirements. The LSDMA project provides an ideal context for this research, combining innovative R&D on the processing, handling, and analysis level with a wide range of research communities in need of scalable solutions. To facilitate the advancement of data life cycles, we present preferred metadata management strategies: the Open Microscopy Environment (OME) for biology and NeXus/ICAT for photon science. We show that these are well suited to the respective data life cycles. To facilitate searching across communities, we discuss solutions involving the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and Apache Lucene/Solr.

Details

Original language: English
Title of host publication: 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
Publisher: IEEE Computer Society, Washington
Pages: 317-321
Number of pages: 5
ISBN (electronic): 978-1-4799-2729-6
Publication status: Published - Feb 2014
Peer-reviewed: Yes

External IDs

Scopus: 84899411424
dblp: conf/pdp/GrunzkeHSKGHHKPHMJ14

Keywords

  • Metadata, Management, Big Data