The role of metadata in reproducible computational research

Research output: Contribution to journal > Review article > Contributed > peer-review

Contributors

  • Jeremy Leipzig (Author)
  • Daniel Nüst, University of Münster (Author)
  • Charles Tapley Hoyt (Author)
  • Karthik Ram (Author)
  • Jane Greenberg (Author)

Abstract

Reproducible computational research (RCR) is the keystone of the scientific method for in silico analyses, packaging the transformation of raw data to published results. In addition to its role in research integrity, improving the reproducibility of scientific studies can accelerate evaluation and reuse. This potential and wide support for the FAIR principles have motivated interest in metadata standards supporting reproducibility. Metadata provide context and provenance to raw data and methods and are essential to both discovery and validation. Despite this shared connection with scientific data, few studies have explicitly described how metadata enable reproducible computational research. This review employs a functional content analysis to identify metadata standards that support reproducibility across an analytic stack consisting of input data, tools, notebooks, pipelines, and publications. Our review provides background context, explores gaps, and discovers component trends of embeddedness and methodology weight from which we derive recommendations for future work.

Details

Original language: English
Article number: 100322
Journal: Patterns
Volume: 2
Issue number: 9
Publication status: Published - 10 Sept 2021
Peer-reviewed: Yes
Externally published: Yes

External IDs

Scopus 85120051431
ORCID /0000-0002-0024-5046/work/142255085
PubMed 34553169
