Reproducible variability: assessing investigator discordance across 9 research teams attempting to reproduce the same observational study

Publication: Contribution to journal, Research article, Contributed, Peer-reviewed

Contributors

  • Anna Ostropolets, Columbia University Irving Medical Center (CUIMC) (Author)
  • Yasser Albogami, King Saud University (Author)
  • Mitchell Conover, Janssen Pharmaceutica NV (Author)
  • Juan M Banda, Middle Georgia State University (Author)
  • William A Baumgartner, University of Colorado Anschutz Medical Campus (Author)
  • Clair Blacketer, Janssen Pharmaceutica NV (Author)
  • Priyamvada Desai, IBM Research Zurich (Author)
  • Scott L DuVall, VA Salt Lake City Health Care System (Author)
  • Stephen Fortin, Janssen Pharmaceutica NV (Author)
  • James P Gilbert, Janssen Pharmaceutica NV (Author)
  • Asieh Golozar, Odysseus Data Services GmbH (Author)
  • Joshua Ide, Johnson & Johnson (Author)
  • Andrew S Kanter, Columbia University Irving Medical Center (CUIMC) (Author)
  • David M Kern, Janssen Pharmaceutica NV (Author)
  • Chungsoo Kim, Ajou University (Author)
  • Lana Y H Lai, University of Manchester (Author)
  • Chenyu Li, University of Pittsburgh (Author)
  • Feifan Liu, University of Massachusetts Medical School (Author)
  • Kristine E Lynch, VA Salt Lake City Health Care System (Author)
  • Evan Minty, University of Calgary (Author)
  • Maria Inês Neves, Real World Solutions (Author)
  • Ding Quan Ng, University of California at Irvine (Author)
  • Tontel Obene, Jackson State University (Author)
  • Victor Pera, Erasmus University Medical Center (Author)
  • Nicole Pratt, Flinders University (Author)
  • Gowtham Rao, Janssen Pharmaceutica NV (Author)
  • Nadav Rappoport, Ben-Gurion University of the Negev (Author)
  • Ines Reinecke, Institut für Medizinische Informatik und Biometrie (Author)
  • Paola Saroufim, Case Western Reserve University (Author)
  • Azza Shoaibi, Janssen Pharmaceutica NV (Author)
  • Katherine Simon, Vanderbilt University (Author)
  • Marc A Suchard, University of California at Irvine (Author)
  • Joel N Swerdel, Janssen Pharmaceutica NV (Author)
  • Erica A Voss, Janssen Pharmaceutica NV (Author)
  • James Weaver, Janssen Pharmaceutica NV (Author)
  • Linying Zhang, Columbia University Irving Medical Center (CUIMC) (Author)
  • George Hripcsak, Columbia University Irving Medical Center (CUIMC) (Author)
  • Patrick B Ryan, Columbia University Irving Medical Center (CUIMC) (Author)

Abstract

OBJECTIVE: Observational studies can impact patient care but must be robust and reproducible. Nonreproducibility is primarily caused by unclear reporting of design choices and analytic procedures. This study aimed to: (1) assess how the study logic described in an observational study could be interpreted by independent researchers and (2) quantify the impact of interpretations' variability on patient characteristics.

MATERIALS AND METHODS: Nine teams of highly qualified researchers reproduced a cohort from a study by Albogami et al. The teams were provided the clinical codes and access to the tools to create cohort definitions such that the only variable part was their logic choices. We executed teams' cohort definitions against the database and compared the number of subjects, patient overlap, and patient characteristics.

RESULTS: On average, the teams' interpretations fully aligned with the master implementation in 4 out of 10 inclusion criteria with at least 4 deviations per team. Cohorts' size varied from one-third of the master cohort size to 10 times the cohort size (2159-63 619 subjects compared to 6196 subjects). Median agreement was 9.4% (interquartile range 15.3-16.2%). The teams' cohorts significantly differed from the master implementation by at least 2 baseline characteristics, and most of the teams differed by at least 5.
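
The size comparison reported above can be checked with simple arithmetic, and a patient-overlap measure can be sketched in a few lines. The example below is only an illustration: the abstract does not state how "agreement" was computed, so the Jaccard index used here is an assumption, and the function names and patient IDs are hypothetical. Only the three cohort sizes (2159, 63 619, and 6196) come from the abstract.

```python
from __future__ import annotations


def size_ratio(cohort_size: int, master_size: int) -> float:
    """Size of a team's cohort relative to the master cohort."""
    return cohort_size / master_size


def jaccard_agreement(cohort: set[int], master: set[int]) -> float:
    """Shared patients divided by all patients in either cohort.

    Assumption: the abstract does not define its agreement metric;
    the Jaccard index is one common choice for set overlap.
    """
    union = cohort | master
    if not union:
        return 1.0  # two empty cohorts are trivially identical
    return len(cohort & master) / len(union)


# Cohort sizes as reported in the abstract.
MASTER_SIZE = 6196
smallest = size_ratio(2159, MASTER_SIZE)   # roughly one-third
largest = size_ratio(63619, MASTER_SIZE)   # roughly ten times
print(f"smallest: {smallest:.2f}x, largest: {largest:.2f}x")

# Hypothetical patient IDs, for illustration only.
master_ids = {1, 2, 3, 4, 5}
team_ids = {4, 5, 6, 7}
print(f"agreement: {jaccard_agreement(team_ids, master_ids):.3f}")
```

With these inputs the two ratios come out near 0.35 and 10.27, which matches the "one-third" and "10 times" wording in the results.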

CONCLUSIONS: Independent research teams attempting to reproduce the study based on its free-text description alone produce different implementations that vary in population size and composition. Sharing analytical code supported by a common data model and open-source tools allows a study to be reproduced unambiguously, thereby preserving the initial design choices.

Details

Original language: English
Pages (from - to): 859-868
Number of pages: 10
Journal: Journal of the American Medical Informatics Association : JAMIA
Volume: 30 (2023)
Issue number: 5
Publication status: Published - 24 Feb. 2023
Peer-review status: Yes

External IDs

PubMedCentral PMC10114120
Scopus 85152976981
ORCID /0000-0003-0154-2867/work/143494695

Keywords

  • Humans, Research Personnel, Databases, Factual