Methods of spatial cluster detection in rare childhood cancers: Benchmarking data and results from a simulation study on nephroblastoma

Research output: Contribution to journal, Research article, Contributed, peer-review

Contributors

  • Michael M. Schündeln, University of Duisburg-Essen (Author)
  • Toni Lange, Center for Evidence-Based Healthcare (Author)
  • Maximilian Knoll, German Cancer Research Center (DKFZ) (Author)
  • Claudia Spix, Johannes Gutenberg University Mainz (Author)
  • Hermann Brenner, German Cancer Research Center (DKFZ) (Author)
  • Kayvan Bozorgmehr, Bielefeld University (Author)
  • Christian Stock, German Cancer Research Center (DKFZ), Heidelberg University (Author)

Abstract

The potential existence of spatial clusters in childhood cancer incidence is a debated topic. Identification of rare disease clusters in general may help to better understand disease etiology and develop preventive strategies against such entities. The incidence of newly diagnosed malignancies in children under 15 years of age is 140/1,000,000. In this context, the subgroup of nephroblastoma represents an extremely rare entity with an annual incidence of 7/1,000,000. We evaluated widely used statistical approaches for spatial cluster detection in childhood cancer (Ref. Schündeln et al., 2021, Cancer Epidemiology). For the simulation study, random high-risk clusters of 1 to 50 adjacent districts (NUTS-level 3, nomenclature des unités territoriales statistiques) were generated on the basis of the 402 German administrative districts. Each cluster was simulated with different relative risk levels (1 to 100). For each combination of cluster size and risk level, 2000 iterations were performed. The simulated data were then analyzed by three local clustering tests: the Besag-Newell method, the spatial scan statistic and the Bayesian Besag-York-Mollié approach (fitted by Integrated Nested Laplace Approximation). The performance characteristics of all three methods were systematically documented (sensitivity, specificity, positive/negative predictive values, exact and minimum power, correct classification, positive/negative diagnostic likelihood and false positive/negative rate). This data article links to a Mendeley online repository which includes the raw data of simulated high-risk clusters and simulated cases on the district level for an all-childhood-malignancy scenario as well as for cases of nephroblastoma. These data were used for the evaluation of the three cluster detection methods. The R code for simulation and analysis is available from GitHub.
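The simulation design described above (a random cluster of adjacent districts, an elevated relative risk inside it, and simulated case counts per district) can be illustrated with a minimal Python sketch. This is not the authors' R code. The 4 x 4 grid of districts, the uniform child population of 30,000 per district, the Poisson model for case counts, and all function names are assumptions made for illustration; only the nephroblastoma incidence of 7/1,000,000 is taken from the abstract.

```python
import math
import random


def grid_adjacency(rows, cols):
    """Hypothetical stand-in for a district map: cells of a grid that share
    an edge are treated as adjacent districts."""
    adjacency = {}
    for r in range(rows):
        for c in range(cols):
            neighbours = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    neighbours.append(rr * cols + cc)
            adjacency[r * cols + c] = neighbours
    return adjacency


def grow_cluster(adjacency, size, rng):
    """Pick a random seed district and add random bordering districts until
    the connected cluster has `size` members."""
    cluster = {rng.choice(sorted(adjacency))}
    while len(cluster) < size:
        # Districts that border the current cluster but are not yet in it.
        frontier = sorted({n for d in cluster for n in adjacency[d]} - cluster)
        if not frontier:  # nothing left to add; cluster stays smaller
            break
        cluster.add(rng.choice(frontier))
    return cluster


def poisson(mean, rng):
    """Draw a Poisson count with Knuth's multiplication method, which is
    adequate for the small expected counts of a rare disease."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_cases(population, cluster, relative_risk, incidence, rng):
    """Simulate case counts per district; the expected count inside the
    cluster is multiplied by the relative risk (assumed Poisson model)."""
    cases = {}
    for district, children in population.items():
        expected = children * incidence
        if district in cluster:
            expected *= relative_risk
        cases[district] = poisson(expected, rng)
    return cases


rng = random.Random(42)
adjacency = grid_adjacency(4, 4)
population = {district: 30_000 for district in adjacency}  # assumed values
cluster = grow_cluster(adjacency, size=5, rng=rng)
cases = simulate_cases(population, cluster, relative_risk=20,
                       incidence=7e-6, rng=rng)
```

Repeating the last two calls many times for each combination of cluster size and relative risk would give one simulated data set per iteration, which is the general shape of the benchmarking data the article describes.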
The article also includes analyzed data summarizing the performance of the cluster detection tests in very rare disease entities, using the example of simulated nephroblastoma cases. The raw data from the study can be used for benchmarking analyses applying different spatial statistical methods systematically and evaluating their performance characteristics comparatively. The analyzed data from the nephroblastoma example can be useful to interpret the performance of the three applied local cluster detection tests in the setting of extremely rare disease entities. As a practical application, data and R code can be used for performance analyses when planning to establish surveillance systems for rare disease entities.
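Several of the performance characteristics named in the abstract follow from a district-level confusion matrix that compares the simulated high-risk cluster with the districts a test flags. The sketch below uses the standard textbook definitions and hypothetical names; it does not reproduce the authors' R code, and it leaves out exact and minimum power because their definitions are specific to the cited study.

```python
def district_level_metrics(true_cluster, flagged, all_districts):
    """Compare the districts flagged by a cluster test with the simulated
    high-risk cluster and return standard diagnostic measures."""
    truth = set(true_cluster)
    detected = set(flagged)
    universe = set(all_districts)

    tp = len(truth & detected)             # cluster districts that were flagged
    fp = len(detected - truth)             # flagged districts outside the cluster
    fn = len(truth - detected)             # cluster districts that were missed
    tn = len(universe - truth - detected)  # correctly unflagged districts

    def ratio(numerator, denominator):
        # Undefined ratios (zero denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "correct_classification": ratio(tp + tn, len(universe)),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
        "lr_positive": ratio(sensitivity, 1 - specificity),
        "lr_negative": ratio(1 - sensitivity, specificity),
    }


# Hypothetical example: 10 districts, true cluster {0, 1, 2}, test flags {1, 2, 3}.
metrics = district_level_metrics({0, 1, 2}, {1, 2, 3}, range(10))
```

Averaging such measures over all iterations of a scenario would give one summary per method, cluster size and relative risk level, which makes different tests directly comparable.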

Details

Original language: English
Article number: 106683
Journal: Data in brief
Volume: 34
Publication status: Published - Feb 2021
Peer-reviewed: Yes

External IDs

ORCID /0000-0002-8671-7496/work/152545137

Keywords

  • Bayesian, Besag York Mollié, Besag-Newell, Childhood cancer, Nephroblastoma, Random distribution, Simulation study, Spatial cluster, Spatial scan statistic