Hamiltonian Monte Carlo with strict convergence criteria reduces run-to-run variability in forensic DNA mixture deconvolution
Publication: Contribution to journal › Research article › Contributed › Peer-reviewed
Contributors
Abstract
Motivation: Analysing mixed DNA profiles is a common task in forensic genetics. Due to the complexity of the data, such analysis is often performed using Markov chain Monte Carlo (MCMC)-based genotyping algorithms. These trade off precision against execution time. When default settings (including default chain lengths) are used, changes as large as 10-fold in inferred log-likelihood ratios (LR) are observed when the software is run twice on the same case. So far, this uncertainty has been attributed to the stochasticity of MCMC algorithms. Since LRs translate directly into the strength of the evidence in a criminal trial, forensic laboratories desire LRs with small run-to-run variability.

Results: We present the use of a Hamiltonian Monte Carlo (HMC) algorithm that reduces run-to-run variability in forensic DNA mixture deconvolution by around an order of magnitude without increased runtime. We achieve this by enforcing strict convergence criteria. We show that the choice of convergence metric strongly influences precision. We validate our method by reproducing previously published results for benchmark DNA mixtures (MIX05, MIX13, and ProvedIt). We also present a complete software implementation of our algorithm that is able to leverage GPU acceleration for the inference process. In the benchmark mixtures, on consumer-grade hardware, the runtime is less than 7 min for 3 contributors, less than 35 min for 4 contributors, and less than an hour for 5 contributors with one known contributor.
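To make the idea of a convergence criterion concrete, the following minimal sketch computes the classic (non-split) Gelman-Rubin potential scale reduction factor, the diagnostic named in the keywords of this record, for several chains of a single scalar quantity. It is not the authors' implementation: the function name `gelman_rubin`, the synthetic Gaussian chains, and the two thresholds are assumptions chosen purely for illustration, and the abstract does not state which thresholds the paper uses.

```python
import random
import statistics


def gelman_rubin(chains):
    """Classic Gelman-Rubin statistic for m equal-length chains of one scalar."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    w = statistics.fmean([statistics.variance(c) for c in chains])
    # B: between-chain variance, n times the sample variance of the chain means.
    b = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5


def sample_chain(mean, n, rng):
    """Stand-in for a sampler: n independent Gaussian draws (illustration only)."""
    return [rng.gauss(mean, 1.0) for _ in range(n)]


rng = random.Random(0)
# Four chains that explore the same distribution.
mixed = [sample_chain(0.0, 2000, rng) for _ in range(4)]
# Four chains where one is stuck in a different region.
stuck = [sample_chain(mu, 2000, rng) for mu in (0.0, 0.0, 0.0, 3.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)

# Hypothetical thresholds: a strict and a looser acceptance level.
STRICT, LOOSE = 1.05, 1.2
print(f"mixed chains: R-hat = {rhat_mixed:.3f}, passes strict: {rhat_mixed < STRICT}")
print(f"stuck chains: R-hat = {rhat_stuck:.3f}, passes loose: {rhat_stuck < LOOSE}")
```

Values close to 1 indicate that between-chain and within-chain variance agree. A sampler that keeps running until the statistic falls below a strict threshold spends more iterations on hard cases, which is one plausible way tighter criteria could reduce run-to-run variability in the reported LR.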
Details
Original language | English |
---|---|
Article number | 102744 |
Number of pages | 9 |
Journal | Forensic Science International: Genetics (official journal of the International Society for Forensic Genetics) |
Volume | 60 |
Publication status | Published - Sept. 2022 |
Peer-reviewed | Yes |
External IDs
Scopus | 85134626420 |
---|---|
unpaywall | 10.1016/j.fsigen.2022.102744 |
WOS | 000835763000002 |
ORCID | /0000-0003-4414-4340/work/142252174 |
Subject headings
Research priority areas of TU Dresden
DFG Classification of Subject Areas according to Review Boards
- Interactive and Intelligent Systems, Image and Language Processing, Computer Graphics and Visualisation
- Massively Parallel and Data-Intensive Systems
- Bioinformatics and Theoretical Biology
- Statistical Physics, Soft Matter, Biological Physics, Nonlinear Dynamics
- Developmental Biology
- Software Engineering and Programming Languages
- Cell Biology
- Biophysics
- Mathematics
Subject groups, teaching and research areas, and subject areas according to Destatis
Sustainable Development Goals
Keywords
- Probabilistic genotyping, Hamiltonian Monte Carlo, Bayesian inference, Precision, Gelman–Rubin convergence diagnostic