ECCsplorer: a pipeline to detect extrachromosomal circular DNA (eccDNA) from next-generation sequencing data

Research output: Contribution to journal > Research article > Contributed > peer-review

Abstract

BACKGROUND: Extrachromosomal circular DNAs (eccDNAs) are ring-like DNA structures that are physically separate from the chromosomes and range in size from 100 bp to several megabase pairs. Apart from carrying tandemly repeated DNA, eccDNAs may also harbor extra copies of genes or recently activated transposable elements. As eccDNAs occur in all eukaryotes investigated so far and likely play roles in stress, cancer, and aging, they have been prime targets in recent research, yet their investigation has been limited by the scarcity of computational tools.

RESULTS: Here, we present the ECCsplorer, a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing techniques. Following Illumina sequencing of amplified circular DNA (circSeq), the ECCsplorer enables easy, automated discovery of eccDNA candidates. The data analysis encompasses two major procedures. First, read mapping to the reference genome allows the detection of informative read distributions, including high coverage, discordant mapping, and split reads. Second, reference-free comparison of read clusters from amplified eccDNA against control sample data reveals specifically enriched DNA circles. The two procedures can be run separately or jointly, depending on the individual aim or data availability. To illustrate the wide applicability of our approach, we analyzed semi-artificial and published circSeq data from the model organisms Homo sapiens and Arabidopsis thaliana, and generated circSeq reads from the non-model crop plant Beta vulgaris. We clearly identified eccDNA candidates from all datasets, with and without reference genomes. The ECCsplorer pipeline specifically detected mitochondrial mini-circles and retrotransposon activation, showcasing the ECCsplorer's sensitivity and specificity.
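The "high coverage" signal from the read-mapping procedure can be illustrated with a minimal Python sketch. It is not the ECCsplorer implementation: the function names, the fixed window size, the pseudocount, and the five-fold threshold are all assumptions made for illustration, and the input is simply a list of mapped read start positions on one reference sequence. The sketch counts reads per window in the amplified (circSeq) sample and in the control, normalizes by library size, and reports windows that are enriched in the amplified sample.

```python
from typing import Dict, List, Tuple


def window_coverage(read_starts: List[int], window: int) -> Dict[int, int]:
    """Count reads per fixed-size window, keyed by window index."""
    counts: Dict[int, int] = {}
    for pos in read_starts:
        idx = pos // window
        counts[idx] = counts.get(idx, 0) + 1
    return counts


def enriched_windows(
    treatment: List[int],
    control: List[int],
    window: int = 100,         # hypothetical window size in bp
    min_fold: float = 5.0,     # hypothetical enrichment threshold
    pseudocount: float = 1.0,  # avoids division by zero in empty control windows
) -> List[Tuple[int, int, float]]:
    """Return (start, end, fold) for windows enriched in the treatment sample."""
    t_counts = window_coverage(treatment, window)
    c_counts = window_coverage(control, window)
    # Normalize by library size so deeper sequencing alone is not called enrichment.
    t_total = max(len(treatment), 1)
    c_total = max(len(control), 1)
    hits: List[Tuple[int, int, float]] = []
    for idx in sorted(t_counts):
        t_norm = (t_counts[idx] + pseudocount) / t_total
        c_norm = (c_counts.get(idx, 0) + pseudocount) / c_total
        fold = t_norm / c_norm
        if fold >= min_fold:
            hits.append((idx * window, (idx + 1) * window, fold))
    return hits


if __name__ == "__main__":
    # Toy data: eight of ten circSeq reads pile up in the first 100 bp.
    circseq = [10, 20, 30, 40, 50, 60, 70, 80, 150, 250]
    ctrl = [150, 250, 350, 450, 550, 650, 750, 850, 950, 1050]
    for start, end, fold in enriched_windows(circseq, ctrl):
        print(f"{start}-{end}: {fold:.1f}-fold enriched")
```

With the toy data, only the first window (0 to 100 bp) passes the threshold, at roughly nine-fold enrichment. A real analysis would additionally require supporting discordant read pairs and split reads at the window boundaries before calling an eccDNA candidate, as the abstract describes.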

CONCLUSION: The ECCsplorer (available online at https://github.com/crimBubble/ECCsplorer) is a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing data. The derived eccDNA targets are valuable for a wide range of downstream investigations, from the analysis of cancer-related eccDNAs and organelle genomics to the identification of active transposable elements.

Details

Original language: English
Article number: 40
Pages (from-to): 23-40
Number of pages: 15
Journal: BMC Bioinformatics
Volume: 23
Issue number: 1
Publication status: Published - 14 Jan 2022
Peer-reviewed: Yes

External IDs

PubMedCentral PMC8760651
Scopus 85122998960
Mendeley 9c4163ff-9f24-3b60-85f7-dc2c2ff0c880
ORCID /0000-0001-8756-8106/work/142240007

Keywords

  • Chromosomes; Cytoplasm; DNA/genetics; DNA, Circular/genetics; High-Throughput Nucleotide Sequencing; Humans; Extrachromosomal circular DNA; circSeq; eccDNA; Mobilome-seq; circDNA; TE activity; Mitochondrial minicircle