Dispom: A discriminative de-novo motif discovery tool based on the Jstacs library

Publication: Contribution to journal, Research article, Contributed, Peer-reviewed

Contributors

  • Jan Grau, Martin-Luther-Universität Halle-Wittenberg (Author)
  • Jens Keilwagen, Leibniz-Institut für Pflanzengenetik und Kulturpflanzenforschung, Julius Kühn Institute - Federal Research Centre for Cultivated Plants (Author)
  • André Gohr, Martin-Luther-Universität Halle-Wittenberg (Author)
  • Ivan A. Paponov, Albert-Ludwigs-Universität Freiburg (Author)
  • Stefan Posch, Martin-Luther-Universität Halle-Wittenberg (Author)
  • Michael Seifert, Leibniz-Institut für Pflanzengenetik und Kulturpflanzenforschung (Author)
  • Marc Strickert, Philipps-Universität Marburg (Author)
  • Ivo Grosse, Martin-Luther-Universität Halle-Wittenberg (Author)

Abstract

DNA-binding proteins are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in target regions of genomic DNA. However, de-novo discovery of these binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not yet been solved satisfactorily. Here, we present a detailed description and analysis of the de-novo motif discovery tool Dispom, which has been developed for finding binding sites of DNA-binding proteins that are differentially abundant in a set of target regions compared to a set of control regions. Two additional features of Dispom are its capability of modeling positional preferences of binding sites and adjusting the length of the motif in the learning process. Dispom yields an increased prediction accuracy compared to existing tools for de-novo motif discovery, suggesting that the combination of searching for differentially abundant motifs, inferring their positional distributions, and adjusting the motif lengths is beneficial for de-novo motif discovery. When applying Dispom to promoters of auxin-responsive genes and those of ABI3 target genes from Arabidopsis thaliana, we identify relevant binding motifs with pronounced positional distributions. These results suggest that learning motifs, their positional distributions, and their lengths by a discriminative learning principle may aid motif discovery from ChIP-chip and gene expression data. We make Dispom freely available as part of Jstacs, an open-source Java library that is tailored to statistical sequence analysis. To facilitate extensions of Dispom, we describe its implementation using Jstacs in this manuscript. In addition, we provide a stand-alone application of Dispom for instant use.

Details

Original language: English
Article number: 1340006
Journal: Journal of bioinformatics and computational biology : JBCB
Volume: 11
Issue number: 1
Publication status: Published - Feb. 2013
Peer-review status: Yes
Externally published: Yes

External IDs

Scopus 84874803448
ORCID /0000-0002-2844-053X/work/153655342

Keywords

  • de-novo motif discovery, statistical models, transcription factor binding sites