Dispom: A discriminative de-novo motif discovery tool based on the Jstacs library

Research output: Contribution to journal > Research article > Contributed > peer-review

Contributors

  • Jan Grau, Martin Luther University Halle-Wittenberg (Author)
  • Jens Keilwagen, Leibniz Institute of Plant Genetics and Crop Plant Research; Julius Kühn Institute - Federal Research Centre for Cultivated Plants (Author)
  • André Gohr, Martin Luther University Halle-Wittenberg (Author)
  • Ivan A. Paponov, University of Freiburg (Author)
  • Stefan Posch, Martin Luther University Halle-Wittenberg (Author)
  • Michael Seifert, Leibniz Institute of Plant Genetics and Crop Plant Research (Author)
  • Marc Strickert, University of Marburg (Author)
  • Ivo Grosse, Martin Luther University Halle-Wittenberg (Author)

Abstract

DNA-binding proteins are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in target regions of genomic DNA. However, de-novo discovery of these binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not yet been solved satisfactorily. Here, we present a detailed description and analysis of the de-novo motif discovery tool Dispom, which has been developed for finding binding sites of DNA-binding proteins that are differentially abundant in a set of target regions compared to a set of control regions. Two additional features of Dispom are its capability of modeling positional preferences of binding sites and adjusting the length of the motif in the learning process. Dispom yields an increased prediction accuracy compared to existing tools for de-novo motif discovery, suggesting that the combination of searching for differentially abundant motifs, inferring their positional distributions, and adjusting the motif lengths is beneficial for de-novo motif discovery. When applying Dispom to promoters of auxin-responsive genes and those of ABI3 target genes from Arabidopsis thaliana, we identify relevant binding motifs with pronounced positional distributions. These results suggest that learning motifs, their positional distributions, and their lengths by a discriminative learning principle may aid motif discovery from ChIP-chip and gene expression data. We make Dispom freely available as part of Jstacs, an open-source Java library that is tailored to statistical sequence analysis. To facilitate extensions of Dispom, we describe its implementation using Jstacs in this manuscript. In addition, we provide a stand-alone application of Dispom for instant use.

Details

Original language: English
Article number: 1340006
Journal: Journal of Bioinformatics and Computational Biology (JBCB)
Volume: 11
Issue number: 1
Publication status: Published - Feb 2013
Peer-reviewed: Yes
Externally published: Yes

External IDs

Scopus: 84874803448
ORCID: /0000-0002-2844-053X/work/153655342

Keywords

  • de-novo motif discovery, statistical models, Transcription factor binding sites