Unraveling protein networks with power graph analysis

Research output: Contribution to journal › Research article › Contributed › peer-review

Abstract

Networks play a crucial role in computational biology, yet their analysis and representation remain an open problem. Power Graph Analysis is a lossless transformation of biological networks into a compact, less redundant representation, exploiting the abundance of cliques and bicliques as elementary topological motifs. We demonstrate the advantages of Power Graph Analysis with five examples. Investigating protein-protein interaction networks, we show how the catalytic subunits of the casein kinase II complex can be distinguished from the regulatory subunits, how interaction profiles and sequence phylogeny of SH3 domains correlate, and how false-positive interactions can be spotted among high-throughput interactions. Additionally, we demonstrate the generality of Power Graph Analysis by applying it to two other types of networks. We show how power graphs induce a clustering of both transcription factors and target genes in bipartite transcription networks, and how the erosion of a phosphatase domain in type 22 non-receptor tyrosine phosphatases is detected. We apply Power Graph Analysis to high-throughput protein interaction networks and show that up to 85% (56% on average) of the information is redundant. Experimental networks are more compressible than rewired ones with the same degree distribution, indicating that experimental networks are rich in cliques and bicliques. Power Graphs are a novel representation of networks, which reduces network complexity by explicitly representing recurring network motifs. Power Graphs compress up to 85% of the edges in protein interaction networks and are applicable to all types of networks, such as protein interactions, regulatory networks, or homology networks.
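The notions of lossless representation and edge redundancy in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it does not search for cliques or bicliques, but only expands a hand-written power graph back into plain edges and measures how many edges were saved. The function names (`expand`, `edge_reduction`), the node labels, and the toy network are hypothetical, and the node sets joined by a power edge are assumed to be either identical (a clique) or disjoint (a biclique).

```python
from itertools import combinations, product


def expand(power_edges):
    """Expand power edges into the set of plain undirected edges they stand for."""
    edges = set()
    for a, b in power_edges:
        if a == b:
            # A power edge from a node set to itself stands for a clique.
            pairs = combinations(sorted(a), 2)
        else:
            # A power edge between two disjoint node sets stands for a biclique.
            pairs = product(a, b)
        edges.update(frozenset(p) for p in pairs)
    return edges


def edge_reduction(power_edges):
    """Fraction of plain edges saved by the power graph representation."""
    plain = expand(power_edges)
    return 1 - len(power_edges) / len(plain)


# Hypothetical toy network: one biclique (2 x 3 nodes) and one clique (4 nodes).
hubs = frozenset({"A", "B"})
targets = frozenset({"X", "Y", "Z"})
complex_ = frozenset({"P", "Q", "R", "S"})
power_edges = [(hubs, targets), (complex_, complex_)]

plain_edges = expand(power_edges)
print(len(plain_edges), "plain edges,", len(power_edges), "power edges")
print(f"edge reduction: {edge_reduction(power_edges):.0%}")
```

Because every plain edge can be recovered from the power edges, nothing is lost. In this toy case two power edges replace twelve plain ones, which is the kind of redundancy the reported compression figures refer to.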

Details

Original language: English
Pages (from-to): e1000108
Journal: PLOS computational biology
Volume: 4
Issue number: 7
Publication status: Published - 11 Jul 2008
Peer-reviewed: Yes

External IDs

PubMedCentral: PMC2424176
Scopus: 48249137780
ORCID: /0000-0003-2848-6949/work/141543399

Keywords

  • Amino Acid Motifs/physiology; Animals; Binding Sites/physiology; Casein Kinase II/chemistry; Catalytic Domain; Cluster Analysis; Computational Biology/methods; Computer Simulation; Data Compression/methods; Evolution, Molecular; Humans; Models, Biological; Neural Networks, Computer; Protein Binding/genetics; Protein Interaction Mapping/methods; Protein Tyrosine Phosphatase, Non-Receptor Type 22/metabolism; Proteins/chemistry; Sequence Analysis, Protein/methods; Structural Homology, Protein; Transcription Factors/physiology; src Homology Domains/genetics
