SHARK: Web server for alignment-free homology assessment for intrinsically disordered and unalignable protein regions

Publication: Contribution to journal, Research article, Contributed, Peer-reviewed

Contributors

  • Chi Fung Willis Chow, Max Planck Institute of Molecular Cell Biology and Genetics, Zentrum für Systembiologie Dresden (CSBD), Exzellenzcluster PoL: Physik des Lebens (Author)
  • Maxim Scheremetjew, Max Planck Institute of Molecular Cell Biology and Genetics, Zentrum für Systembiologie Dresden (CSBD) (Author)
  • Hong Kee Moon, Max Planck Institute of Molecular Cell Biology and Genetics, Zentrum für Systembiologie Dresden (CSBD) (Author)
  • Soumyadeep Ghosh, Max Planck Institute of Molecular Cell Biology and Genetics, Zentrum für Systembiologie Dresden (CSBD) (Author)
  • Anna Hadarovich, Max Planck Institute of Molecular Cell Biology and Genetics, Zentrum für Systembiologie Dresden (CSBD) (Author)
  • Lena Hersemann, Max Planck Institute of Molecular Cell Biology and Genetics, Zentrum für Systembiologie Dresden (CSBD) (Author)
  • Agnes Toth-Petroczy, Max Planck Institute of Molecular Cell Biology and Genetics, Zentrum für Systembiologie Dresden (CSBD), Exzellenzcluster PoL: Physik des Lebens (Author)

Abstract

Whereas alignment has been fundamental to sequence-based assessments of protein homology, it is ineffective for intrinsically disordered regions (IDRs) due to their lower sequence conservation and unique sequence properties. Here, we present a web server implementation of SHARK (bio-shark.org), an alignment-free algorithm for homology classification that compares the overall amino acid composition and short regions (k-mers) shared between sequences (SHARK-scores). The output of such k-mer-based comparisons is used by SHARK-dive, a machine learning classifier that detects homology between unalignable, disordered sequences. SHARK-web provides sequence-versus-database assessment of protein sequence homology akin to conventional tools such as BLAST and HMMER. Additionally, we provide precomputed sets of IDR sequences from 16 model organism proteomes, facilitating searches against species-specific IDR-omes. SHARK-dive offers superior overall homology detection performance to BLAST and HMMER, driven by a large increase in sensitivity to low sequence identity homologs, and can be used to facilitate the study of sequence-function relationships in disordered, difficult-to-align regions.
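
To make the idea of alignment-free comparison concrete, the following is a minimal sketch of two generic measures: cosine similarity of amino acid composition (k-mers with k = 1) and the Jaccard overlap of shared k-mers. It is not the SHARK-score or the SHARK-dive classifier, whose definitions are not given in this record; the function names `kmer_counts`, `cosine_similarity`, and `jaccard_kmers` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (empty if seq is shorter than k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors stored as Counters."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_kmers(seq1: str, seq2: str, k: int) -> float:
    """Fraction of distinct k-mers shared by both sequences (Jaccard index)."""
    set1 = set(kmer_counts(seq1, k))
    set2 = set(kmer_counts(seq2, k))
    union = set1 | set2
    return len(set1 & set2) / len(union) if union else 0.0


# Illustrative toy sequences (assumed, not taken from the paper).
seq_a = "SGSGRRGGSYGG"
seq_b = "GGSYGGRRSGSG"

# k = 1 compares overall amino acid composition, ignoring residue order.
composition = cosine_similarity(kmer_counts(seq_a, 1), kmer_counts(seq_b, 1))

# k = 3 compares short shared regions.
shared = jaccard_kmers(seq_a, seq_b, 3)
```

Neither measure requires the two sequences to be aligned, which is the property the abstract relies on for disordered regions. A practical scoring scheme would presumably also account for similarity between non-identical k-mers, which this sketch does not attempt.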

Details

Original language: English
Pages (from - to): W512-W519
Journal: Nucleic Acids Research
Volume: 53
Issue number: W1
Publication status: Published - 7 July 2025
Peer-review status: Yes

External IDs

PubMed 40396357