Deep learning tools predict variants in disordered regions with lower sensitivity

Research output: Contribution to journal › Research article › Contributed › peer-review

Contributors

  • Federica Luppino, Max Planck Institute of Molecular Cell Biology and Genetics, Center for Systems Biology Dresden (CSBD) (Author)
  • Swantje Lenz, Max Planck Institute of Molecular Cell Biology and Genetics, Center for Systems Biology Dresden (CSBD) (Author)
  • Chi Fung Willis Chow, Clusters of Excellence PoL: Physics of Life, Max Planck Institute of Molecular Cell Biology and Genetics, Center for Systems Biology Dresden (CSBD) (Author)
  • Agnes Toth-Petroczy, Max Planck Institute of Molecular Cell Biology and Genetics, Center for Systems Biology Dresden (CSBD), Clusters of Excellence PoL: Physics of Life (Author)

Abstract

Background: The recent AI breakthrough of AlphaFold2 has revolutionized 3D protein structural modeling, proving crucial for protein design and variant effect prediction. However, intrinsically disordered regions (IDRs), known for their lack of well-defined structure and lower sequence conservation, often yield low-confidence models. The latest Variant Effect Predictor (VEP), AlphaMissense, leverages AlphaFold2 models, achieving over 90% sensitivity and specificity in predicting variant effects. However, the effectiveness of such tools for variants in disordered regions, which account for 30% of the human proteome, remains unclear.

Results: In this study, we found that predicting pathogenicity for variants in disordered regions is less accurate than in ordered regions, particularly for mutations at the first N-Methionine site. Investigations into the efficacy of VEPs on IDRs indicated that mutations in IDRs are predicted with lower sensitivity and that the gap between sensitivity and specificity is largest in disordered regions, especially for AlphaMissense and VARITY.

Conclusions: The prevalence of IDRs within the human proteome, coupled with the increasing repertoire of biological functions they are known to perform, necessitated an investigation into the efficacy of state-of-the-art VEPs on such regions. This analysis revealed their consistently reduced sensitivity and a prediction performance profile that differs from that of ordered regions, indicating that new IDR-specific features and paradigms are needed to accurately classify disease mutations within those regions.

Details

Original language: English
Article number: 367
Number of pages: 16
Journal: BMC Genomics
Volume: 26
Issue number: 1
Publication status: Published - 12 Apr 2025
Peer-reviewed: Yes

External IDs

PubMed: 40221640

Keywords

  • AlphaMissense, Benchmarking, Intrinsically disordered regions, Methionine start site, Variant effect predictors