Assigning spectrum-specific P-values to protein identifications by mass spectrometry

Research output: Contribution to journal > Research article > Contributed > peer-review

Contributors

  • Victor Spirin, Harvard University (Author)
  • Alexander Shpunt, Harvard University, Massachusetts Institute of Technology (MIT) (Author)
  • Jan Seebacher, Harvard University (Author)
  • Marc Gentzel, Max Planck Institute of Molecular Cell Biology and Genetics (Author)
  • Andrej Shevchenko, Max Planck Institute of Molecular Cell Biology and Genetics (Author)
  • Steven Gygi, Harvard University (Author)
  • Shamil Sunyaev, Harvard University (Author)

Abstract

Motivation: Although many methods and statistical approaches have been developed for protein identification by mass spectrometry, accurate assessment of the statistical significance of protein identifications remains an open problem. The main issues are as follows: (i) the statistical significance of inferring peptides from experimental mass spectra must be platform independent and spectrum specific and (ii) individual spectrum matches at the peptide level must be combined into a single statistical measure at the protein level.

Results: We present a method and software to assign statistical significance to protein identifications from search engines for mass spectrometric data. The approach is based on the asymptotic theory of order statistics. The parameters of the asymptotic distributions of identification scores are estimated for each spectrum individually. The method relies on new unbiased estimators for the parameters of the extreme value distribution. The estimated parameters are used to assign a spectrum-specific P-value to each peptide-spectrum match. The protein-level confidence measure combines the P-values of peptide-spectrum matches.

Conclusion: We extensively tested the method using triplicate mouse and yeast high-throughput proteomic experiments. The proposed statistical approach improves the sensitivity of protein identifications without compromising specificity. While the method was primarily designed to work with Mascot, it is platform independent and is applicable to any search engine that outputs a single score for a peptide-spectrum match. We demonstrate this by testing the method in conjunction with X!Tandem.
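The pipeline outlined in the abstract (fit an extreme value distribution per spectrum, convert a match score to a P-value, combine peptide-level P-values per protein) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the abstract does not specify the paper's unbiased estimators or its combination rule, so a plain method-of-moments Gumbel fit and Fisher's method are swapped in, the function names are hypothetical, and the candidate scores are made-up numbers.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Method-of-moments fit of a Gumbel (maximum) distribution.

    Assumption: `scores` are candidate-match scores for ONE spectrum,
    treated as a sample from the null score distribution. This is a
    stand-in for the paper's estimators, which the abstract does not give.
    """
    beta = stdev(scores) * math.sqrt(6) / math.pi   # scale
    mu = mean(scores) - EULER_GAMMA * beta          # location
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """Upper-tail probability P(X >= score) under Gumbel(mu, beta)."""
    z = (score - mu) / beta
    if z < -30:              # exp(-z) is huge here, so the tail is ~1
        return 1.0
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small P-values
    return -math.expm1(-math.exp(-z))


def fisher_combine(p_values, floor=1e-300):
    """Combine independent P-values with Fisher's method (an assumption;
    the paper's protein-level measure may differ).

    Uses the closed form of the chi-square survival function for an even
    number of degrees of freedom (2k), so no external library is needed.
    """
    half = -sum(math.log(max(p, floor)) for p in p_values)  # X/2, X = -2*sum(ln p)
    term, total = 1.0, 1.0
    for i in range(1, len(p_values)):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical usage: scores of lower-ranked candidates for one spectrum,
# then the P-value of that spectrum's top-scoring match.
null_scores = [12.1, 9.8, 14.3, 11.0, 10.5, 13.2, 8.9, 12.7]
mu, beta = fit_gumbel_moments(null_scores)
p_top = gumbel_p_value(35.0, mu, beta)

# Hypothetical protein supported by three peptide-spectrum matches.
protein_p = fisher_combine([p_top, 0.01, 0.2])
```

Because the fit is done separately for each spectrum, the resulting P-value depends only on a single score per candidate match, which is the property the abstract relies on for search-engine independence. Fisher's method assumes the combined P-values are independent, which may not hold for multiple spectra of the same peptide.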

Details

Original language: English
Article number: btr089
Pages (from-to): 1128-1134
Number of pages: 7
Journal: Bioinformatics
Volume: 27
Issue number: 8
Publication status: Published - Apr 2011
Peer-reviewed: Yes
Externally published: Yes

External IDs

PubMed 21349864
ORCID /0000-0002-4482-6010/work/142251024