Project description
The scoring functions to assess spectrum similarity play a crucial role in many computational mass spectrometry algorithms. These scoring functions can be used to compare an acquired MS/MS spectrum against two different types of target spectra: either against a theoretical MS/MS spectrum derived from a peptide from a sequence database, or against another, previously acquired acquired MS/MS spectrum. The former is typically encountered in database searching, while the latter is used in spectrum clustering, spectral library searching, or the detection of unique spectra between different data sets in comparative proteomics studies.
The most commonly used scoring functions in experimental versus theoretical spectrum matching could be divided into two groups:
- non-probabilistic (cross correlations which was used for SEQUEST)
- probabilistic (cumulative binomial probability derived scoring functions in Andromeda and MS Amanda)
Scoring functions for the comparison of two experimental spectra:
- Normalized dot product (most commonly used in spectrum library search algorithms such as SpectraST, BiblioSpec)
- Pearson’s and Spearman’s correlation coefficients
Avaliable scoring functions
This project contains the enlisted scoring functions in Project description. The scoring functions in order to compare an acquired MS/MS spectrum against:
- a theoretical spectrum via
- SEQUEST-like scoring function (non-probabilistic)
-
Andromeda-like scoring function (probabilistic)
- another acquired spectrum via
- Dot product
- Normalized dot product
- Normalized dot product with introducting weights from peak intensities
- Pearson’s r
- Spearman’s rho
- Mean Squared Error (MSE) (and also root MSE)
- Median Squared Error (MdSE) (and also root MdSE)
- Probabilistic scoring functon (including peak intensities)
Citation
Yılmaz et al: J Proteome Res., 2016, 15 (6), pp 1963–1970 (DOI:10.1021/acs.jproteome.6b00140)
If you use our differential pipeline stand-alone tool or GUI version, please include the reference above.
Download
Differential pipeline
The probabilistic scoring function was succesfully applied on the differential pipeline in order to compare two experimental data sets in a differential analysis.
A stand-alone program and the GUI version of this stand-alone program can be downloaded here.
A pairwise spectrum view GUI
A pairwise spectrum view GUI enables the manual inspection of how spectra actually look alike and can be downloaded here.
Book chapter
The scoring functions enlisted [Avaliable scoring functions] were used to evaluate their ability to assess spectrum similarity by evaluating them on one of the CPTAC data sets. This has been described in a book chapter.
The program to compare spectra with the avaliable scoring functions, against either theoretical or experimental spectra can be downloaded here.
The settings to perform spectrum comparison are in the bookChapter.properties file.
Usage
See the wiki for additional information on how to setup, run and configure spectrum comparison related projects.