Open Access
volume 40 issue Supplement_1 pages i410-i417

A learned score function improves the power of mass spectrometry database search

Publication type: Journal Article
Publication date: 2024-06-28
Scimago: Q1
WoS: Q1
SJR: 2.451
CiteScore: 9.6
Impact factor: 5.4
ISSN: 1367-4803, 1367-4811, 1460-2059
Abstract
Motivation

One of the core problems in the analysis of protein tandem mass spectrometry data is the peptide assignment problem: determining, for each observed spectrum, the peptide sequence that was responsible for generating the spectrum. Two primary classes of methods are used to solve this problem: database search and de novo peptide sequencing. State-of-the-art methods for de novo sequencing use machine learning methods, whereas most database search engines use hand-designed score functions to evaluate the quality of a match between an observed spectrum and a candidate peptide from the database. We hypothesized that machine learning models for de novo sequencing implicitly learn a score function that captures the relationship between peptides and spectra, and thus may be re-purposed as a score function for database search. Because this score function is trained from massive amounts of mass spectrometry data, it could potentially outperform existing, hand-designed database search tools.

Results

To test this hypothesis, we re-engineered Casanovo, which has been shown to provide state-of-the-art de novo sequencing capabilities, to assign scores to given peptide-spectrum pairs. We then evaluated the statistical power of this Casanovo score function, Casanovo-DB, to detect peptides on a benchmark of three mass spectrometry runs from three different species. In addition, we show that re-scoring with the Percolator post-processor benefits Casanovo-DB more than other score functions, further increasing the number of detected peptides.
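As an illustration of how a de novo sequencing model might be used to score a given peptide-spectrum pair, the sketch below averages the log-probabilities that an autoregressive model assigns to each residue of a candidate peptide, then picks the best-scoring candidate. This is a minimal sketch under stated assumptions, not the Casanovo-DB implementation: the abstract does not specify the scoring formula, and `score_psm`, `best_match`, and `toy_model` are hypothetical names introduced here. The toy model is a stand-in that ignores the spectrum entirely.

```python
import math
from typing import Callable, Dict, Sequence

# Hypothetical interface: given a spectrum and the residues decoded so far,
# return a probability distribution over the next residue. A trained de novo
# model would play this role; nothing here is Casanovo's actual API.
NextResidueModel = Callable[[Sequence[float], str], Dict[str, float]]


def score_psm(model: NextResidueModel, spectrum: Sequence[float], peptide: str) -> float:
    """Mean per-residue log-probability of `peptide` given `spectrum`.

    The candidate peptide is fed to the model one prefix at a time
    (teacher forcing) instead of letting the model decode freely.
    """
    if not peptide:
        raise ValueError("peptide must be non-empty")
    total = 0.0
    for i, residue in enumerate(peptide):
        probs = model(spectrum, peptide[:i])
        # Floor the probability so an unseen residue does not produce log(0).
        total += math.log(max(probs.get(residue, 0.0), 1e-12))
    return total / len(peptide)


def best_match(model: NextResidueModel, spectrum: Sequence[float],
               candidates: Sequence[str]) -> str:
    """Return the candidate peptide with the highest score."""
    return max(candidates, key=lambda p: score_psm(model, spectrum, p))


def toy_model(spectrum: Sequence[float], prefix: str) -> Dict[str, float]:
    """Toy stand-in: favors 'A' at even positions and 'K' at odd positions."""
    favored = "A" if len(prefix) % 2 == 0 else "K"
    return {aa: (0.8 if aa == favored else 0.1) for aa in "AKG"}


if __name__ == "__main__":
    spectrum = [114.09, 185.13]  # placeholder peak list, unused by the toy model
    print(best_match(toy_model, spectrum, ["AKA", "GGK", "KAK"]))
```

In a database search setting, the candidate list for each spectrum would typically be limited to database peptides whose mass is close to the observed precursor mass, so the model only scores a small set of candidates per spectrum.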


Cite this
GOST
Ananth V. et al. A learned score function improves the power of mass spectrometry database search // Bioinformatics. 2024. Vol. 40. No. Supplement_1. p. i410-i417.
GOST (all authors)
Ananth V., Sanders J., Yilmaz M., Wen B., Oh S., Noble W. J. A learned score function improves the power of mass spectrometry database search // Bioinformatics. 2024. Vol. 40. No. Supplement_1. p. i410-i417.
RIS
TY - JOUR
DO - 10.1093/bioinformatics/btae218
UR - https://academic.oup.com/bioinformatics/article/40/Supplement_1/i410/7700854
TI - A learned score function improves the power of mass spectrometry database search
T2 - Bioinformatics
AU - Ananth, Varun
AU - Sanders, Justin
AU - Yilmaz, Melih
AU - Wen, Bo
AU - Oh, Sewoong
AU - Noble, William J.
PY - 2024
DA - 2024/06/28
PB - Oxford University Press
SP - i410-i417
IS - Supplement_1
VL - 40
PMID - 38940129
SN - 1367-4803
SN - 1367-4811
SN - 1460-2059
ER -
BibTeX
@article{2024_Ananth,
author = {Varun Ananth and Justin Sanders and Melih Yilmaz and Bo Wen and Sewoong Oh and William J. Noble},
title = {A learned score function improves the power of mass spectrometry database search},
journal = {Bioinformatics},
year = {2024},
volume = {40},
publisher = {Oxford University Press},
month = {jun},
url = {https://academic.oup.com/bioinformatics/article/40/Supplement_1/i410/7700854},
number = {Supplement_1},
pages = {i410--i417},
doi = {10.1093/bioinformatics/btae218}
}
MLA
Ananth, Varun, et al. “A learned score function improves the power of mass spectrometry database search.” Bioinformatics, vol. 40, no. Supplement_1, Jun. 2024, pp. i410-i417. https://academic.oup.com/bioinformatics/article/40/Supplement_1/i410/7700854.