Statistical control of peptide and protein error rates in large-scale targeted data-independent acquisition analyses.

Statistical control of peptide and protein error rates in large-scale targeted data-independent acquisition analyses. Rosenberger, G., Bludau, I., Schmitt, U., Heusel, M., Hunter, C. L., Liu, Y., MacCoss, M. J., MacLean, B. X., Nesvizhskii, A. I., Pedrioli, P. G. A., Reiter, L., Rost, H. L., Tate, S., Ting, Y. S., Collins, B. C., & Aebersold, R. Nature Methods, 14(9):921–927, September, 2017.

Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is the main method for high-throughput identification and quantification of peptides and inferred proteins. Within this field, data-independent acquisition (DIA) combined with peptide-centric scoring, as exemplified by the technique SWATH-MS, has emerged as a scalable method to achieve deep and consistent proteome coverage across large-scale data sets. We demonstrate that statistical concepts developed for discovery proteomics based on spectrum-centric scoring can be adapted to large-scale DIA experiments that have been analyzed with peptide-centric scoring strategies, and we provide guidance on their application. We show that optimal tradeoffs between sensitivity and specificity require careful considerations of the relationship between proteins in the samples and proteins represented in the spectral library. We propose the application of a global analyte constraint to prevent the accumulation of false positives across large-scale data sets. Furthermore, to increase the quality and reproducibility of published proteomic results, well-established confidence criteria should be reported for the detected peptide queries, peptides and inferred proteins.

@article{rosenberger_statistical_2017,
	title = {Statistical control of peptide and protein error rates in large-scale targeted data-independent acquisition analyses.},
	volume = {14},
	issn = {1548-7105 1548-7091},
	doi = {10.1038/nmeth.4398},
	abstract = {Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is the main method for high-throughput identification and quantification of peptides and inferred proteins. Within this field, data-independent acquisition (DIA) combined with peptide-centric scoring, as exemplified by the technique SWATH-MS, has emerged as a scalable method to achieve deep and consistent proteome coverage across large-scale data sets. We demonstrate that statistical concepts developed for discovery proteomics based on spectrum-centric scoring can be adapted to large-scale DIA experiments that have been analyzed with peptide-centric scoring strategies, and we provide guidance on their application. We show that optimal tradeoffs between sensitivity and specificity require careful considerations of the relationship between proteins in the samples and proteins represented in the spectral library. We propose the application of a global analyte constraint to prevent the accumulation of false positives across large-scale data sets. Furthermore, to increase the quality and reproducibility of published proteomic results, well-established confidence criteria should be reported for the detected peptide queries, peptides and inferred proteins.},
	language = {eng},
	number = {9},
	journal = {Nature Methods},
	author = {Rosenberger, George and Bludau, Isabell and Schmitt, Uwe and Heusel, Moritz and Hunter, Christie L. and Liu, Yansheng and MacCoss, Michael J. and MacLean, Brendan X. and Nesvizhskii, Alexey I. and Pedrioli, Patrick G. A. and Reiter, Lukas and Rost, Hannes L. and Tate, Stephen and Ting, Ying S. and Collins, Ben C. and Aebersold, Ruedi},
	month = sep,
	year = {2017},
	pmid = {28825704},
	pmcid = {PMC5581544},
	keywords = {*Data Interpretation, Statistical, Computer Simulation, High-Throughput Screening Assays/*methods, Mass Spectrometry/*methods, Models, Statistical, Peptide Mapping/*methods, Proteins/analysis/*chemistry, Reproducibility of Results, Sensitivity and Specificity, Sequence Analysis, Protein/*methods},
	pages = {921--927}
}
