Comparison of Discrimination Methods for Peptide Classification in Tandem Mass Spectrometry

Comparison of Discrimination Methods for Peptide Classification in Tandem Mass Spectrometry. Bonner, A. & Liu, H. In Proc. of IEEE Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2004), pages 160–167, 2004.

Proteomics, the direct analysis of the expressed protein components of a cell, is critical to our understanding of cellular biological processes. Key insights into the action and effects of a disease can be obtained by comparing protein expression in normal versus diseased tissue. Tandem mass spectrometry (MS/MS) of peptides is a central technology for proteomics, enabling the identification of thousands of peptides from a complex mixture. With the increasing acquisition rate of tandem mass spectrometers, there is growing potential to solve important biological problems by applying data-mining and machine-learning techniques to MS/MS data. These problems include (i) estimating the levels of the thousands of proteins in a tissue sample, (ii) predicting the intensity of the peaks in a mass spectrum, and (iii) explaining why different peptides from the same protein have different peak intensities. In other work, we have focussed on the first two problems. In this paper, we focus on the last problem. In particular, we try to explain why some peptides produce peaks of high intensity while others produce peaks of low intensity, and we treat this as a classification problem. That is, we experimentally evaluate and compare a variety of discrimination methods for classifying peptides into those that produce high-intensity peaks and those that produce low-intensity peaks. The methods considered include K-nearest neighbours (KNN), logistic regression, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), Naive Bayes, and hidden Markov models (HMMs). Experiments using these methods were conducted on three real-world datasets derived from mouse tissue samples. The methods were then evaluated using ROC curves and cross-validation.
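
The evaluation protocol named in the abstract (score a classifier by its ROC curve under cross-validation) can be illustrated with a minimal sketch. This is not the authors' code or data: the two-dimensional toy features, the helper names (`toy_data`, `knn_score`, `cross_validated_auc`), the stratified fold split, and the choice of k are all assumptions made for illustration, and KNN stands in for any of the listed methods.

```python
import random


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def knn_score(train, x, k):
    """Score x as the fraction of its k nearest training points labelled 1."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], x))
    nearest = sorted(train, key=sq_dist)[:k]
    return sum(y for _, y in nearest) / k


def cross_validated_auc(data, k=5, folds=5, seed=0):
    """Mean held-out AUC of KNN over stratified folds (assumed protocol)."""
    rng = random.Random(seed)
    pos = [r for r in data if r[1] == 1]
    neg = [r for r in data if r[1] == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    aucs = []
    for f in range(folds):
        # Every fold receives a share of each class.
        test = pos[f::folds] + neg[f::folds]
        train = [r for i, r in enumerate(pos) if i % folds != f]
        train += [r for i, r in enumerate(neg) if i % folds != f]
        scores = [knn_score(train, x, k) for x, _ in test]
        aucs.append(roc_auc([y for _, y in test], scores))
    return sum(aucs) / folds


def toy_data(n_per_class=60, seed=1):
    """Hypothetical stand-in for peptide features: label 1 marks a
    'high-intensity' peptide, label 0 a 'low-intensity' one."""
    rng = random.Random(seed)
    low = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(n_per_class)]
    high = [((rng.gauss(2, 1), rng.gauss(2, 1)), 1) for _ in range(n_per_class)]
    return low + high


if __name__ == "__main__":
    print(f"mean cross-validated AUC: {cross_validated_auc(toy_data()):.3f}")
```

An AUC near 0.5 would mean the classifier ranks high- and low-intensity peptides no better than chance, while values approaching 1.0 indicate good separation. Averaging over held-out folds keeps the estimate from being inflated by scoring on the training data.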

@InProceedings{bonner04comparison,
  author    = {Bonner, Anthony and Liu, Han},
  title     = {Comparison of Discrimination Methods for Peptide Classification in Tandem Mass Spectrometry},
  booktitle = {Proc. of IEEE Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2004)},
  year      = {2004},
  pages     = {160--167},
  abstract  = {Proteomics, the direct analysis of the expressed protein components of a cell, is critical to our understanding of cellular biological processes. Key insights into the action and effects of a disease can be obtained by comparing protein expression in normal versus diseased tissue. Tandem mass spectrometry (MS/MS) of peptides is a central technology for proteomics, enabling the identification of thousands of peptides from a complex mixture. With the increasing acquisition rate of tandem mass spectrometers, there is growing potential to solve important biological problems by applying data-mining and machine-learning techniques to MS/MS data. These problems include (i) estimating the levels of the thousands of proteins in a tissue sample, (ii) predicting the intensity of the peaks in a mass spectrum, and (iii) explaining why different peptides from the same protein have different peak intensities. In other work, we have focussed on the first two problems. In this paper, we focus on the last problem. In particular, we try to explain why some peptides produce peaks of high intensity while others produce peaks of low intensity, and we treat this as a classification problem. That is, we experimentally evaluate and compare a variety of discrimination methods for classifying peptides into those that produce high-intensity peaks and those that produce low-intensity peaks. The methods considered include K-nearest neighbours (KNN), logistic regression, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), Naive Bayes, and hidden Markov models (HMMs). Experiments using these methods were conducted on three real-world datasets derived from mouse tissue samples. The methods were then evaluated using ROC curves and cross-validation.},
  doi       = {10.1109/CIBCB.2004.1393949},
  file      = {BonnerLiu_ComparisonDiscriminationMethods_CIBCB_2004.pdf:2004/BonnerLiu_ComparisonDiscriminationMethods_CIBCB_2004.pdf:PDF},
  keywords  = {tandem ms},
}
