Competitive Fragmentation Modeling of ESI-MS/MS spectra for metabolite identification. Allen, F., Greiner, R., & Wishart, D. Technical Report 2013. arXiv:1312.0264
Electrospray tandem mass spectrometry (ESI-MS/MS) is commonly used in high-throughput metabolomics. However, one of the key obstacles to the effective use of this technology is the difficulty in processing measured spectra to accurately and efficiently identify metabolites. Traditional methods for automated metabolite identification compare a target MS or MS/MS spectrum to spectra in a reference database, ranking likely candidates as those with the closest match. However, the limited coverage of available databases has led to interest in computational methods for predicting reference MS/MS spectra from chemical structures. This work proposes a probabilistic generative model for the MS/MS fragmentation process, which we call Competitive Fragmentation Modeling (CFM), and a machine learning approach for learning parameters for this model from data. We show that CFM can be used in both an MS/MS spectrum prediction task (i.e., predicting the mass spectrum from a chemical structure) and a metabolite identification task (ranking possible structures for a target MS/MS spectrum). In the MS/MS spectrum prediction task, this method shows significantly improved Jaccard scores when compared to a full enumeration of all peaks corresponding to substructures of the molecule. In the metabolite identification task, CFM obtains substantially better rankings for the correct candidate than the existing methods MetFrag and FingerID on a collection of 1985 tripeptides and 1491 non-peptide metabolites, querying PubChem for candidate structures of the same mass. Windows executables and cross-platform source code are freely available at http://sourceforge.net/projects/cfm-id. Supplementary files containing test molecule lists and trained models are also available on that site.
@TechReport{allen13competitive,
  author    = {Felicity Allen and Russ Greiner and David Wishart},
  title     = {{Competitive Fragmentation Modeling} of {ESI-MS/MS} spectra for metabolite identification},
  year      = {2013},
  type      = {Preprint},
  note      = {arXiv:1312.0264},
  abstract  = {Electrospray tandem mass spectrometry (ESI-MS/MS) is commonly used in high-throughput metabolomics. However, one of the key obstacles to the effective use of this technology is the difficulty in processing measured spectra to accurately and efficiently identify metabolites. Traditional methods for automated metabolite identification compare a target MS or MS/MS spectrum to spectra in a reference database, ranking likely candidates as those with the closest match. However, the limited coverage of available databases has led to interest in computational methods for predicting reference MS/MS spectra from chemical structures. This work proposes a probabilistic generative model for the MS/MS fragmentation process, which we call Competitive Fragmentation Modeling (CFM), and a machine learning approach for learning parameters for this model from data. We show that CFM can be used in both an MS/MS spectrum prediction task (i.e., predicting the mass spectrum from a chemical structure) and a metabolite identification task (ranking possible structures for a target MS/MS spectrum). In the MS/MS spectrum prediction task, this method shows significantly improved Jaccard scores when compared to a full enumeration of all peaks corresponding to substructures of the molecule. In the metabolite identification task, CFM obtains substantially better rankings for the correct candidate than the existing methods MetFrag and FingerID on a collection of 1985 tripeptides and 1491 non-peptide metabolites, querying PubChem for candidate structures of the same mass. Windows executables and cross-platform source code are freely available at http://sourceforge.net/projects/cfm-id. Supplementary files containing test molecule lists and trained models are also available on that site.},
  keywords  = {MS; metabolite identification; CFM; CFM-ID},
  owner     = {fhufsky},
  timestamp = {2014.01.13},
}
