Predicting the presence of uncommon elements in unknown biomolecules from isotope patterns. Meusel, M., Hufsky, F., Panter, F., Krug, D., Müller, R., & Böcker, S. Anal Chem, 88(15):7556-7566, 2016.
Motivation: The determination of the molecular formula is one of the earliest and most important steps when investigating the chemical nature of an unknown compound. Common approaches use the isotopic pattern of a compound measured using mass spectrometry. Computational methods to determine the molecular formula from this isotopic pattern require a fixed set of elements. Considering all possible elements severely increases running times and more importantly the chance for false positive identifications as the number of candidate formulas for a given target mass rises significantly if the constituting elements are not pre-filtered. This negative effect grows stronger for compounds of higher molecular mass as the effect of a single atom on the overall isotopic pattern grows smaller. On the other hand, hand-selected restrictions on this set of elements may prevent the identification of the correct molecular formula. Thus, it is a crucial step to determine the set of elements most likely comprising the compound prior to the assignment of an elemental formula to an exact mass.
Results: In this paper, we present a method to determine the presence of certain elements (sulfur, chlorine, bromine, boron and selenium) in the compound from its (high mass accuracy) isotopic pattern. We limit ourselves to biomolecules, in the sense of products from nature or synthetic products with potential bioactivity. The classifiers developed here predict the presence of an element with a very high sensitivity and high specificity. We evaluate classifiers on three real-world datasets with 663 isotope patterns in total: 184 isotope patterns containing sulfur, 187 containing chlorine, 14 containing bromine, one containing boron, one containing selenium. In no case do we make a false negative prediction; for chlorine, bromine, boron, and selenium, we make ten false positive predictions in total. We also demonstrate the impact of our method on the identification of molecular formulas, in particular the number of considered candidates and running time.
@Article{meusel16predicting,
  author    = {Marvin Meusel and Franziska Hufsky and Fabian Panter and Daniel Krug and Rolf M{\"u}ller and Sebastian B{\"o}cker},
  title     = {Predicting the presence of uncommon elements in unknown biomolecules from isotope patterns},
  journal   = {Anal Chem},
  year      = {2016},
  volume    = {88},
  number    = {15},
  pages     = {7556--7566},
  abstract  = {Motivation: The determination of the molecular formula is one of the earliest and most important steps when investigating the chemical nature of an unknown compound. Common approaches use the isotopic pattern of a compound measured using mass spectrometry. Computational methods to determine the molecular formula from this isotopic pattern require a fixed set of elements. Considering all possible elements severely increases running times and more importantly the chance for false positive identifications as the number of candidate formulas for a given target mass rises significantly if the constituting elements are not pre-filtered. This negative effect grows stronger for compounds of higher molecular mass as the effect of a single atom on the overall isotopic pattern grows smaller. On the other hand, hand-selected restrictions on this set of elements may prevent the identification of the correct molecular formula. Thus, it is a crucial step to determine the set of elements most likely comprising the compound prior to the assignment of an elemental formula to an exact mass.
Results: In this paper, we present a method to determine the presence of certain elements (sulfur, chlorine, bromine, boron and selenium) in the compound from its (high mass accuracy) isotopic pattern. We limit ourselves to biomolecules, in the sense of products from nature or synthetic products with potential bioactivity. The classifiers developed here predict the presence of an element with a very high sensitivity and high specificity. We evaluate classifiers on three real-world datasets with 663 isotope patterns in total: 184 isotope patterns containing sulfur, 187 containing chlorine, 14 containing bromine, one containing boron, one containing selenium. In no case do we make a false negative prediction; for chlorine, bromine, boron, and selenium, we make ten false positive predictions in total. We also demonstrate the impact of our method on the identification of molecular formulas, in particular the number of considered candidates and running time.},
  doi       = {10.1021/acs.analchem.6b01015},
  keywords  = {jena; MS; MS1; isotope pattern},
  owner     = {Sebastian},
  pmid      = {27398867},
  timestamp = {2016.07.11},
}