Comparing genome versus proteome-based identification of clinical bacterial isolates. Galata, V., Backes, C., Laczny, C. C., Hemmrich-Stanisak, G., Li, H., Smoot, L., Posch, A. E., Schmolke, S., Bischoff, M., von Müller, L., Plum, A., Franke, A., & Keller, A. Briefings in bioinformatics, 19:495–505, May, 2018.
doi  abstract   bibtex   
Whole-genome sequencing (WGS) is gaining importance in the analysis of bacterial cultures derived from patients with infectious diseases. Existing computational tools for WGS-based identification have, however, been evaluated on previously defined data relying thereby unwarily on the available taxonomic information.Here, we newly sequenced 846 clinical gram-negative bacterial isolates representing multiple distinct genera and compared the performance of five tools (CLARK, Kaiju, Kraken, DIAMOND/MEGAN and TUIT). To establish a faithful 'gold standard', the expert-driven taxonomy was compared with identifications based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) analysis. Additionally, the tools were also evaluated using a data set of 200 Staphylococcus aureus isolates.CLARK and Kraken (with k =31) performed best with 626 (100%) and 193 (99.5%) correct species classifications for the gram-negative and S. aureus isolates, respectively. Moreover, CLARK and Kraken demonstrated highest mean F-measure values (85.5/87.9% and 94.4/94.7% for the two data sets, respectively) in comparison with DIAMOND/MEGAN (71 and 85.3%), Kaiju (41.8 and 18.9%) and TUIT (34.5 and 86.5%). Finally, CLARK, Kaiju and Kraken outperformed the other tools by a factor of 30 to 170 fold in terms of runtime.We conclude that the application of nucleotide-based tools using k-mers-e.g. CLARK or Kraken-allows for accurate and fast taxonomic characterization of bacterial isolates from WGS data. Hence, our results suggest WGS-based genotyping to be a promising alternative to the MS-based biotyping in clinical settings. Moreover, we suggest that complementary information should be used for the evaluation of taxonomic classification tools, as public databases may suffer from suboptimal annotations.
@Article{Galata2018,
  author       = {Galata, Valentina and Backes, Christina and Laczny, Cédric Christian and Hemmrich-Stanisak, Georg and Li, Howard and Smoot, Laura and Posch, Andreas Emanuel and Schmolke, Susanne and Bischoff, Markus and von Müller, Lutz and Plum, Achim and Franke, Andre and Keller, Andreas},
  title        = {Comparing genome versus proteome-based identification of clinical bacterial isolates.},
  journal      = {Briefings in bioinformatics},
  year         = {2018},
  volume       = {19},
  pages        = {495--505},
  month        = may,
  issn         = {1477-4054},
  abstract     = {Whole-genome sequencing (WGS) is gaining importance in the analysis of bacterial cultures derived from patients with infectious diseases. Existing computational tools for WGS-based identification have, however, been evaluated on previously defined data relying thereby unwarily on the available taxonomic information.Here, we newly sequenced 846 clinical gram-negative bacterial isolates representing multiple distinct genera and compared the performance of five tools (CLARK, Kaiju, Kraken, DIAMOND/MEGAN and TUIT). To establish a faithful 'gold standard', the expert-driven taxonomy was compared with identifications based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) analysis. Additionally, the tools were also evaluated using a data set of 200 Staphylococcus aureus isolates.CLARK and Kraken (with k =31) performed best with 626 (100%) and 193 (99.5%) correct species classifications for the gram-negative and S. aureus isolates, respectively. Moreover, CLARK and Kraken demonstrated highest mean F-measure values (85.5/87.9% and 94.4/94.7% for the two data sets, respectively) in comparison with DIAMOND/MEGAN (71 and 85.3%), Kaiju (41.8 and 18.9%) and TUIT (34.5 and 86.5%). Finally, CLARK, Kaiju and Kraken outperformed the other tools by a factor of 30 to 170 fold in terms of runtime.We conclude that the application of nucleotide-based tools using k-mers-e.g. CLARK or Kraken-allows for accurate and fast taxonomic characterization of bacterial isolates from WGS data. Hence, our results suggest WGS-based genotyping to be a promising alternative to the MS-based biotyping in clinical settings. Moreover, we suggest that complementary information should be used for the evaluation of taxonomic classification tools, as public databases may suffer from suboptimal annotations.},
  country      = {England},
  doi          = {10.1093/bib/bbw122},
  issn-linking = {1467-5463},
  issue        = {3},
  nlm-id       = {100912837},
  owner        = {NLM},
  pii          = {bbw122},
  pmid         = {28013236},
  pubmodel     = {Print},
  pubstatus    = {ppublish},
  revised      = {2018-05-17},
}

Downloads: 0