Comparing genome versus proteome-based identification of clinical bacterial isolates. Galata, V., Backes, C., Laczny, C. C., Hemmrich-Stanisak, G., Li, H., Smoot, L., Posch, A. E., Schmolke, S., Bischoff, M., von Müller, L., Plum, A., Franke, A., & Keller, A. Briefings in Bioinformatics, 19:495–505, May 2018.
Whole-genome sequencing (WGS) is gaining importance in the analysis of bacterial cultures derived from patients with infectious diseases. Existing computational tools for WGS-based identification have, however, been evaluated on previously defined data, relying thereby unwarily on the available taxonomic information. Here, we newly sequenced 846 clinical gram-negative bacterial isolates representing multiple distinct genera and compared the performance of five tools (CLARK, Kaiju, Kraken, DIAMOND/MEGAN and TUIT). To establish a faithful 'gold standard', the expert-driven taxonomy was compared with identifications based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) analysis. Additionally, the tools were also evaluated using a data set of 200 Staphylococcus aureus isolates. CLARK and Kraken (with k = 31) performed best with 626 (100%) and 193 (99.5%) correct species classifications for the gram-negative and S. aureus isolates, respectively. Moreover, CLARK and Kraken demonstrated highest mean F-measure values (85.5/87.9% and 94.4/94.7% for the two data sets, respectively) in comparison with DIAMOND/MEGAN (71 and 85.3%), Kaiju (41.8 and 18.9%) and TUIT (34.5 and 86.5%). Finally, CLARK, Kaiju and Kraken outperformed the other tools by a factor of 30 to 170 fold in terms of runtime. We conclude that the application of nucleotide-based tools using k-mers (e.g. CLARK or Kraken) allows for accurate and fast taxonomic characterization of bacterial isolates from WGS data. Hence, our results suggest WGS-based genotyping to be a promising alternative to the MS-based biotyping in clinical settings. Moreover, we suggest that complementary information should be used for the evaluation of taxonomic classification tools, as public databases may suffer from suboptimal annotations.
@Article{Galata2018,
author = {Galata, Valentina and Backes, Christina and Laczny, Cédric Christian and Hemmrich-Stanisak, Georg and Li, Howard and Smoot, Laura and Posch, Andreas Emanuel and Schmolke, Susanne and Bischoff, Markus and von Müller, Lutz and Plum, Achim and Franke, Andre and Keller, Andreas},
title = {Comparing genome versus proteome-based identification of clinical bacterial isolates.},
journal = {Briefings in Bioinformatics},
year = {2018},
volume = {19},
pages = {495--505},
month = may,
issn = {1477-4054},
abstract = {Whole-genome sequencing (WGS) is gaining importance in the analysis of bacterial cultures derived from patients with infectious diseases. Existing computational tools for WGS-based identification have, however, been evaluated on previously defined data, relying thereby unwarily on the available taxonomic information. Here, we newly sequenced 846 clinical gram-negative bacterial isolates representing multiple distinct genera and compared the performance of five tools (CLARK, Kaiju, Kraken, DIAMOND/MEGAN and TUIT). To establish a faithful 'gold standard', the expert-driven taxonomy was compared with identifications based on matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) analysis. Additionally, the tools were also evaluated using a data set of 200 Staphylococcus aureus isolates. CLARK and Kraken (with k = 31) performed best with 626 (100%) and 193 (99.5%) correct species classifications for the gram-negative and S. aureus isolates, respectively. Moreover, CLARK and Kraken demonstrated highest mean F-measure values (85.5/87.9% and 94.4/94.7% for the two data sets, respectively) in comparison with DIAMOND/MEGAN (71 and 85.3%), Kaiju (41.8 and 18.9%) and TUIT (34.5 and 86.5%). Finally, CLARK, Kaiju and Kraken outperformed the other tools by a factor of 30 to 170 fold in terms of runtime. We conclude that the application of nucleotide-based tools using k-mers (e.g. CLARK or Kraken) allows for accurate and fast taxonomic characterization of bacterial isolates from WGS data. Hence, our results suggest WGS-based genotyping to be a promising alternative to the MS-based biotyping in clinical settings. Moreover, we suggest that complementary information should be used for the evaluation of taxonomic classification tools, as public databases may suffer from suboptimal annotations.},
country = {England},
doi = {10.1093/bib/bbw122},
issn-linking = {1467-5463},
issue = {3},
nlm-id = {100912837},
owner = {NLM},
pii = {bbw122},
pmid = {28013236},
pubmodel = {Print},
pubstatus = {ppublish},
revised = {2018-05-17},
}