Microbiome Datasets Are Compositional: And This Is Not Optional. Gloor, G. B., Macklaim, J. M., Pawlowsky-Glahn, V., & Egozcue, J. J. Frontiers in Microbiology, 8:2224, 2017.
Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.
@article{gloor_microbiome_2017,
title = {Microbiome {Datasets} {Are} {Compositional}: {And} {This} {Is} {Not} {Optional}},
volume = {8},
issn = {1664-302X},
shorttitle = {Microbiome {Datasets} {Are} {Compositional}},
doi = {10.3389/fmicb.2017.02224},
abstract = {Datasets collected by high-throughput sequencing (HTS) of 16S rRNA gene amplimers, metagenomes or metatranscriptomes are commonplace and being used to study human disease states, ecological differences between sites, and the built environment. There is increasing awareness that microbiome datasets generated by HTS are compositional because they have an arbitrary total imposed by the instrument. However, many investigators are either unaware of this or assume specific properties of the compositional data. The purpose of this review is to alert investigators to the dangers inherent in ignoring the compositional nature of the data, and point out that HTS datasets derived from microbiome studies can and should be treated as compositions at all stages of analysis. We briefly introduce compositional data, illustrate the pathologies that occur when compositional data are analyzed inappropriately, and finally give guidance and point to resources and examples for the analysis of microbiome datasets using compositional data analysis.},
language = {eng},
journal = {Frontiers in Microbiology},
author = {Gloor, Gregory B. and Macklaim, Jean M. and Pawlowsky-Glahn, Vera and Egozcue, Juan J.},
year = {2017},
pmid = {29187837},
pmcid = {PMC5695134},
keywords = {Bayesian estimation, compositional data, correlation, count normalization, high-throughput sequencing, microbiota, relative abundance},
pages = {2224},
}