Identifying Cell Subpopulations and Their Genetic Drivers from Single-Cell RNA-Seq Data Using a Biclustering Approach

Identifying Cell Subpopulations and Their Genetic Drivers from Single-Cell RNA-Seq Data Using a Biclustering Approach. Shi, F. & Huang, H. Journal of Computational Biology, 24(7):663–674, July, 2017.

Single-cell RNA-Seq (scRNA-Seq) has attracted much attention recently because it allows unprecedented resolution into cellular activity; the technology, therefore, has been widely applied in studying cell heterogeneity such as the heterogeneity among embryonic cells at varied developmental stages or cells of different cancer types or subtypes. A pertinent question in such analyses is to identify cell subpopulations as well as their associated genetic drivers. Consequently, a multitude of approaches have been developed for clustering or biclustering analysis of scRNA-Seq data. In this article, we present a fast and simple iterative biclustering approach called “BiSNN-Walk” based on the existing SNN-Cliq algorithm. One of BiSNN-Walk's differentiating features is that it returns a ranked list of clusters, which may serve as an indicator of a cluster's reliability. Another important feature is that BiSNN-Walk ranks genes in a gene cluster according to their level of affiliation to the associated cell cluster, making the result more biologically interpretable. We also introduce an entropy-based measure for choosing a highly clusterable similarity matrix as our starting point among a wide selection to facilitate the efficient operation of our algorithm. We applied BiSNN-Walk to three large scRNA-Seq studies, where we demonstrated that BiSNN-Walk was able to retain and sometimes improve the cell clustering ability of SNN-Cliq. We were able to obtain biologically sensible gene clusters in terms of GO term enrichment. In addition, we saw that there was significant overlap in top characteristic genes for clusters corresponding to similar cell states, further demonstrating the fidelity of our gene clusters.
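The abstract does not spell out how SNN-Cliq or BiSNN-Walk builds its similarity matrix, so the following is only a generic sketch of the shared-nearest-neighbor (SNN) idea that the name refers to, not the authors' implementation. It assumes cells are rows and genes are columns, uses Euclidean distance, and scores a pair of cells by how many of their k nearest neighbors they have in common. The function name `snn_similarity`, the parameter `k`, and the toy matrix `cells` are hypothetical and chosen for illustration.

```python
import numpy as np

def snn_similarity(X, k=3):
    """Count shared k-nearest neighbours for every pair of rows (cells) in X.

    Illustrative only: a plain shared-neighbour count, not the exact
    similarity used by SNN-Cliq or BiSNN-Walk.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances between cells.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbour
    # Indices of the k nearest neighbours of each cell.
    knn = np.argsort(dist, axis=1)[:, :k]
    # member[i, j] = 1 if cell j is among the k nearest neighbours of cell i.
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], knn] = 1
    # Shared-neighbour counts are inner products of membership rows.
    snn = member @ member.T
    np.fill_diagonal(snn, 0)
    return snn

# Toy data (assumed): two well-separated groups of four "cells", two "genes".
cells = np.array([
    [0, 0], [0, 1], [1, 0], [1, 1],
    [10, 10], [10, 11], [11, 10], [11, 11],
])
snn = snn_similarity(cells, k=2)
```

In this toy example, cells from different groups share no neighbors, so the cross-group block of `snn` is all zeros; a clustering step would then operate on such a matrix.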

@article{shi_identifying_2017,
	title = {Identifying {Cell} {Subpopulations} and {Their} {Genetic} {Drivers} from {Single}-{Cell} {RNA}-{Seq} {Data} {Using} a {Biclustering} {Approach}},
	volume = {24},
	url = {https://www.liebertpub.com/doi/full/10.1089/cmb.2017.0049},
	doi = {10.1089/cmb.2017.0049},
	abstract = {Single-cell RNA-Seq (scRNA-Seq) has attracted much attention recently because it allows unprecedented resolution into cellular activity; the technology, therefore, has been widely applied in studying cell heterogeneity such as the heterogeneity among embryonic cells at varied developmental stages or cells of different cancer types or subtypes. A pertinent question in such analyses is to identify cell subpopulations as well as their associated genetic drivers. Consequently, a multitude of approaches have been developed for clustering or biclustering analysis of scRNA-Seq data. In this article, we present a fast and simple iterative biclustering approach called “BiSNN-Walk” based on the existing SNN-Cliq algorithm. One of BiSNN-Walk's differentiating features is that it returns a ranked list of clusters, which may serve as an indicator of a cluster's reliability. Another important feature is that BiSNN-Walk ranks genes in a gene cluster according to their level of affiliation to the associated cell cluster, making the result more biologically interpretable. We also introduce an entropy-based measure for choosing a highly clusterable similarity matrix as our starting point among a wide selection to facilitate the efficient operation of our algorithm. We applied BiSNN-Walk to three large scRNA-Seq studies, where we demonstrated that BiSNN-Walk was able to retain and sometimes improve the cell clustering ability of SNN-Cliq. We were able to obtain biologically sensible gene clusters in terms of GO term enrichment. In addition, we saw that there was significant overlap in top characteristic genes for clusters corresponding to similar cell states, further demonstrating the fidelity of our gene clusters.},
	number = {7},
	urldate = {2018-07-26},
	journal = {Journal of Computational Biology},
	author = {Shi, Funan and Huang, Haiyan},
	month = jul,
	year = {2017},
	pages = {663--674}
}
