Shared Representation with Multi-omics Distributed Latent Spaces for Cancer Subtype Classification. Ryu, K. H., Park, K. H., Namsrai, O. E., Pham, V. H., & Batbaatar, E. Smart Innovation, Systems and Technologies, 212:418–425, 2021. ISBN: 9789813367562
The integration of multi-omics data is suitable for early detection and is also significant to a wide variety of cancer detection and treatment fields. Accurate prediction of survival in cancer patients remains a challenge due to the ever-increasing heterogeneity and complexity of cancer. The latest developments in high-throughput sequencing technologies have rapidly produced multi-omics data of the same cancer sample. Recently, many studies have taken advantage of deep learning to extract biologically relevant latent features that capture the complexity of cancer. In this paper, we propose a Shared representation learning method by employing the Autoencoder structure for Multi-Omics (SAMO) data, which is inspired by the recent success of variational autoencoders in extracting biologically relevant features. Variational autoencoders are a deep neural network approach capable of generating meaningful latent spaces. We address the problem of losing information when integrating multiple data sources. We formulate a distributed latent space jointly learned by separate variational autoencoders on each data source in an unsupervised manner. First, we pre-trained the variational autoencoders separately to produce shared latent representations. Second, we fine-tuned only the encoders and latent representations with a supervised classifier for the prediction task. Here, we used a lung cancer multi-omics dataset combining the Illumina Human Methylation 27K and gene expression RNA-seq datasets from The Cancer Genome Atlas (TCGA) data portal.
@article{Pham2021,
	title = {Shared {Representation} with {Multi}-omics {Distributed} {Latent} {Spaces} for {Cancer} {Subtype} {Classification}},
	volume = {212},
	issn = {2190-3026},
	doi = {10.1007/978-981-33-6757-9_52},
	abstract = {The integration of multi-omics data is suitable for early detection and is also significant to a wide variety of cancer detection and treatment fields. Accurate prediction of survival in cancer patients remains a challenge due to the ever-increasing heterogeneity and complexity of cancer. The latest developments in high-throughput sequencing technologies have rapidly produced multi-omics data of the same cancer sample. Recently, many studies have taken advantage of deep learning to extract biologically relevant latent features that capture the complexity of cancer. In this paper, we propose a Shared representation learning method by employing the Autoencoder structure for Multi-Omics (SAMO) data, which is inspired by the recent success of variational autoencoders in extracting biologically relevant features. Variational autoencoders are a deep neural network approach capable of generating meaningful latent spaces. We address the problem of losing information when integrating multiple data sources. We formulate a distributed latent space jointly learned by separate variational autoencoders on each data source in an unsupervised manner. First, we pre-trained the variational autoencoders separately to produce shared latent representations. Second, we fine-tuned only the encoders and latent representations with a supervised classifier for the prediction task. Here, we used a lung cancer multi-omics dataset combining the Illumina Human Methylation 27K and gene expression RNA-seq datasets from The Cancer Genome Atlas (TCGA) data portal.},
	journal = {Smart Innovation, Systems and Technologies},
	author = {Ryu, Keun Ho and Park, Kwang Ho and Namsrai, Oyun Erdene and Pham, Van Huy and Batbaatar, Erdenebileg},
	year = {2021},
	note = {ISBN: 9789813367562},
	keywords = {Cancer subtype classification, Computational biology, Deep learning, Multi-omics data, Supervised learning, Variational autoencoder},
	pages = {418--425},
}
