Identifying longevity associated genes by integrating gene expression and curated annotations. Townes, F. W., Carr, K., & Miller, J. W. PLOS Computational Biology, 16(11):e1008429, November, 2020. Publisher: Public Library of Science
Aging is a complex process with poorly understood genetic mechanisms. Recent studies have sought to classify genes as pro-longevity or anti-longevity using a variety of machine learning algorithms. However, it is not clear which types of features are best for optimizing classification performance and which algorithms are best suited to this task. Further, performance assessments based on held-out test data are lacking. We systematically compare five popular classification algorithms using gene ontology and gene expression datasets as features to predict the pro-longevity versus anti-longevity status of genes for two model organisms (C. elegans and S. cerevisiae) using the GenAge database as ground truth. We find that elastic net penalized logistic regression performs particularly well at this task. Using elastic net, we make novel predictions of pro- and anti-longevity genes that are not currently in the GenAge database.
@article{townes_identifying_2020,
	title = {Identifying longevity associated genes by integrating gene expression and curated annotations},
	volume = {16},
	issn = {1553-7358},
	url = {https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008429},
	doi = {10.1371/journal.pcbi.1008429},
	abstract = {Aging is a complex process with poorly understood genetic mechanisms. Recent studies have sought to classify genes as pro-longevity or anti-longevity using a variety of machine learning algorithms. However, it is not clear which types of features are best for optimizing classification performance and which algorithms are best suited to this task. Further, performance assessments based on held-out test data are lacking. We systematically compare five popular classification algorithms using gene ontology and gene expression datasets as features to predict the pro-longevity versus anti-longevity status of genes for two model organisms (C. elegans and S. cerevisiae) using the GenAge database as ground truth. We find that elastic net penalized logistic regression performs particularly well at this task. Using elastic net, we make novel predictions of pro- and anti-longevity genes that are not currently in the GenAge database.},
	language = {en},
	number = {11},
	urldate = {2021-09-28},
	journal = {PLOS Computational Biology},
	author = {Townes, F. William and Carr, Kareem and Miller, Jeffrey W.},
	month = nov,
	year = {2020},
	note = {Publisher: Public Library of Science},
	keywords = {DNA-binding proteins, Gene expression, Gene ontologies, Gene prediction, Machine learning algorithms, Saccharomyces cerevisiae, Support vector machines, Yeast},
	pages = {e1008429},
}
