MUSSEL: Enhanced Bayesian Polygenic Risk Prediction Leveraging Information across Multiple Ancestry Groups. Jin, J., Zhan, J., Zhang, J., Zhao, R., O'Connell, J., Jiang, Y., 23andMe Research Team, Buyske, S., Gignoux, C., Haiman, C., Kenny, E. E., Kooperberg, C., North, K., Koelsch, B. L., Wojcik, G., Zhang, H., & Chatterjee, N. bioRxiv, Sep, 2023.
Polygenic risk scores (PRS) are now showing promising predictive performance on a wide variety of complex traits and diseases, but there exists a substantial performance gap across different populations. We propose MUSSEL, a method for ancestry-specific polygenic prediction that borrows information in the summary statistics from genome-wide association studies (GWAS) across multiple ancestry groups. MUSSEL conducts Bayesian hierarchical modeling under a MUltivariate Spike-and-Slab model for effect-size distribution and incorporates an Ensemble Learning step using super learner to combine information across different tuning parameter settings and ancestry groups. In our simulation studies and data analyses of 16 traits across four distinct studies, totaling 5.7 million participants with a substantial ancestral diversity, MUSSEL shows promising performance compared to alternatives. The method, for example, has an average gain in prediction R² across 11 continuous traits of 40.2% and 49.3% compared to PRS-CSx and CT-SLEB, respectively, in the African Ancestry population. The best-performing method, however, varies by GWAS sample size, target ancestry, underlying trait architecture, and the choice of reference samples for LD estimation, and thus ultimately, a combination of methods may be needed to generate the most robust PRS across diverse populations.
@article{Jin:2023aa,
	abstract = {Polygenic risk scores (PRS) are now showing promising predictive performance on a wide variety of complex traits and diseases, but there exists a substantial performance gap across different populations. We propose MUSSEL, a method for ancestry-specific polygenic prediction that borrows information in the summary statistics from genome-wide association studies (GWAS) across multiple ancestry groups. MUSSEL conducts Bayesian hierarchical modeling under a MUltivariate Spike-and-Slab model for effect-size distribution and incorporates an Ensemble Learning step using super learner to combine information across different tuning parameter settings and ancestry groups. In our simulation studies and data analyses of 16 traits across four distinct studies, totaling 5.7 million participants with a substantial ancestral diversity, MUSSEL shows promising performance compared to alternatives. The method, for example, has an average gain in prediction $R^2$ across 11 continuous traits of 40.2\% and 49.3\% compared to PRS-CSx and CT-SLEB, respectively, in the African Ancestry population. The best-performing method, however, varies by GWAS sample size, target ancestry, underlying trait architecture, and the choice of reference samples for LD estimation, and thus ultimately, a combination of methods may be needed to generate the most robust PRS across diverse populations.},
	author = {Jin, Jin and Zhan, Jianan and Zhang, Jingning and Zhao, Ruzhang and O'Connell, Jared and Jiang, Yunxuan and {23andMe Research Team} and Buyske, Steven and Gignoux, Christopher and Haiman, Christopher and Kenny, Eimear E and Kooperberg, Charles and North, Kari and Koelsch, Bertram L and Wojcik, Genevieve and Zhang, Haoyu and Chatterjee, Nilanjan},
	date-added = {2024-05-19 20:57:54 -0400},
	date-modified = {2024-05-19 20:57:54 -0400},
	doi = {10.1101/2023.04.12.536510},
	journal = {bioRxiv},
	journal-full = {bioRxiv : the preprint server for biology},
	keywords = {Bayesian hierarchical modeling; Effect-size distribution; Ensemble learning; Genome-wide association studies; Multi-ancestry polygenic prediction; Polygenic architecture},
	month = {Sep},
	pmc = {PMC10120638},
	pmid = {37090648},
	url = {https://pubmed.ncbi.nlm.nih.gov/37090648/},
	pst = {epublish},
	title = {MUSSEL: Enhanced Bayesian Polygenic Risk Prediction Leveraging Information across Multiple Ancestry Groups},
	year = {2023},
	bdsk-url-1 = {https://doi.org/10.1101/2023.04.12.536510}}
