TransformEHR: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records. Yang, Z., Mitra, A., Liu, W., Berlowitz, D., & Yu, H. Nature Communications, 14:7857, November 2023.
Deep learning transformer-based models using longitudinal electronic health records (EHRs) have shown great success in predicting clinical diseases and outcomes. Pretraining on a large dataset can help such models map the input space better and boost their performance on relevant tasks through finetuning with limited data. In this study, we present TransformEHR, a generative encoder-decoder transformer model pretrained with a new objective: predicting all diseases and outcomes of a patient at a future visit from previous visits. TransformEHR's encoder-decoder framework, paired with the novel pretraining objective, helps it achieve new state-of-the-art performance on multiple clinical prediction tasks. Compared with the previous model, TransformEHR improves area under the precision–recall curve by 2% (p < 0.001) for pancreatic cancer onset and by 24% (p = 0.007) for intentional self-harm in patients with post-traumatic stress disorder. The high performance in predicting intentional self-harm shows the potential of TransformEHR in building effective clinical intervention systems. TransformEHR is also generalizable and can be easily finetuned for clinical prediction tasks with limited data.

Using AI to predict disease can improve interventions to slow down or prevent disease. Here, the authors show that generative AI models built on the Transformer framework, the model that also powers ChatGPT, can achieve state-of-the-art performance on disease prediction from longitudinal electronic health records.
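The pretraining objective described in the abstract lends itself to a compact illustration. Below is a minimal PyTorch sketch of that idea, not the authors' released implementation: an encoder reads the ICD codes of all previous visits (with an embedding marking which visit each code belongs to), and a decoder generates every code of the future visit. The vocabulary size, model dimensions, special-token ids, and toy tensors are illustrative assumptions.

import torch
import torch.nn as nn

# Assumed toy values, not taken from the paper.
VOCAB = 2000      # size of the ICD-code vocabulary
D_MODEL = 256     # model width
PAD, BOS = 0, 1   # special token ids

class VisitSeq2Seq(nn.Module):
    """Encoder reads codes from previous visits; decoder generates the future visit's codes."""
    def __init__(self):
        super().__init__()
        self.code_emb = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        self.visit_emb = nn.Embedding(64, D_MODEL)  # marks which visit each code came from
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=4, num_decoder_layers=4,
            batch_first=True,
        )
        self.lm_head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, src_codes, src_visit_ids, tgt_codes):
        # src_codes / src_visit_ids: (batch, src_len) codes from all previous visits
        # tgt_codes: (batch, tgt_len) BOS-prefixed codes of the future visit
        src = self.code_emb(src_codes) + self.visit_emb(src_visit_ids)
        tgt = self.code_emb(tgt_codes)
        causal = self.transformer.generate_square_subsequent_mask(tgt_codes.size(1))
        hidden = self.transformer(
            src, tgt, tgt_mask=causal,
            src_key_padding_mask=src_codes.eq(PAD),
            tgt_key_padding_mask=tgt_codes.eq(PAD),
        )
        return self.lm_head(hidden)  # (batch, tgt_len, VOCAB) logits over ICD codes

# One pretraining step on random toy data: predict every code of the future visit.
model = VisitSeq2Seq()
src = torch.randint(2, VOCAB, (8, 40))       # codes from earlier visits
visit_ids = torch.randint(0, 10, (8, 40))    # visit index for each code
future = torch.randint(2, VOCAB, (8, 12))    # codes recorded at the future visit
tgt_in = torch.cat([torch.full((8, 1), BOS), future[:, :-1]], dim=1)

logits = model(src, visit_ids, tgt_in)
loss = nn.CrossEntropyLoss(ignore_index=PAD)(logits.reshape(-1, VOCAB), future.reshape(-1))
loss.backward()

The sketch keeps the key design choice the abstract highlights: unlike encoder-only EHR models that predict a single masked code, the decoder is trained to generate the complete set of codes at the future visit, which is what allows the same pretrained model to be finetuned for different downstream outcome predictions.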
@article{yang_transformehr_2023,
	title = {{TransformEHR}: transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records},
	volume = {14},
	issn = {2041-1723},
	shorttitle = {{TransformEHR}},
	url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10687211/},
	doi = {10.1038/s41467-023-43715-z},
	abstract = {Deep learning transformer-based models using longitudinal electronic health records (EHRs) have shown a great success in prediction of clinical diseases or outcomes. Pretraining on a large dataset can help such models map the input space better and boost their performance on relevant tasks through finetuning with limited data. In this study, we present TransformEHR, a generative encoder-decoder model with transformer that is pretrained using a new pretraining objective—predicting all diseases and outcomes of a patient at a future visit from previous visits. TransformEHR’s encoder-decoder framework, paired with the novel pretraining objective, helps it achieve the new state-of-the-art performance on multiple clinical prediction tasks. Comparing with the previous model, TransformEHR improves area under the precision–recall curve by 2\% (p {\textless} 0.001) for pancreatic cancer onset and by 24\% (p = 0.007) for intentional self-harm in patients with post-traumatic stress disorder. The high performance in predicting intentional self-harm shows the potential of TransformEHR in building effective clinical intervention systems. TransformEHR is also generalizable and can be easily finetuned for clinical prediction tasks with limited data., Using AI to predict disease can improve interventions slow down or prevent disease. Here, the authors show that generative AI models built on the framework of Transformer, the model that also empowers ChatGPT, can achieve state-of-the-art performance on disease predictions based on longitudinal electronic records.},
	urldate = {2024-04-10},
	journal = {Nature Communications},
	author = {Yang, Zhichao and Mitra, Avijit and Liu, Weisong and Berlowitz, Dan and Yu, Hong},
	month = nov,
	year = {2023},
	pmid = {38030638},
	pmcid = {PMC10687211},
	keywords = {Computer science, Disease prevention, Experimental models of disease},
	pages = {7857},
}
