Comparison of Unplanned 30-Day Readmission Prediction Models, Based on Hospital Warehouse and Demographic Data. Dhalluin, T., Bannay, A., Lemordant, P., Sylvestre, E., Chazard, E., Cuggia, M., & Bouzille, G. Studies in Health Technology and Informatics, 270:547–551, June, 2020.
Anticipating unplanned hospital readmission episodes is a safety and medico-economic issue. We compared statistics (Logistic Regression) and machine learning algorithms (Gradient Boosting, Random Forest, and Neural Network) for predicting the risk of all-cause, 30-day hospital readmission using data from the clinical data warehouse of Rennes and from other sources. The dataset included hospital stays based on the criteria of the French national methodology for the 30-day readmission rate (i.e., patients older than 18 years, geolocation, no iterative stays, and no hospitalization for palliative care), with a similar pre-processing for all algorithms. We calculated the area under the ROC curve (AUC) for 30-day readmission prediction by each model. In total, we included 259114 hospital stays, with a readmission rate of 8.8%. The AUC was 0.61 for the Logistic Regression, 0.69 for the Gradient Boosting, 0.69 for the Random Forest, and 0.62 for the Neural Network model. We obtained the best performance and reproducibility to predict readmissions with Random Forest, and found that the algorithms performed better when data came from different sources.
@article{dhalluin_comparison_2020,
	title = {Comparison of {Unplanned} 30-{Day} {Readmission} {Prediction} {Models}, {Based} on {Hospital} {Warehouse} and {Demographic} {Data}},
	volume = {270},
	issn = {1879-8365},
	doi = {10.3233/SHTI200220},
	abstract = {Anticipating unplanned hospital readmission episodes is a safety and medico-economic issue. We compared statistics (Logistic Regression) and machine learning algorithms (Gradient Boosting, Random Forest, and Neural Network) for predicting the risk of all-cause, 30-day hospital readmission using data from the clinical data warehouse of Rennes and from other sources. The dataset included hospital stays based on the criteria of the French national methodology for the 30-day readmission rate (i.e., patients older than 18 years, geolocation, no iterative stays, and no hospitalization for palliative care), with a similar pre-processing for all algorithms. We calculated the area under the ROC curve (AUC) for 30-day readmission prediction by each model. In total, we included 259114 hospital stays, with a readmission rate of 8.8\%. The AUC was 0.61 for the Logistic Regression, 0.69 for the Gradient Boosting, 0.69 for the Random Forest, and 0.62 for the Neural Network model. We obtained the best performance and reproducibility to predict readmissions with Random Forest, and found that the algorithms performed better when data came from different sources.},
	language = {eng},
	journal = {Studies in Health Technology and Informatics},
	author = {Dhalluin, Thibault and Bannay, Aurélie and Lemordant, Pierre and Sylvestre, Emmanuelle and Chazard, Emmanuel and Cuggia, Marc and Bouzille, Guillaume},
	month = jun,
	year = {2020},
	pmid = {32570443},
	keywords = {Data Warehousing, Medical Informatics, Patient Readmission/statistics and numerical data, Supervised Machine Learning},
	pages = {547--551},
}
