Automatic generation of case-detection algorithms to identify children with asthma from large electronic health record databases. Afzal, Z., Engelkes, M., Verhamme, K. M. C., Janssens, H. M., Sturkenboom, M. C. J. M., Kors, J. A., & Schuemie, M. J. Pharmacoepidemiology and Drug Safety, 22(8):826–833, August 2013. Place: England
PURPOSE: Most electronic health record databases contain unstructured free-text narratives, which cannot be easily analyzed. Case-detection algorithms are usually created manually and often rely only on using coded information such as International Classification of Diseases version 9 codes. We applied a machine-learning approach to generate and evaluate an automated case-detection algorithm that uses both free-text and coded information to identify asthma cases. METHODS: The Integrated Primary Care Information (IPCI) database was searched for potential asthma patients aged 5-18 years using a broad query on asthma-related codes, drugs, and free text. A training set of 5032 patients was created by manually annotating the potential patients as definite, probable, or doubtful asthma cases or non-asthma cases. The rule-learning program RIPPER was then used to generate algorithms to distinguish cases from non-cases. An over-sampling method was used to balance the performance of the automated algorithm to meet our study requirements. Performance of the automated algorithm was evaluated against the manually annotated set. RESULTS: The selected algorithm yielded a positive predictive value (PPV) of 0.66, sensitivity of 0.98, and specificity of 0.95 when identifying only definite asthma cases; a PPV of 0.82, sensitivity of 0.96, and specificity of 0.90 when identifying both definite and probable asthma cases; and a PPV of 0.57, sensitivity of 0.95, and specificity of 0.67 for the scenario identifying definite, probable, and doubtful asthma cases. CONCLUSIONS: The automated algorithm shows good performance in detecting cases of asthma utilizing both free-text and coded data. This algorithm will facilitate large-scale studies of asthma in the IPCI database.
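The three metrics reported in the abstract (PPV, sensitivity, specificity) all derive from the counts of true/false positives and negatives obtained by comparing an algorithm's output against the manual annotation. A minimal sketch of those definitions follows; the `ConfusionCounts` type and the counts used are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing algorithm output with a manually annotated reference."""
    tp: int  # cases correctly flagged as cases
    fp: int  # non-cases wrongly flagged as cases
    tn: int  # non-cases correctly left unflagged
    fn: int  # cases the algorithm missed


def ppv(c: ConfusionCounts) -> float:
    """Positive predictive value: share of flagged patients who are true cases."""
    return c.tp / (c.tp + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true cases that the algorithm flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of non-cases that the algorithm leaves unflagged."""
    return c.tn / (c.tn + c.fp)


# Hypothetical counts, chosen only to illustrate the arithmetic.
counts = ConfusionCounts(tp=90, fp=30, tn=870, fn=10)
print(f"PPV={ppv(counts):.2f} "
      f"sensitivity={sensitivity(counts):.2f} "
      f"specificity={specificity(counts):.2f}")
```

Because PPV depends on how many non-cases are in the evaluated set, a classifier can pair high sensitivity and specificity with a noticeably lower PPV when true cases are a minority.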
@article{afzal_automatic_2013,
	title = {Automatic generation of case-detection algorithms to identify children with asthma from large electronic health record databases},
	volume = {22},
	copyright = {Copyright © 2013 John Wiley \& Sons, Ltd.},
	issn = {1099-1557 1053-8569},
	doi = {10.1002/pds.3438},
	abstract = {PURPOSE: Most electronic health record databases contain unstructured free-text narratives, which cannot be easily analyzed. Case-detection algorithms are usually created manually and often rely only on using coded information such as International Classification of Diseases version 9 codes. We applied a machine-learning approach to generate and evaluate an automated case-detection algorithm that uses both free-text and coded information to identify asthma cases. METHODS: The Integrated Primary Care Information (IPCI) database was searched for potential asthma patients aged 5-18 years using a broad query on asthma-related codes, drugs, and free text. A training set of 5032 patients was created by manually annotating the potential patients as definite, probable, or doubtful asthma cases or non-asthma cases. The rule-learning program RIPPER was then used to generate algorithms to distinguish cases from non-cases. An over-sampling method was used to balance the performance of the automated algorithm to meet our study requirements. Performance of the automated algorithm was evaluated against the manually annotated set. RESULTS: The selected algorithm yielded a positive predictive value (PPV) of 0.66, sensitivity of 0.98, and specificity of 0.95 when identifying only definite asthma cases; a PPV of 0.82, sensitivity of 0.96, and specificity of 0.90 when identifying both definite and probable asthma cases; and a PPV of 0.57, sensitivity of 0.95, and specificity of 0.67 for the scenario identifying definite, probable, and doubtful asthma cases. CONCLUSIONS: The automated algorithm shows good performance in detecting cases of asthma utilizing both free-text and coded data. This algorithm will facilitate large-scale studies of asthma in the IPCI database.},
	language = {eng},
	number = {8},
	journal = {Pharmacoepidemiology and Drug Safety},
	author = {Afzal, Zubair and Engelkes, Marjolein and Verhamme, Katia M. C. and Janssens, Hettie M. and Sturkenboom, Miriam C. J. M. and Kors, Jan A. and Schuemie, Martijn J.},
	month = aug,
	year = {2013},
	pmid = {23592573},
	note = {Place: England},
	keywords = {*Algorithms, *Electronic Health Records, Adolescent, Asthma/*epidemiology, Child, Child, Preschool, Databases, Factual, Electronic Data Processing, Humans, Predictive Value of Tests, Sensitivity and Specificity, automated case definition, case-detection algorithms, electronic medical records, machine learning, pharmacoepidemiology},
	pages = {826--833},
}
