Labeling Noncontrast Head CT Reports for Common Findings Using Natural Language Processing. Iorga, M., Drakopoulos, M., Naidech, A., Katsaggelos, A., Parrish, T., & Hill, V. American Journal of Neuroradiology, 43(5):721–726, may, 2022.
Labeling Noncontrast Head CT Reports for Common Findings Using Natural Language Processing [link]Paper  doi  abstract   bibtex   
BACKGROUND AND PURPOSE: Prioritizing reading of noncontrast head CT examinations through an automated triage system may improve time to care for patients with acute neuroradiologic findings. We present a natural language-processing approach for labeling findings in noncontrast head CT reports, which permits creation of a large, labeled dataset of head CT images for development of emergent-finding detection and reading-prioritization algorithms. MATERIALS AND METHODS: In this retrospective study, 1002 clinical radiology reports from noncontrast head CTs collected between 2008 and 2013 were manually labeled across 12 common neuroradiologic finding categories. Each report was then encoded using an n-gram model of unigrams, bigrams, and trigrams. A logistic regression model was then trained to label each report for every common finding. Models were trained and assessed using a combination of L2 regularization and 5-fold cross-validation. RESULTS: Model performance was strongest for the fracture, hemorrhage, herniation, mass effect, pneumocephalus, postoperative status, and volume loss models in which the area under the receiver operating characteristic curve exceeded 0.95. Performance was relatively weaker for the edema, hydrocephalus, infarct, tumor, and white-matter disease models (area under the receiver operating characteristic curve > 0.85). Analysis of coefficients revealed finding-specific words among the top coefficients in each model. Class output probabilities were found to be a useful indicator of predictive error on individual report examples in higher-performing models. CONCLUSIONS: Combining logistic regression with n-gram encoding is a robust approach to labeling common findings in noncontrast head CT reports.
@article{Michael2022a,
abstract = {BACKGROUND AND PURPOSE: Prioritizing reading of noncontrast head CT examinations through an automated triage system may improve time to care for patients with acute neuroradiologic findings. We present a natural language-processing approach for labeling findings in noncontrast head CT reports, which permits creation of a large, labeled dataset of head CT images for development of emergent-finding detection and reading-prioritization algorithms. MATERIALS AND METHODS: In this retrospective study, 1002 clinical radiology reports from noncontrast head CTs collected between 2008 and 2013 were manually labeled across 12 common neuroradiologic finding categories. Each report was then encoded using an n-gram model of unigrams, bigrams, and trigrams. A logistic regression model was then trained to label each report for every common finding. Models were trained and assessed using a combination of L2 regularization and 5-fold cross-validation. RESULTS: Model performance was strongest for the fracture, hemorrhage, herniation, mass effect, pneumocephalus, postoperative status, and volume loss models in which the area under the receiver operating characteristic curve exceeded 0.95. Performance was relatively weaker for the edema, hydrocephalus, infarct, tumor, and white-matter disease models (area under the receiver operating characteristic curve > 0.85). Analysis of coefficients revealed finding-specific words among the top coefficients in each model. Class output probabilities were found to be a useful indicator of predictive error on individual report examples in higher-performing models. CONCLUSIONS: Combining logistic regression with n-gram encoding is a robust approach to labeling common findings in noncontrast head CT reports.},
author = {Iorga, Michael and Drakopoulos, M. and Naidech, A.M. and Katsaggelos, A.K. and Parrish, T.B. and Hill, V.B.},
doi = {10.3174/ajnr.A7500},
issn = {0195-6108},
journal = {American Journal of Neuroradiology},
month = {may},
number = {5},
pages = {721--726},
pmid = {35483905},
title = {{Labeling Noncontrast Head CT Reports for Common Findings Using Natural Language Processing}},
url = {http://www.ajnr.org/lookup/doi/10.3174/ajnr.A7500},
volume = {43},
year = {2022}
}

Downloads: 0