Extracting Biomedical Factual Knowledge Using Pretrained Language Model and Electronic Health Record Context

Extracting Biomedical Factual Knowledge Using Pretrained Language Model and Electronic Health Record Context. Yao, Z., Cao, Y., Yang, Z., Deshpande, V., & Yu, H. AMIA Annual Symposium Proceedings, 2022:1188–1197, April, 2023.

Language models (LMs) have performed well on biomedical natural language processing applications. In this study, we conducted experiments that use prompting methods to extract knowledge from LMs, treating them as new knowledge bases (LMs as KBs). However, prompting provides only a lower bound for knowledge extraction and performs particularly poorly on biomedical-domain KBs. To bring LMs as KBs more in line with actual application scenarios in the biomedical domain, we add electronic health record (EHR) notes as context to the prompt to improve this lower bound. We design and validate a series of experiments for our Dynamic-Context-BioLAMA task. Our experiments show that the knowledge possessed by these language models can distinguish correct knowledge from noisy knowledge in the EHR notes, and that this distinguishing ability can also serve as a new metric for evaluating the amount of knowledge a model possesses.
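The idea of adding an EHR note as context to a prompt can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `build_prompt` helper, the `[X]`/`[Y]` placeholder convention, the `[MASK]` token, the example template, and the invented note text are all hypothetical.

```python
from typing import Optional

MASK = "[MASK]"  # assumed mask token; the actual token depends on the LM used


def build_prompt(subject: str, template: str, context: Optional[str] = None) -> str:
    """Fill a cloze template and optionally prepend a context passage.

    `template` uses the hypothetical placeholders [X] (subject) and [Y] (object
    to be predicted). When `context` is given, it is placed before the query.
    """
    query = template.replace("[X]", subject).replace("[Y]", MASK)
    if context:
        return f"{context.strip()} {query}"
    return query


# Hypothetical template and an invented note (not real patient data).
template = "[X] is treated with [Y]."
note = "Patient with type 2 diabetes was started on metformin."

plain = build_prompt("Type 2 diabetes", template)
with_context = build_prompt("Type 2 diabetes", template, note)

print(plain)
print(with_context)
```

Comparing a model's prediction for the masked position with and without the prepended note is one way such context could be evaluated; the paper's actual protocol may differ.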

@article{yao_extracting_2023,
	title = {Extracting {Biomedical} {Factual} {Knowledge} {Using} {Pretrained} {Language} {Model} and {Electronic} {Health} {Record} {Context}},
	volume = {2022},
	issn = {1942-597X},
	url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10148358/},
	abstract = {Language models (LMs) have performed well on biomedical natural language processing applications. In this study, we conducted experiments that use prompting methods to extract knowledge from LMs, treating them as new knowledge bases (LMs as KBs). However, prompting provides only a lower bound for knowledge extraction and performs particularly poorly on biomedical-domain KBs. To bring LMs as KBs more in line with actual application scenarios in the biomedical domain, we add electronic health record (EHR) notes as context to the prompt to improve this lower bound. We design and validate a series of experiments for our Dynamic-Context-BioLAMA task. Our experiments show that the knowledge possessed by these language models can distinguish correct knowledge from noisy knowledge in the EHR notes, and that this distinguishing ability can also serve as a new metric for evaluating the amount of knowledge a model possesses.},
	urldate = {2024-04-10},
	journal = {AMIA Annual Symposium Proceedings},
	author = {Yao, Zonghai and Cao, Yi and Yang, Zhichao and Deshpande, Vijeta and Yu, Hong},
	month = apr,
	year = {2023},
	pmid = {37128373},
	pmcid = {PMC10148358},
	pages = {1188--1197},
}
