BioInstruct: Instruction Tuning of Large Language Models for Biomedical Natural Language Processing. Tran, H., Yang, Z., Yao, Z., & Yu, H. November 2023. arXiv:2310.19975 [cs]. Paper doi abstract bibtex
To enhance the performance of large language models (LLMs) in biomedical natural language processing (BioNLP), we introduce a domain-specific instruction dataset and examine its impact when combined with multi-task learning principles. We created BioInstruct, a dataset of 25,005 instructions used to instruction-tune LLMs (LLaMA 1 & 2, 7B and 13B versions). The instructions were created by prompting GPT-4 with three seed samples randomly drawn from a pool of 80 human-curated instructions. We employed Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning. We then evaluated these instruction-tuned LLMs on several BioNLP tasks, which can be grouped into three major categories: question answering (QA), information extraction (IE), and text generation (GEN). We also examined whether the category of instructions (e.g., QA, IE, or generation) affects model performance. Compared with LLMs that were not instruction-tuned, our instruction-tuned LLMs demonstrated marked performance gains: 17.3% in QA, 5.7% in IE, and 96% in generation tasks. Our 7B-parameter instruction-tuned LLaMA 1 model was competitive with, or even surpassed, other biomedical-domain LLMs that were also fine-tuned from LLaMA 1 on large volumes of domain-specific data or a wide variety of tasks. Our results also show that the performance gain is significantly higher when instruction fine-tuning is conducted with closely related tasks. Our findings align with observations from multi-task learning, suggesting synergies between pairs of tasks. The BioInstruct dataset serves as a valuable resource, and instruction-tuned LLMs lead to the best-performing BioNLP applications.
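The data-construction step described in the abstract (seeding GPT-4 with three human-written instructions drawn from a pool of 80 to bootstrap a much larger instruction set) follows the general self-instruct recipe. Below is a minimal sketch of what such a loop could look like; the prompt wording, pool file name, and generation settings are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch of self-instruct-style data generation: sample 3 seed instructions
# from a human-curated pool of 80 and ask GPT-4 to propose new ones.
# Assumes the `openai` Python client and an OPENAI_API_KEY in the environment.
import json
import random
from openai import OpenAI

client = OpenAI()

# Hypothetical file holding the 80 human-curated seed instructions.
with open("seed_instructions.json") as f:
    seed_pool = json.load(f)  # list of {"instruction", "input", "output"} dicts

def generate_batch(n_new: int = 5) -> list[dict]:
    """Draw 3 random seeds and prompt GPT-4 for new instruction/input/output triples."""
    seeds = random.sample(seed_pool, 3)
    demo = "\n\n".join(
        f"Instruction: {s['instruction']}\nInput: {s['input']}\nOutput: {s['output']}"
        for s in seeds
    )
    prompt = (
        "You are helping build a biomedical NLP instruction dataset. "
        f"Here are 3 example tasks:\n\n{demo}\n\n"
        f"Write {n_new} new, diverse biomedical tasks as a JSON list "
        "of objects with the same three fields."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
    )
    # Assumes the model returns valid JSON; a real pipeline would validate
    # and deduplicate before adding to the dataset.
    return json.loads(resp.choices[0].message.content)

# Repeating generate_batch() until roughly 25,000 unique instructions are
# collected would yield a BioInstruct-sized dataset.
```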
@misc{tran_bioinstruct_2023,
title = {{BioInstruct}: {Instruction} {Tuning} of {Large} {Language} {Models} for {Biomedical} {Natural} {Language} {Processing}},
shorttitle = {{BioInstruct}},
url = {http://arxiv.org/abs/2310.19975},
doi = {10.48550/arXiv.2310.19975},
abstract = {To enhance the performance of large language models (LLMs) in biomedical natural language processing (BioNLP), we introduce a domain-specific instruction dataset and examine its impact when combined with multi-task learning principles. We created BioInstruct, a dataset of 25,005 instructions used to instruction-tune LLMs (LLaMA 1 \& 2, 7B and 13B versions). The instructions were created by prompting GPT-4 with three seed samples randomly drawn from a pool of 80 human-curated instructions. We employed Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning. We then evaluated these instruction-tuned LLMs on several BioNLP tasks, which can be grouped into three major categories: question answering (QA), information extraction (IE), and text generation (GEN). We also examined whether the category of instructions (e.g., QA, IE, or generation) affects model performance. Compared with LLMs that were not instruction-tuned, our instruction-tuned LLMs demonstrated marked performance gains: 17.3\% in QA, 5.7\% in IE, and 96\% in generation tasks. Our 7B-parameter instruction-tuned LLaMA 1 model was competitive with, or even surpassed, other biomedical-domain LLMs that were also fine-tuned from LLaMA 1 on large volumes of domain-specific data or a wide variety of tasks. Our results also show that the performance gain is significantly higher when instruction fine-tuning is conducted with closely related tasks. Our findings align with observations from multi-task learning, suggesting synergies between pairs of tasks. The BioInstruct dataset serves as a valuable resource, and instruction-tuned LLMs lead to the best-performing BioNLP applications.},
urldate = {2023-11-14},
publisher = {arXiv},
author = {Tran, Hieu and Yang, Zhichao and Yao, Zonghai and Yu, Hong},
month = nov,
year = {2023},
note = {arXiv:2310.19975 [cs]},
keywords = {Computer Science - Artificial Intelligence, Computer Science - Computation and Language},
}
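As a companion to the entry above, here is a minimal sketch of the parameter-efficient fine-tuning step the abstract mentions: LoRA applied to a LLaMA-class causal language model via Hugging Face transformers and peft. The base-model identifier and LoRA hyperparameters below are illustrative assumptions, not the settings reported in the paper.

```python
# Sketch of LoRA fine-tuning with Hugging Face transformers + peft.
# Model name and hyperparameters are placeholders, not the paper's settings.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "huggyllama/llama-7b"  # assumed LLaMA-1 7B checkpoint identifier
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Low-rank adapters are injected into the attention projections; only these
# small adapter matrices are trained while the base weights stay frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters

# The wrapped model can then be trained on BioInstruct-formatted
# (instruction, input, output) examples with a standard Trainer loop.
```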