GPT-NER: Named Entity Recognition via Large Language Models

GPT-NER: Named Entity Recognition via Large Language Models. Wang, S., Sun, X., Li, X., Ouyang, R., Wu, F., Zhang, T., Li, J., & Wang, G. October, 2023. arXiv:2304.10428 [cs]


Although large language models (LLMs) have achieved state-of-the-art performance on a variety of NLP tasks, their performance on NER remains significantly below supervised baselines. This is due to the gap between NER and LLMs: the former is by nature a sequence labeling task, while the latter are text-generation models. In this paper, we propose GPT-NER to resolve this issue. GPT-NER bridges the gap by transforming the sequence labeling task into a generation task that LLMs can easily adapt to; e.g., the task of finding location entities in the input text "Columbus is a city" is transformed into generating the text sequence "@@Columbus## is a city", where the special tokens @@ and ## mark the entity to extract. To efficiently address the "hallucination" issue of LLMs, where LLMs have a strong inclination to over-confidently label NULL inputs as entities, we propose a self-verification strategy that prompts the LLM to ask itself whether each extracted entity belongs to a labeled entity tag. We conduct experiments on five widely adopted NER datasets, and GPT-NER achieves performance comparable to fully supervised baselines, which, as far as we are aware, is the first time this has been achieved. More importantly, we find that GPT-NER exhibits greater ability in low-resource and few-shot setups: when the amount of training data is extremely scarce, GPT-NER performs significantly better than supervised models. This demonstrates the capability of GPT-NER in real-world NER applications where the number of labeled examples is limited.
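The marker format described in the abstract can be illustrated with a small Python sketch. Only the @@ and ## markers and the "Columbus is a city" example come from the abstract; the function names, the whole-word matching strategy, and the wording of the verification question are assumptions made for illustration and are not taken from the paper or its released code.

```python
import re

# The @@ ... ## markers come from the abstract; everything else here
# (function names, matching strategy, prompt wording) is hypothetical.
ENTITY_PATTERN = re.compile(r"@@(.+?)##")


def mark_entities(text: str, entities: list[str]) -> str:
    """Wrap each given entity string in @@...## to build the generation target."""
    marked = text
    for entity in entities:
        # Match the entity as a whole word; a callable replacement avoids
        # any special handling of backslashes in the replacement string.
        marked = re.sub(
            rf"\b{re.escape(entity)}\b",
            lambda m: f"@@{m.group(0)}##",
            marked,
        )
    return marked


def extract_entities(marked: str) -> list[str]:
    """Recover the entity strings from text that uses the @@...## markers."""
    return ENTITY_PATTERN.findall(marked)


def strip_markers(marked: str) -> str:
    """Remove the markers to get the plain sentence back."""
    return ENTITY_PATTERN.sub(r"\1", marked)


def verification_prompt(sentence: str, entity: str, label: str) -> str:
    """Build a yes/no self-verification question (assumed wording)."""
    return (
        f'In the sentence "{sentence}", is the word "{entity}" '
        f"a {label} entity? Answer yes or no."
    )


marked = mark_entities("Columbus is a city", ["Columbus"])
print(marked)
print(extract_entities(marked))
print(verification_prompt("Columbus is a city", "Columbus", "location"))
```

In this sketch the model's output would be parsed with `extract_entities`, and each extracted span would then be passed through a second prompt such as the one built by `verification_prompt`, so that spans the model rejects on reflection can be dropped.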

@misc{wang_gpt-ner_2023,
	title = {{GPT}-{NER}: {Named} {Entity} {Recognition} via {Large} {Language} {Models}},
	shorttitle = {{GPT}-{NER}},
	url = {http://arxiv.org/abs/2304.10428},
	doi = {10.48550/arXiv.2304.10428},
	urldate = {2024-06-21},
	publisher = {arXiv},
	author = {Wang, Shuhe and Sun, Xiaofei and Li, Xiaoya and Ouyang, Rongbin and Wu, Fei and Zhang, Tianwei and Li, Jiwei and Wang, Guoyin},
	month = oct,
	year = {2023},
	note = {arXiv:2304.10428 [cs]},
	keywords = {Computer Science - Computation and Language},
}
