A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models. Tonmoy, S. M. T. I., Zaman, S. M. M., Jain, V., Rani, A., Rawte, V., Chadha, A., & Das, A. January, 2024. arXiv:2401.01313 [cs]


As Large Language Models (LLMs) continue to advance in their ability to write human-like text, a key challenge remains around their tendency to “hallucinate” – generating content that appears factual but is ungrounded. This issue of hallucination is arguably the biggest hindrance to safely deploying these powerful LLMs into real-world production systems that impact people’s lives. The journey toward widespread adoption of LLMs in practical settings heavily relies on addressing and mitigating hallucinations. Unlike traditional AI systems focused on limited tasks, LLMs have been exposed to vast amounts of online text data during training. While this allows them to display impressive language fluency, it also means they are capable of extrapolating information from the biases in training data, misinterpreting ambiguous prompts, or modifying the information to align superficially with the input. This becomes hugely alarming when we rely on language generation capabilities for sensitive applications, such as summarizing medical records, customer support conversations, and financial analysis reports, or providing legal advice. Small errors could lead to harm, revealing the LLMs’ lack of actual comprehension despite advances in self-learning. This paper presents a comprehensive survey of over thirty-two techniques developed to mitigate hallucination in LLMs. Notable among these are Retrieval-Augmented Generation (RAG) (Lewis et al., 2021), Knowledge Retrieval (Varshney et al., 2023), CoNLI (Lei et al., 2023), and CoVe (Dhuliawala et al., 2023). Furthermore, we introduce a detailed taxonomy categorizing these methods based on various parameters, such as dataset utilization, common tasks, feedback mechanisms, and retriever types. This classification helps distinguish the diverse approaches specifically designed to tackle hallucination issues in LLMs. Additionally, we analyze the challenges and limitations inherent in these […]

@misc{tonmoyComprehensiveSurveyHallucination2024,
	title = {A {Comprehensive} {Survey} of {Hallucination} {Mitigation} {Techniques} in {Large} {Language} {Models}},
	url = {http://arxiv.org/abs/2401.01313},
	abstract = {As Large Language Models (LLMs) continue to advance in their ability to write human-like text, a key challenge remains around their tendency to “hallucinate” – generating content that appears factual but is ungrounded. This issue of hallucination is arguably the biggest hindrance to safely deploying these powerful LLMs into real-world production systems that impact people’s lives. The journey toward widespread adoption of LLMs in practical settings heavily relies on addressing and mitigating hallucinations. Unlike traditional AI systems focused on limited tasks, LLMs have been exposed to vast amounts of online text data during training. While this allows them to display impressive language fluency, it also means they are capable of extrapolating information from the biases in training data, misinterpreting ambiguous prompts, or modifying the information to align superficially with the input. This becomes hugely alarming when we rely on language generation capabilities for sensitive applications, such as summarizing medical records, customer support conversations, and financial analysis reports, or providing legal advice. Small errors could lead to harm, revealing the LLMs’ lack of actual comprehension despite advances in self-learning. This paper presents a comprehensive survey of over thirty-two techniques developed to mitigate hallucination in LLMs. Notable among these are Retrieval-Augmented Generation (RAG) (Lewis et al., 2021), Knowledge Retrieval (Varshney et al., 2023), CoNLI (Lei et al., 2023), and CoVe (Dhuliawala et al., 2023). Furthermore, we introduce a detailed taxonomy categorizing these methods based on various parameters, such as dataset utilization, common tasks, feedback mechanisms, and retriever types. This classification helps distinguish the diverse approaches specifically designed to tackle hallucination issues in LLMs. Additionally, we analyze the challenges and limitations inherent in these […]},
	language = {en},
	urldate = {2024-01-10},
	publisher = {arXiv},
	author = {Tonmoy, S. M. Towhidul Islam and Zaman, S. M. Mehedi and Jain, Vinija and Rani, Anku and Rawte, Vipula and Chadha, Aman and Das, Amitava},
	month = jan,
	year = {2024},
	note = {arXiv:2401.01313 [cs]},
	keywords = {Computer Science - Computation and Language},
}

