SE Perspective on LLMs: Biases in Code Generation, Code Interpretability, and Code Security Risks. Krasniqi, R., Xu, D., & Vieira, M. ACM Computing Surveys, 2025.
Large Language Models (LLMs) are transforming the world with their ability to generate diverse content, including code, but embedded biases raise significant concerns. In this perspective piece, we critique the wide-spreading view of LLMs as infallible tools by examining how biases in training data can lead to discriminatory code generation, opaque code interpretation, and heightened security risks, ultimately impacting the trustworthiness of LLM-generated software. Through a reflective analysis grounded in existing literature, including case studies and theoretical frameworks from software engineering and AI ethics, we examine the specific manifestations of bias in code generation, focusing on how training data contribute to these issues. We investigate the challenges associated with interpreting LLM-generated code, highlighting the lack of transparency and the potential for hidden biases, and explore the security risks introduced by biased LLMs, namely vulnerabilities that may be exploited by malicious actors. We provide several recommendations for mitigating these challenges, emphasizing the need to refine training data and involve humans-in-the-loop.
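The abstract's claims about discriminatory code generation and exploitable LLM-generated code are easier to picture with a concrete sketch. The hypothetical Python snippet below is illustrative only and is not taken from the paper: score_applicant_biased shows how a proxy feature (here, zip code) can encode bias in generated decision logic, while find_user_unsafe and find_user_safe contrast an injectable string-built SQL query with the parameterized fix a human reviewer would apply, in the spirit of the paper's human-in-the-loop recommendation.

# Illustrative sketch only; hypothetical code, not drawn from the paper and not actual LLM output.
import sqlite3

def score_applicant_biased(income: float, zip_code: str) -> bool:
    """Hypothetical 'generated' rule: zip code acts as a proxy for protected
    attributes, i.e. bias baked directly into the decision logic."""
    penalty = 0.5 if zip_code.startswith("28") else 0.0  # hard-coded geographic penalty
    return (income * (1.0 - penalty)) > 40_000

def find_user_unsafe(conn: sqlite3.Connection, name: str):
    """Hypothetical vulnerable pattern: string-built SQL is injectable."""
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(conn: sqlite3.Connection, name: str):
    """Reviewed fix: a parameterized query removes the injection risk."""
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice')")
    print(score_applicant_biased(50_000, "28105"))     # False: penalized by zip code
    print(score_applicant_biased(50_000, "94105"))     # True: same income passes
    print(find_user_safe(conn, "alice' OR '1'='1"))    # []: input treated as a literal
    print(find_user_unsafe(conn, "alice' OR '1'='1"))  # [('alice',)]: injection succeeds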
@article{krasniqi_se_2025,
title = {{SE} {Perspective} on {LLMs}: {Biases} in {Code} {Generation}, {Code} {Interpretability}, and {Code} {Security} {Risks}},
issn = {0360-0300, 1557-7341},
shorttitle = {{SE} {Perspective} on {LLMs}},
url = {https://dl.acm.org/doi/10.1145/3774324},
doi = {10.1145/3774324},
abstract = {Large Language Models (LLMs) are transforming the world with their ability to generate diverse content, including code, but embedded biases raise significant concerns. In this perspective piece, we critique the wide-spreading view of LLMs as infallible tools by examining how biases in training data can lead to discriminatory code generation, opaque code interpretation, and heightened security risks, ultimately impacting the trustworthiness of LLM-generated software. Through a reflective analysis grounded in existing literature, including case studies and theoretical frameworks from software engineering and AI ethics, we examine the specific manifestations of bias in code generation, focusing on how training data contribute to these issues. We investigate the challenges associated with interpreting LLM-generated code, highlighting the lack of transparency and the potential for hidden biases, and explore the security risks introduced by biased LLMs, namely vulnerabilities that may be exploited by malicious actors. We provide several recommendations for mitigating these challenges, emphasizing the need to refine training data and involve humans-in-the-loop.},
language = {en},
urldate = {2025-11-06},
journal = {ACM Computing Surveys},
author = {Krasniqi, Rrezarta and Xu, Depeng and Vieira, Marco},
year = {2025},
pages = {1--16},
}
{"_id":"HtaHveedA4RnS8dZp","bibbaseid":"krasniqi-xu-vieira-seperspectiveonllmsbiasesincodegenerationcodeinterpretabilityandcodesecurityrisks-2025","author_short":["Krasniqi, R.","Xu, D.","Vieira, M."],"bibdata":{"bibtype":"article","type":"article","title":"SE Perspective on LLMs: Biases in Code Generation, Code Interpretability, and Code Security Risks","issn":"0360-0300, 1557-7341","shorttitle":"SE Perspective on LLMs","url":"https://dl.acm.org/doi/10.1145/3774324","doi":"10.1145/3774324","abstract":"Large Language Models (LLMs) are transforming the world with their ability to generate diverse content, including code, but embedded biases raise significant concerns. In this perspective piece, we critique the wide-spreading view of LLMs as infallible tools by examining how biases in training data can lead to discriminatory code generation, opaque code interpretation, and heightened security risks, ultimately impacting the trustworthiness of LLM-generated software. Through a reflective analysis grounded in existing literature, including case studies and theoretical frameworks from software engineering and AI ethics, we examine the specific manifestations of bias in code generation, focusing on how training data contribute to these issues. We investigate the challenges associated with interpreting LLM-generated code, highlighting the lack of transparency and the potential for hidden biases, and explore the security risks introduced by biased LLMs, namely vulnerabilities that may be exploited by malicious actors. We provide several recommendations for mitigating these challenges, emphasizing the need to refine training data and involve humans-in-the-loop.","language":"en","urldate":"2025-11-06","journal":"ACM Computing Surveys","author":[{"propositions":[],"lastnames":["Krasniqi"],"firstnames":["Rrezarta"],"suffixes":[]},{"propositions":[],"lastnames":["Xu"],"firstnames":["Depeng"],"suffixes":[]},{"propositions":[],"lastnames":["Vieira"],"firstnames":["Marco"],"suffixes":[]}],"year":"2025","pages":"1–16","bibtex":"@article{krasniqi_se_2025,\n\ttitle = {{SE} {Perspective} on {LLMs}: {Biases} in {Code} {Generation}, {Code} {Interpretability}, and {Code} {Security} {Risks}},\n\tissn = {0360-0300, 1557-7341},\n\tshorttitle = {{SE} {Perspective} on {LLMs}},\n\turl = {https://dl.acm.org/doi/10.1145/3774324},\n\tdoi = {10.1145/3774324},\n\tabstract = {Large Language Models (LLMs) are transforming the world with their ability to generate diverse content, including code, but embedded biases raise significant concerns. In this perspective piece, we critique the wide-spreading view of LLMs as infallible tools by examining how biases in training data can lead to discriminatory code generation, opaque code interpretation, and heightened security risks, ultimately impacting the trustworthiness of LLM-generated software. Through a reflective analysis grounded in existing literature, including case studies and theoretical frameworks from software engineering and AI ethics, we examine the specific manifestations of bias in code generation, focusing on how training data contribute to these issues. We investigate the challenges associated with interpreting LLM-generated code, highlighting the lack of transparency and the potential for hidden biases, and explore the security risks introduced by biased LLMs, namely vulnerabilities that may be exploited by malicious actors. 
We provide several recommendations for mitigating these challenges, emphasizing the need to refine training data and involve humans-in-the-loop.},\n\tlanguage = {en},\n\turldate = {2025-11-06},\n\tjournal = {ACM Computing Surveys},\n\tauthor = {Krasniqi, Rrezarta and Xu, Depeng and Vieira, Marco},\n\tyear = {2025},\n\tpages = {1--16},\n}\n\n","author_short":["Krasniqi, R.","Xu, D.","Vieira, M."],"key":"krasniqi_se_2025","id":"krasniqi_se_2025","bibbaseid":"krasniqi-xu-vieira-seperspectiveonllmsbiasesincodegenerationcodeinterpretabilityandcodesecurityrisks-2025","role":"author","urls":{"Paper":"https://dl.acm.org/doi/10.1145/3774324"},"metadata":{"authorlinks":{}},"downloads":2},"bibtype":"article","biburl":"https://api.zotero.org/users/10198036/collections/2RHJXKSI/items?key=X0RoN8iO9RtTbrWfSkRasb7b&format=bibtex&limit=100","dataSources":["37aX9ioouEvzbunGp","JHDShjsHrs6ZHE4bz"],"keywords":[],"search_terms":["perspective","llms","biases","code","generation","code","interpretability","code","security","risks","krasniqi","xu","vieira"],"title":"SE Perspective on LLMs: Biases in Code Generation, Code Interpretability, and Code Security Risks","year":2025,"downloads":2}