Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions. Pearce, H., Ahmad, B., Tan, B., Dolan-Gavitt, B., & Karri, R. December 2021. arXiv:2108.09293 [cs]
There is burgeoning interest in designing AI-based systems to assist humans in designing computing systems, including tools that automatically generate computer code. The most notable of these comes in the form of the first self-described `AI pair programmer', GitHub Copilot, a language model trained over open-source GitHub code. However, code often contains bugs - and so, given the vast quantity of unvetted code that Copilot has processed, it is certain that the language model will have learned from exploitable, buggy code. This raises concerns on the security of Copilot's code contributions. In this work, we systematically investigate the prevalence and conditions that can cause GitHub Copilot to recommend insecure code. To perform this analysis we prompt Copilot to generate code in scenarios relevant to high-risk CWEs (e.g. those from MITRE's "Top 25" list). We explore Copilot's performance on three distinct code generation axes – examining how it performs given diversity of weaknesses, diversity of prompts, and diversity of domains. In total, we produce 89 different scenarios for Copilot to complete, producing 1,689 programs. Of these, we found approximately 40% to be vulnerable.
@misc{pearce_asleep_2021,
title = {Asleep at the {Keyboard}? {Assessing} the {Security} of {GitHub} {Copilot}'s {Code} {Contributions}},
shorttitle = {Asleep at the {Keyboard}?},
url = {http://arxiv.org/abs/2108.09293},
abstract = {There is burgeoning interest in designing AI-based systems to assist humans in designing computing systems, including tools that automatically generate computer code. The most notable of these comes in the form of the first self-described `AI pair programmer', GitHub Copilot, a language model trained over open-source GitHub code. However, code often contains bugs - and so, given the vast quantity of unvetted code that Copilot has processed, it is certain that the language model will have learned from exploitable, buggy code. This raises concerns on the security of Copilot's code contributions. In this work, we systematically investigate the prevalence and conditions that can cause GitHub Copilot to recommend insecure code. To perform this analysis we prompt Copilot to generate code in scenarios relevant to high-risk CWEs (e.g. those from MITRE's "Top 25" list). We explore Copilot's performance on three distinct code generation axes -- examining how it performs given diversity of weaknesses, diversity of prompts, and diversity of domains. In total, we produce 89 different scenarios for Copilot to complete, producing 1,689 programs. Of these, we found approximately 40\% to be vulnerable.},
urldate = {2022-06-04},
publisher = {arXiv},
author = {Pearce, Hammond and Ahmad, Baleegh and Tan, Benjamin and Dolan-Gavitt, Brendan and Karri, Ramesh},
month = dec,
year = {2021},
note = {arXiv:2108.09293 [cs]},
keywords = {\#broken, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security, Jab/\#Pre},
}