On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21), pages 610–623, New York, NY, USA, 2021. Association for Computing Machinery.
Abstract: The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.
@inproceedings{bender2021,
address = {New York, NY, USA},
series = {{FAccT} '21},
title = {On the {Dangers} of {Stochastic} {Parrots}},
volume = {47},
isbn = {978-1-4503-8309-7},
	shorttitle = {On the {Dangers} of {Stochastic} {Parrots}: {Can} {Language} {Models} {Be} {Too} {Big}?},
url = {https://doi.org/10.1145/3442188.3445922},
doi = {10.1145/3442188.3445922},
	abstract = {The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.},
language = {en},
urldate = {2022-01-24},
booktitle = {Proceedings of the 2021 {ACM} {Conference} on {Fairness}, {Accountability}, and {Transparency}},
publisher = {Association for Computing Machinery},
author = {Bender, Emily M. and Gebru, Timnit and McMillan-Major, Angelina and Shmitchell, Shmargaret},
year = {2021},
pages = {610--623},
}