What's documented in AI? Systematic Analysis of 32K AI Model Cards. Liang, W., Rajani, N., Yang, X., Ozoani, E., Wu, E., Chen, Y., Smith, D. S., & Zou, J. February, 2024. arXiv:2402.05160 [cs]
The rapid proliferation of AI models has underscored the importance of thorough documentation, as it enables users to understand, trust, and effectively utilize these models in various applications. Although developers are encouraged to produce model cards, it's not clear how much information or what information these cards contain. In this study, we conduct a comprehensive analysis of 32,111 AI model documentations on Hugging Face, a leading platform for distributing and deploying AI models. Our investigation sheds light on the prevailing model card documentation practices. Most of the AI models with substantial downloads provide model cards, though the cards have uneven informativeness. We find that sections addressing environmental impact, limitations, and evaluation exhibit the lowest filled-out rates, while the training section is the most consistently filled-out. We analyze the content of each section to characterize practitioners' priorities. Interestingly, there are substantial discussions of data, sometimes with equal or even greater emphasis than the model itself. To evaluate the impact of model cards, we conducted an intervention study by adding detailed model cards to 42 popular models which had no or sparse model cards previously. We find that adding model cards is moderately correlated with an increase in weekly download rates. Our study opens up a new perspective for analyzing community norms and practices for model documentation through large-scale data science and linguistics analysis.
@misc{liang_whats_2024,
	title = {What's documented in {AI}? {Systematic} {Analysis} of {32K} {AI} {Model} {Cards}},
	shorttitle = {What's documented in {AI}?},
	url = {http://arxiv.org/abs/2402.05160},
	doi = {10.48550/arXiv.2402.05160},
	abstract = {The rapid proliferation of AI models has underscored the importance of thorough documentation, as it enables users to understand, trust, and effectively utilize these models in various applications. Although developers are encouraged to produce model cards, it's not clear how much information or what information these cards contain. In this study, we conduct a comprehensive analysis of 32,111 AI model documentations on Hugging Face, a leading platform for distributing and deploying AI models. Our investigation sheds light on the prevailing model card documentation practices. Most of the AI models with substantial downloads provide model cards, though the cards have uneven informativeness. We find that sections addressing environmental impact, limitations, and evaluation exhibit the lowest filled-out rates, while the training section is the most consistently filled-out. We analyze the content of each section to characterize practitioners' priorities. Interestingly, there are substantial discussions of data, sometimes with equal or even greater emphasis than the model itself. To evaluate the impact of model cards, we conducted an intervention study by adding detailed model cards to 42 popular models which had no or sparse model cards previously. We find that adding model cards is moderately correlated with an increase in weekly download rates. Our study opens up a new perspective for analyzing community norms and practices for model documentation through large-scale data science and linguistics analysis.},
	urldate = {2024-09-03},
	publisher = {arXiv},
	author = {Liang, Weixin and Rajani, Nazneen and Yang, Xinyu and Ozoani, Ezinwanne and Wu, Eric and Chen, Yiqun and Smith, Daniel Scott and Zou, James},
	month = feb,
	year = {2024},
	note = {arXiv:2402.05160 [cs]},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Software Engineering},
} 