Seeing in Words: Learning to Classify through Language Bottlenecks

Seeing in Words: Learning to Classify through Language Bottlenecks. Saifullah, K., Wen, Y., Geiping, J., Goldblum, M., & Goldstein, T. June 2023. arXiv:2307.00028 [cs]


Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks. In contrast, humans can explain their predictions using succinct and intuitive descriptions. To incorporate explainability into neural networks, we train a vision model whose feature representations are text. We show that such a model can effectively classify ImageNet images, and we discuss the challenges we encountered when training it.

@misc{saifullah_seeing_2023,
	title = {Seeing in {Words}: {Learning} to {Classify} through {Language} {Bottlenecks}},
	shorttitle = {Seeing in {Words}},
	url = {http://arxiv.org/abs/2307.00028},
	doi = {10.48550/arXiv.2307.00028},
	abstract = {Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks. In contrast, humans can explain their predictions using succinct and intuitive descriptions. To incorporate explainability into neural networks, we train a vision model whose feature representations are text. We show that such a model can effectively classify ImageNet images, and we discuss the challenges we encountered when training it.},
	urldate = {2024-05-02},
	publisher = {arXiv},
	author = {Saifullah, Khalid and Wen, Yuxin and Geiping, Jonas and Goldblum, Micah and Goldstein, Tom},
	month = jun,
	year = {2023},
	note = {arXiv:2307.00028 [cs]},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning},
}

