Working Memory Capacity of ChatGPT: An Empirical Study. Gong, D., Wan, X., & Wang, D. February 2024. arXiv:2305.03731 [cs, q-bio]
Working memory is a critical aspect of both human intelligence and artificial intelligence, serving as a workspace for the temporary storage and manipulation of information. In this paper, we systematically assess the working memory capacity of ChatGPT, a large language model developed by OpenAI, by examining its performance in verbal and spatial n-back tasks under various conditions. Our experiments reveal that ChatGPT has a working memory capacity limit strikingly similar to that of humans. Furthermore, we investigate the impact of different instruction strategies on ChatGPT's performance and observe that the fundamental patterns of a capacity limit persist. From our empirical findings, we propose that n-back tasks may serve as tools for benchmarking the working memory capacity of large language models and hold potential for informing future efforts aimed at enhancing AI working memory.
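The paper itself does not include code in this entry; purely as an illustration of the n-back paradigm the abstract describes, the sketch below (plain Python, with hypothetical helper names such as make_verbal_nback_trials and score_responses) generates a verbal n-back letter stream and scores match/non-match responses. It is a minimal sketch of the general task structure, not the authors' experimental setup.

import random
import string

def make_verbal_nback_trials(n=2, num_trials=30, match_rate=0.3, seed=0):
    """Generate a letter stream for a verbal n-back task.

    Returns (letters, targets), where targets[i] is True when letters[i]
    matches the letter shown n positions earlier.
    """
    rng = random.Random(seed)
    letters, targets = [], []
    for i in range(num_trials):
        if i >= n and rng.random() < match_rate:
            letters.append(letters[i - n])   # planted n-back match
            targets.append(True)
        else:
            # pick a letter that does not accidentally create a match
            choices = [c for c in string.ascii_uppercase
                       if i < n or c != letters[i - n]]
            letters.append(rng.choice(choices))
            targets.append(False)
    return letters, targets

def score_responses(responses, targets):
    """Accuracy of 'm' (match) / '-' (non-match) responses against ground truth."""
    correct = sum((r == "m") == t for r, t in zip(responses, targets))
    return correct / len(targets)

if __name__ == "__main__":
    letters, targets = make_verbal_nback_trials(n=2)
    # In an LLM evaluation, each letter would be presented turn by turn and the
    # model would answer 'm' for a match and '-' otherwise; here a dummy
    # all-non-match response is scored for illustration.
    print(score_responses(["-"] * len(letters), targets))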
@misc{gong_working_2024,
	title = {Working {Memory} {Capacity} of {ChatGPT}: {An} {Empirical} {Study}},
	shorttitle = {Working {Memory} {Capacity} of {ChatGPT}},
	url = {http://arxiv.org/abs/2305.03731},
	doi = {10.48550/arXiv.2305.03731},
	abstract = {Working memory is a critical aspect of both human intelligence and artificial intelligence, serving as a workspace for the temporary storage and manipulation of information. In this paper, we systematically assess the working memory capacity of ChatGPT, a large language model developed by OpenAI, by examining its performance in verbal and spatial n-back tasks under various conditions. Our experiments reveal that ChatGPT has a working memory capacity limit strikingly similar to that of humans. Furthermore, we investigate the impact of different instruction strategies on ChatGPT's performance and observe that the fundamental patterns of a capacity limit persist. From our empirical findings, we propose that n-back tasks may serve as tools for benchmarking the working memory capacity of large language models and hold potential for informing future efforts aimed at enhancing AI working memory.},
	urldate = {2024-02-16},
	publisher = {arXiv},
	author = {Gong, Dongyu and Wan, Xingchen and Wang, Dingmin},
	month = feb,
	year = {2024},
	note = {arXiv:2305.03731 [cs, q-bio]},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Quantitative Biology - Neurons and Cognition},
}
