Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., & Mordatch, I. arXiv:1706.02275 [cs], June 2017.


We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We then present an adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multiagent coordination. Additionally, we introduce a training regimen utilizing an ensemble of policies for each agent that leads to more robust multi-agent policies. We show the strength of our approach compared to existing methods in cooperative as well as competitive scenarios, where agent populations are able to discover various physical and informational coordination strategies.
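The abstract's "adaptation of actor-critic methods that considers action policies of other agents" can be illustrated with a minimal sketch, assuming each agent acts on its own observation while its critic is given every agent's observation and action. The sizes, the linear maps, and the names `act` and `centralized_q` are hypothetical placeholders for illustration, not the authors' networks or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2

def make_linear(in_dim, out_dim):
    """Random linear map standing in for a learned network (assumption)."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim))

# One actor per agent: input is that agent's own observation only.
actor_weights = [make_linear(OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]

# One critic per agent: input is every agent's observation and action.
critic_in_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
critic_weights = [make_linear(critic_in_dim, 1) for _ in range(N_AGENTS)]

def act(agent, obs):
    """Action for one agent, computed from its local observation."""
    return np.tanh(obs @ actor_weights[agent])

def centralized_q(agent, all_obs, all_actions):
    """Value estimate for one agent that depends on all agents' actions."""
    joint = np.concatenate([np.concatenate(all_obs), np.concatenate(all_actions)])
    return (joint @ critic_weights[agent]).item()

observations = [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]
actions = [act(i, obs) for i, obs in enumerate(observations)]
q_values = [centralized_q(i, observations, actions) for i in range(N_AGENTS)]
```

Because each critic in this sketch takes the other agents' actions as input, a change in another agent's behavior shows up as a change in the critic's input rather than as an unexplained shift in the environment, which is one way to read the non-stationarity concern the abstract raises.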

@article{lowe_multi-agent_2017,
	title = {Multi-{Agent} {Actor}-{Critic} for {Mixed} {Cooperative}-{Competitive} {Environments}},
	url = {http://arxiv.org/abs/1706.02275},
	abstract = {We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We then present an adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multiagent coordination. Additionally, we introduce a training regimen utilizing an ensemble of policies for each agent that leads to more robust multi-agent policies. We show the strength of our approach compared to existing methods in cooperative as well as competitive scenarios, where agent populations are able to discover various physical and informational coordination strategies.},
	language = {en},
	urldate = {2019-04-21},
	journal = {arXiv:1706.02275 [cs]},
	author = {Lowe, Ryan and Wu, Yi and Tamar, Aviv and Harb, Jean and Abbeel, Pieter and Mordatch, Igor},
	month = jun,
	year = {2017},
	note = {arXiv: 1706.02275},
	keywords = {Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Neural and Evolutionary Computing}
}
