Emergent Collective Behaviors in a Multi-agent Reinforcement Learning Pedestrian Simulation: A Case Study. Martinez-Gil, F., Lozano, M., & Fernández, F. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 9002, pages 228–238. 2015.
© Springer International Publishing Switzerland 2015. In this work, a Multi-agent Reinforcement Learning framework is used to generate simulations of groups of virtual pedestrians. The aim is to study the influence of two different learning approaches on the quality of the generated simulations. The case study consists of the simulation of two groups of embodied virtual agents crossing inside a narrow corridor. This scenario is a classic experiment in the pedestrian modeling area because a collective behavior, specifically lane formation, emerges with real pedestrians. The paper studies the influence of different learning algorithms, function approximation approaches, and knowledge transfer mechanisms on the performance of the learned pedestrian behaviors. Specifically, two different RL-based schemas are analyzed. The first one, Iterative Vector Quantization with Q-Learning (ITVQQL), iteratively improves a state-space generalizer based on vector quantization. The second scheme, named TS, uses tile coding as the generalization method with the Sarsa(λ) algorithm. The knowledge transfer approach is based on Probabilistic Policy Reuse to incorporate previously acquired knowledge into current learning processes; additionally, value function transfer is used in the ITVQQL schema to transfer the value function between consecutive iterations. Results demonstrate empirically that our RL framework generates individual behaviors from which the expected collective behavior emerges, as it does with real pedestrians. This collective behavior appears independently of the learning algorithm and the generalization method used, but depends strongly on whether knowledge transfer was applied. In addition, the use of transfer techniques has a remarkable influence on the final performance (measured as the number of times the task was solved) of the learned behaviors.
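The abstract names Probabilistic Policy Reuse as the knowledge transfer mechanism on which the results hinge. As a rough illustration of that general idea (not the authors' code), the following minimal Python sketch mixes a previously learned policy with ε-greedy exploitation of the current value estimates and decays the reuse probability over time; it pairs this with a plain tabular Q-learning backup rather than the ITVQQL/TS schemas of the paper. The environment interface (reset/step), the function name, and all parameter values (psi, nu, alpha, gamma, epsilon) are assumptions made only for this sketch.

import random
from collections import defaultdict

def pi_reuse_episode(env, past_policy, actions, Q=None,
                     psi=0.9, nu=0.95, alpha=0.1, gamma=0.99,
                     epsilon=0.1, max_steps=500):
    """One learning episode that reuses a transferred policy
    (past_policy: state -> action) with probability psi and otherwise
    acts epsilon-greedily on the current Q-table (hypothetical sketch)."""
    Q = Q if Q is not None else defaultdict(float)
    state = env.reset()
    for _ in range(max_steps):
        if random.random() < psi:
            action = past_policy(state)          # reuse transferred knowledge
        elif random.random() < epsilon:
            action = random.choice(actions)      # explore
        else:
            action = max(actions, key=lambda a: Q[(state, a)])  # exploit

        next_state, reward, done = env.step(action)  # assumed environment API

        # Plain tabular Q-learning backup on the experienced transition.
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

        state = next_state
        psi *= nu  # trust the reused policy less as learning progresses
        if done:
            break
    return Q

In the corridor task described above, past_policy would stand in for a policy learned in an earlier or simpler setting (or a previous iteration), and the decay factor nu controls how quickly the agent shifts from reused to newly learned behavior.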
@incollection{Martinez-Gil2015,
abstract = {{\textcopyright} Springer International Publishing Switzerland 2015. In this work, a Multi-agent Reinforcement Learning framework is used to generate simulations of groups of virtual pedestrians. The aim is to study the influence of two different learning approaches on the quality of the generated simulations. The case study consists of the simulation of two groups of embodied virtual agents crossing inside a narrow corridor. This scenario is a classic experiment in the pedestrian modeling area because a collective behavior, specifically lane formation, emerges with real pedestrians. The paper studies the influence of different learning algorithms, function approximation approaches, and knowledge transfer mechanisms on the performance of the learned pedestrian behaviors. Specifically, two different RL-based schemas are analyzed. The first one, Iterative Vector Quantization with Q-Learning (ITVQQL), iteratively improves a state-space generalizer based on vector quantization. The second scheme, named TS, uses tile coding as the generalization method with the Sarsa($\lambda$) algorithm. The knowledge transfer approach is based on Probabilistic Policy Reuse to incorporate previously acquired knowledge into current learning processes; additionally, value function transfer is used in the ITVQQL schema to transfer the value function between consecutive iterations. Results demonstrate empirically that our RL framework generates individual behaviors from which the expected collective behavior emerges, as it does with real pedestrians. This collective behavior appears independently of the learning algorithm and the generalization method used, but depends strongly on whether knowledge transfer was applied. In addition, the use of transfer techniques has a remarkable influence on the final performance (measured as the number of times the task was solved) of the learned behaviors.},
author = {Martinez-Gil, Francisco and Lozano, Miguel and Fern{\'{a}}ndez, Fernando},
booktitle = {Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)},
doi = {10.1007/978-3-319-14627-0_16},
isbn = {978-3-319-14626-3},
issn = {1611-3349},
keywords = {Pedestrian simulation,Policy Reuse,Tile coding,Transfer learning,Vector Quantization},
pages = {228--238},
title = {{Emergent Collective Behaviors in a Multi-agent Reinforcement Learning Pedestrian Simulation: A Case Study}},
url = {http://link.springer.com/10.1007/978-3-319-14627-0_16},
volume = {9002},
year = {2015}
}
