End-to-End Policy Gradient Method for POMDPs and Explainable Agents. Nishimori, S., Koyamada, S., & Ishii, S. April, 2023. arXiv:2304.09769 [cs]
Real-world decision-making problems are often partially observable, and many can be formulated as a Partially Observable Markov Decision Process (POMDP). When we apply reinforcement learning (RL) algorithms to the POMDP, reasonable estimation of the hidden states can help solve the problems. Furthermore, explainable decision-making is preferable, considering their application to real-world tasks such as autonomous driving cars. We proposed an RL algorithm that estimates the hidden states by end-to-end training and visualizes the estimation as a state-transition graph. Experimental results demonstrated that the proposed algorithm can solve simple POMDP problems and that the visualization makes the agent’s behavior interpretable to humans.
@misc{nishimori_end--end_2023,
	title = {End-to-{End} {Policy} {Gradient} {Method} for {POMDPs} and {Explainable} {Agents}},
	url = {http://arxiv.org/abs/2304.09769},
	abstract = {Real-world decision-making problems are often partially observable, and many can be formulated as a Partially Observable Markov Decision Process (POMDP). When we apply reinforcement learning (RL) algorithms to the POMDP, reasonable estimation of the hidden states can help solve the problems. Furthermore, explainable decision-making is preferable, considering their application to real-world tasks such as autonomous driving cars. We proposed an RL algorithm that estimates the hidden states by end-to-end training and visualizes the estimation as a state-transition graph. Experimental results demonstrated that the proposed algorithm can solve simple POMDP problems and that the visualization makes the agent’s behavior interpretable to humans.},
	language = {en},
	urldate = {2023-04-24},
	publisher = {arXiv},
	author = {Nishimori, Soichiro and Koyamada, Sotetsu and Ishii, Shin},
	month = apr,
	year = {2023},
	note = {arXiv:2304.09769 [cs]},
	keywords = {Computer Science - Artificial Intelligence},
}
