Module-Based Reinforcement Learning: Experiments with a Real Robot

Module-Based Reinforcement Learning: Experiments with a Real Robot. Kalmár, Z., Szepesvári, C., & Lörincz, A. Machine Learning, 31:1–2, 1998. Also appeared as: Z. Kalmár, C. Szepesvári, and A. Lörincz. Module-based reinforcement learning: Experiments with a real robot. Autonomous Robots, 5:273–295, 1998.

The behavior of reinforcement learning (RL) algorithms is best understood in completely observable, discrete-time controlled Markov chains with finite state and action spaces. In contrast, robot-learning domains are inherently continuous in both time and space, and moreover are partially observable. Here we suggest a systematic approach to solving such problems in which the available qualitative and quantitative knowledge is used to reduce the complexity of the learning task. The steps of the design process are to: i) decompose the task into subtasks using the qualitative knowledge at hand; ii) design local controllers to solve the subtasks using the available quantitative knowledge; and iii) learn a coordination of these controllers by means of reinforcement learning. It is argued that the approach enables fast, semi-automatic, but still high-quality robot control, as no fine-tuning of the local controllers is needed. The approach was verified on a non-trivial real-life robot task. Several RL algorithms were compared by ANOVA, and it was found that the model-based approach worked significantly better than the model-free approach. The learnt switching strategy performed comparably to a handcrafted version. Moreover, the learnt strategy seemed to exploit certain properties of the environment which were not foreseen in advance, thus supporting the view that adaptive algorithms have an advantage over non-adaptive ones in complex environments.

@article{zs.kalmar1998a,
	abstract = {The behavior of reinforcement learning (RL) algorithms is best understood in completely observable, discrete-time controlled Markov chains with finite state and action spaces. In contrast, robot-learning domains are inherently continuous in both time and space, and moreover are partially observable. Here we suggest a systematic approach to solving such problems in which the available qualitative and quantitative knowledge is used to reduce the complexity of the learning task. The steps of the design process are to: i) decompose the task into subtasks using the qualitative knowledge at hand; ii) design local controllers to solve the subtasks using the available quantitative knowledge; and iii) learn a coordination of these controllers by means of reinforcement learning. It is argued that the approach enables fast, semi-automatic, but still high-quality robot control, as no fine-tuning of the local controllers is needed. The approach was verified on a non-trivial real-life robot task. Several RL algorithms were compared by ANOVA, and it was found that the model-based approach worked significantly better than the model-free approach. The learnt switching strategy performed comparably to a handcrafted version. Moreover, the learnt strategy seemed to exploit certain properties of the environment which were not foreseen in advance, thus supporting the view that adaptive algorithms have an advantage over non-adaptive ones in complex environments.},
	author = {Kalm{\'a}r, Zs. and Szepesv{\'a}ri, Cs. and L{\"o}rincz, A.},
	date-modified = {2010-09-02 13:09:16 -0600},
	journal = {Machine Learning},
	keywords = {robotics, application, hierarchical reinforcement learning, reinforcement learning, macro learning, theory},
	note = {Also appeared as: Z. Kalm{\'a}r, C. Szepesv{\'a}ri, and A. L{\"o}rincz. Module-based reinforcement learning: Experiments with a real robot. Autonomous Robots, 5:273--295, 1998.},
	owner = {Beata},
	pages = {1--2},
	timestamp = {2010.08.30},
	title = {Module-Based Reinforcement Learning: Experiments with a Real Robot},
	url_paper = {ml-98.ps.pdf},
	volume = {31},
	year = {1998}}
