Towards the Integration of Reinforcement Learning into MASPY. Mellado, A. L. L., Borges, A. P., Cardoso, R. C., & Alves, G. V. In Workshop-Escola de Sistemas de Agentes, seus Ambientes e Aplicações (WESAAC), pages 21–28, September 2025.
@inproceedings{mellado_towards_2025,
	title = {Towards the {Integration} of {Reinforcement} {Learning} into {MASPY}},
	copyright = {Copyright (c)},
	issn = {2326-5434},
	url = {https://sol.sbc.org.br/index.php/wesaac/article/view/37544},
	doi = {10.5753/wesaac.2025.37544},
	abstract = {Learning in symbolic agent architectures remains a key challenge in the development of adaptive multi-agent systems. This paper introduces a learning module for MASPY, a Python-based framework inspired by the Belief-Desire-Intention (BDI) model. The module enables agents to learn optimal actions using tabular reinforcement learning algorithms, such as Q-Learning and SARSA. To support this, we propose the SART methodology, which decomposes the learning environment into four structured components: States, Actions, Rewards, and Transitions. This structure allows MASPY agents to perceive their environment through defined percepts, act through decorated functions, and adapt over time using discrete learning strategies. The learning module offers a unified Python-based architecture for symbolic reasoning agents that learn through reinforcement training. This is shown practically with a toy problem where agents are able to learn to execute the actions of a previously unknown environment.},
	language = {en},
	urldate = {2026-03-27},
	booktitle = {Workshop-{Escola} de {Sistemas} de {Agentes}, seus {Ambientes} e {Aplicações} ({WESAAC})},
	author = {Mellado, Alexandre L. L. and Borges, André Pinz and Cardoso, Rafael C. and Alves, Gleifer Vaz},
	month = sep,
	year = {2025},
	pages = {21--28},
}
