Reward-Based Environment States for Robot Manipulation Policy Learning. Mouliets, C., Ferrané, I., & Cuayahuitl, H. In NeurIPS 2021 Workshop on Deployable Decision Making in Embodied Systems (DDM), December, 2021.
Training robot manipulation policies is a challenging and open problem in robotics and artificial intelligence. In this paper we propose a novel and compact state representation based on the rewards predicted from an image-based task success classifier. Our experiments, using the Pepper robot in simulation with two deep reinforcement learning algorithms on a grab-and-lift task, reveal that our proposed state representation can achieve up to 97% task success using our best policies.
@inproceedings{lincoln47522,
       booktitle = {NeurIPS 2021 Workshop on Deployable Decision Making in Embodied Systems (DDM)},
           month = {December},
           title = {Reward-Based Environment States for Robot Manipulation Policy Learning},
          author = {C{\'e}d{\'e}rick Mouliets and Isabelle Ferran{\'e} and Heriberto Cuayahuitl},
            year = {2021},
             url = {https://eprints.lincoln.ac.uk/id/eprint/47522/},
         abstract = {Training robot manipulation policies is a challenging and open problem in robotics and artificial intelligence. In this paper we propose a novel and compact state representation based on the rewards predicted from an image-based task success classifier. Our experiments{--}using the Pepper robot in simulation with two deep reinforcement learning algorithms on a grab-and-lift task{--}reveal that our proposed state representation can achieve up to 97\% task success using our best policies.}
}
