Training an Actor-Critic Reinforcement Learning Controller for Arm Movement Using Human-Generated Rewards

Training an Actor-Critic Reinforcement Learning Controller for Arm Movement Using Human-Generated Rewards. Jagodnik, K. M., Thomas, P. S., van den Bogert, A. J., Branicky, M. S., & Kirsch, R. F. IEEE Transactions on Neural Systems and Rehabilitation Engineering: A Publication of the IEEE Engineering in Medicine and Biology Society, 25(10):1892–1905, 2017.

Functional Electrical Stimulation (FES) employs neuroprostheses to apply electrical current to the nerves and muscles of individuals paralyzed by spinal cord injury to restore voluntary movement. Neuroprosthesis controllers calculate stimulation patterns to produce desired actions. To date, no existing controller is able to efficiently adapt its control strategy to the wide range of possible physiological arm characteristics, reaching movements, and user preferences that vary over time. Reinforcement learning (RL) is a control strategy that can incorporate human reward signals as inputs to allow human users to shape controller behavior. In this paper, ten neurologically intact human participants assigned subjective numerical rewards to train RL controllers, evaluating animations of goal-oriented reaching tasks performed using a planar musculoskeletal human arm simulation. The RL controller learning achieved using human trainers was compared with learning accomplished using human-like rewards generated by an algorithm; metrics included success at reaching the specified target; time required to reach the target; and target overshoot. Both sets of controllers learned efficiently and with minimal differences, significantly outperforming standard controllers. Reward positivity and consistency were found to be unrelated to learning success. These results suggest that human rewards can be used effectively to train RL-based FES controllers.

@article{jagodnik_training_2017,
	title = {Training an {Actor}-{Critic} {Reinforcement} {Learning} {Controller} for {Arm} {Movement} {Using} {Human}-{Generated} {Rewards}},
	volume = {25},
	copyright = {All rights reserved},
	issn = {1558-0210},
	doi = {10.1109/TNSRE.2017.2700395},
	abstract = {Functional Electrical Stimulation (FES) employs neuroprostheses to apply electrical current to the nerves and muscles of individuals paralyzed by spinal cord injury to restore voluntary movement. Neuroprosthesis controllers calculate stimulation patterns to produce desired actions. To date, no existing controller is able to efficiently adapt its control strategy to the wide range of possible physiological arm characteristics, reaching movements, and user preferences that vary over time. Reinforcement learning (RL) is a control strategy that can incorporate human reward signals as inputs to allow human users to shape controller behavior. In this paper, ten neurologically intact human participants assigned subjective numerical rewards to train RL controllers, evaluating animations of goal-oriented reaching tasks performed using a planar musculoskeletal human arm simulation. The RL controller learning achieved using human trainers was compared with learning accomplished using human-like rewards generated by an algorithm; metrics included success at reaching the specified target; time required to reach the target; and target overshoot. Both sets of controllers learned efficiently and with minimal differences, significantly outperforming standard controllers. Reward positivity and consistency were found to be unrelated to learning success. These results suggest that human rewards can be used effectively to train RL-based FES controllers.},
	language = {eng},
	number = {10},
	journal = {{IEEE} {Transactions} on {Neural} {Systems} and {Rehabilitation} {Engineering}: {A} {Publication} of the {IEEE} {Engineering} in {Medicine} and {Biology} {Society}},
	author = {Jagodnik, Kathleen M. and Thomas, Philip S. and van den Bogert, Antonie J. and Branicky, Michael S. and Kirsch, Robert F.},
	year = {2017},
	pmid = {28475063},
	pmcid = {PMC7523734},
	keywords = {Adult, Algorithms, Arm, Artificial Intelligence, Biomechanical Phenomena, Electric Stimulation Therapy, Female, Healthy Volunteers, Humans, Learning, Machine Learning, Male, Motor Skills, Movement, Neural Networks, Computer, Neural Prostheses, Reinforcement, Psychology, Reward, Shoulder},
	pages = {1892--1905},
}
