Bayesian Optimisation for Safe Navigation Under Localisation Uncertainty.
Oliveira, R.; Ott, L.; Guizilini, V.; and Ramos, F.
In Robotics Research: The 18th International Symposium ISRR, pages 489-504. Amato, N. M.; Hager, G.; Thomas, S.; and Torres-Torriti, M., editors. Springer International Publishing, 2020.
@inbook{10.1007/978-3-030-28619-4_37,
  chapter   = {Bayesian Optimisation for Safe Navigation Under Localisation Uncertainty},
  title     = {Robotics Research: The 18th International Symposium ISRR},
  author    = {Oliveira, Rafael and Ott, Lionel and Guizilini, Vitor and Ramos, Fabio},
  editor    = {Amato, Nancy M. and Hager, Greg and Thomas, Shawna and Torres-Torriti, Miguel},
  year      = {2020},
  pages     = {489--504},
  publisher = {Springer International Publishing},
  address   = {Cham},
  url       = {https://link.springer.com/chapter/10.1007/978-3-030-28619-4_37},
  keywords  = {Bayesian optimisation, uncertain inputs},
  abstract  = {In outdoor environments, mobile robots are required to navigate through terrain with varying characteristics, some of which might significantly affect the integrity of the platform. Ideally, the robot should be able to identify areas that are safe for navigation based on its own percepts about the environment while avoiding damage to itself. Bayesian optimisation (BO) has been successfully applied to the task of learning a model of terrain traversability while guiding the robot through more traversable areas. An issue, however, is that localisation uncertainty can end up guiding the robot to unsafe areas and distort the model being learnt. In this paper, we address this problem and present a novel method that allows BO to consider localisation uncertainty by applying a Gaussian process model for uncertain inputs as a prior. We evaluate the proposed method in simulation and in experiments with a real robot navigating over rough terrain and compare it against standard BO methods.}
}
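The Gaussian process prior for uncertain inputs mentioned in this abstract typically rests on the expected kernel under Gaussian input noise, which has a closed form for the RBF kernel. The sketch below is a generic textbook version of that expectation, not the paper's actual implementation; all names and parameter values are illustrative.

```python
import numpy as np

def expected_rbf_kernel(mu, Sigma, x2, lengthscale=1.0, variance=1.0):
    """E[k(x, x2)] for an RBF kernel when x ~ N(mu, Sigma) and x2 is known.

    Closed form: variance * |I + Sigma / l^2|^{-1/2}
                 * exp(-0.5 (mu - x2)^T (Sigma + l^2 I)^{-1} (mu - x2)).
    """
    d = len(mu)
    l2 = lengthscale ** 2
    S = Sigma + l2 * np.eye(d)
    diff = mu - x2
    norm = np.linalg.det(np.eye(d) + Sigma / l2) ** -0.5
    return variance * norm * np.exp(-0.5 * diff @ np.linalg.solve(S, diff))

# With zero input noise this reduces to the standard RBF kernel:
mu = np.array([0.5, -0.2])
x2 = np.array([0.0, 0.3])
no_noise = expected_rbf_kernel(mu, np.zeros((2, 2)), x2)
standard = np.exp(-0.5 * np.sum((mu - x2) ** 2))
```

Note how localisation noise (a non-zero `Sigma`) shrinks the kernel value at the mean: the determinant factor falls below one, reflecting the reduced confidence in observations taken at uncertain locations.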
DISCO: Double Likelihood-free Inference Stochastic Control.
Barcelos, L.; Oliveira, R.; Possas, R.; Ott, L.; and Ramos, F.
In 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020. IEEE.
@inproceedings{Barcelos2020,
  title     = {DISCO: Double Likelihood-free Inference Stochastic Control},
  author    = {Barcelos, Lucas and Oliveira, Rafael and Possas, Rafael and Ott, Lionel and Ramos, Fabio},
  booktitle = {2020 IEEE International Conference on Robotics and Automation (ICRA)},
  year      = {2020},
  publisher = {IEEE},
  address   = {Paris, France},
  url       = {http://arxiv.org/abs/2002.07379},
  keywords  = {approximate inference, model predictive control},
  abstract  = {Accurate simulation of complex physical systems enables the development, testing, and certification of control strategies before they are deployed into the real systems. As simulators become more advanced, the analytical tractability of the differential equations and associated numerical solvers incorporated in the simulations diminishes, making them difficult to analyse. A potential solution is the use of probabilistic inference to assess the uncertainty of the simulation parameters given real observations of the system. Unfortunately, the likelihood function required for inference is generally expensive to compute or totally intractable. In this paper, we propose to leverage the power of modern simulators and recent techniques in Bayesian statistics for likelihood-free inference to design a control framework that is efficient and robust with respect to the uncertainty over simulation parameters. The posterior distribution over simulation parameters is propagated through a potentially non-analytical model of the system with the unscented transform, within a variant of information-theoretic model predictive control. This approach provides a more efficient way to evaluate trajectory rollouts than Monte Carlo sampling, reducing the online computation burden. Experiments show that the proposed controller attained superior performance and robustness on classical control and robotics tasks when compared to models not accounting for the uncertainty over model parameters.}
}
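The unscented transform named in this abstract propagates a Gaussian through a nonlinear map using a small deterministic set of sigma points rather than Monte Carlo samples. Below is a minimal textbook sketch of the transform, not the DISCO code; `alpha`, `beta`, and `kappa` are the usual UT tuning constants and the values here are illustrative.

```python
import numpy as np

def unscented_transform(mu, Sigma, f, alpha=1.0, beta=2.0, kappa=0.0):
    """Propagate N(mu, Sigma) through a nonlinear function f
    using 2n + 1 deterministically chosen sigma points."""
    n = len(mu)
    lam = alpha ** 2 * (n + kappa) - n
    sqrt_S = np.linalg.cholesky((n + lam) * Sigma)
    # The mean plus symmetric offsets along the matrix square-root columns.
    points = np.vstack([mu] + [mu + sqrt_S[:, i] for i in range(n)]
                       + [mu - sqrt_S[:, i] for i in range(n)])
    wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))  # mean weights
    wc = wm.copy()                                     # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + (1.0 - alpha ** 2 + beta)
    ys = np.array([f(p) for p in points])
    mean = wm @ ys
    diffs = ys - mean
    cov = (wc[:, None] * diffs).T @ diffs
    return mean, cov

# For a linear map the transform is exact, a handy sanity check:
mu = np.array([1.0, 2.0])
Sigma = np.array([[0.2, 0.05], [0.05, 0.1]])
A = np.array([[1.0, 2.0], [0.0, 1.0]])
m, C = unscented_transform(mu, Sigma, lambda x: A @ x)
```

The appeal relative to Monte Carlo rollouts, as the abstract notes, is that only 2n + 1 evaluations of the (possibly expensive) simulator model are needed per propagation step.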
Active Learning of Conditional Mean Embeddings via Bayesian Optimisation.
Chowdhury, S. R.; Oliveira, R.; and Ramos, F.
In Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI), 2020. PMLR, volume 124.
@inproceedings{Chowdhury2020,
  title     = {Active Learning of Conditional Mean Embeddings via Bayesian Optimisation},
  author    = {Chowdhury, Sayak Ray and Oliveira, Rafael and Ramos, Fabio},
  booktitle = {Proceedings of the 36th Conference on Uncertainty in Artificial Intelligence (UAI)},
  year      = {2020},
  publisher = {PMLR, volume 124},
  address   = {Toronto, Canada},
  url       = {http://auai.org/uai2020/accepted.php},
  keywords  = {Bayesian optimisation, RKHS, approximate inference, kernel embeddings, regret bounds}
}
Online BayesSim for combined simulator parameter inference and policy improvement.
Possas, R.; Barcelos, L.; Oliveira, R.; Fox, D.; and Ramos, F.
In IEEE International Conference on Intelligent Robots and Systems, pages 5445-5452, 2020.
@inproceedings{Possas2020,
  title     = {Online BayesSim for combined simulator parameter inference and policy improvement},
  author    = {Possas, Rafael and Barcelos, Lucas and Oliveira, Rafael and Fox, Dieter and Ramos, Fabio},
  booktitle = {IEEE International Conference on Intelligent Robots and Systems},
  year      = {2020},
  pages     = {5445--5452},
  doi       = {10.1109/IROS45743.2020.9341401},
  abstract  = {Recent advancements in Bayesian likelihood-free inference enable a probabilistic treatment for the problem of estimating simulation parameters and their uncertainty given sequences of observations. Domain randomization can be performed much more effectively when a posterior distribution provides the correct uncertainty over parameters in a simulated environment. In this paper, we study the integration of simulation parameter inference with both model-free reinforcement learning and model-based control in a novel sequential algorithm that alternates between learning a better estimation of parameters and improving the controller. This approach exploits the interdependence between the two problems to generate computational efficiencies and improved reliability when a black-box simulator is available. Experimental results suggest that both control strategies have better performance when compared to traditional domain randomization methods.}
}
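BayesSim represents the simulator-parameter posterior as a mixture of Gaussians, so posterior-driven domain randomisation amounts to drawing each rollout's parameters from that mixture instead of a hand-tuned uniform range. The snippet below is a minimal illustrative sketch of that sampling step, not the paper's implementation; the two-component "friction" posterior is entirely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sim_params(weights, means, covs, n):
    """Draw n simulator-parameter vectors from a Gaussian-mixture posterior."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return np.array([rng.multivariate_normal(means[k], covs[k]) for k in comps])

# Hypothetical two-component posterior over a single friction parameter:
weights = np.array([0.7, 0.3])
means = [np.array([0.4]), np.array([0.9])]
covs = [0.01 * np.eye(1), 0.02 * np.eye(1)]
params = sample_sim_params(weights, means, covs, 1000)
```

Each sampled vector would parameterise one simulated rollout, concentrating training on the parameter regions the posterior actually supports.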
No-Regret reinforcement learning with value function approximation: a kernel embedding approach.
Chowdhury, S. R.; and Oliveira, R.
arXiv, 2020.
@article{Chowdhury2020a,
  title    = {No-Regret reinforcement learning with value function approximation: a kernel embedding approach},
  author   = {Chowdhury, Sayak Ray and Oliveira, Rafael},
  journal  = {arXiv},
  year     = {2020},
  keywords = {Kernel mean embedding, Model-based RL, Value function approximation},
  abstract = {We consider the regret minimization problem in reinforcement learning (RL) in the episodic setting. In many real-world RL environments, the state and action spaces are continuous or very large. Existing approaches establish regret guarantees by either a low-dimensional representation of the stochastic transition model or an approximation of the Q-functions. However, the understanding of function approximation schemes for state-value functions largely remains missing. In this paper, we propose an online model-based RL algorithm, namely the CME-RL, that learns representations of transition distributions as embeddings in a reproducing kernel Hilbert space while carefully balancing the exploitation-exploration tradeoff. We demonstrate the efficiency of our algorithm by proving a frequentist (worst-case) regret bound that is of order \tilde{O}(H \gamma_N \sqrt{N}), where H is the episode length, N is the total number of time steps, and \gamma_N is an information-theoretic quantity related to the effective dimension of the state-action feature space. Our method bypasses the need for estimating transition probabilities and applies to any domain on which kernels can be defined. It also brings new insights into the general theory of kernel methods for approximate inference and RL regret minimization.}
}