Uncertainty-aware Model-based Policy Optimization

06/25/2019
by Tung-Long Vuong, et al.

Model-based reinforcement learning has the potential to be more sample efficient than model-free approaches. However, existing model-based methods are vulnerable to model bias, which leads to poorer generalization and asymptotic performance than their model-free counterparts. In addition, they are typically based on the model predictive control (MPC) framework, which is computationally inefficient at decision time and, because it lacks an explicit policy representation, does not enable policy transfer. In this paper, we propose a novel uncertainty-aware model-based policy optimization framework that addresses these issues. In this framework, the agent simultaneously learns an uncertainty-aware dynamics model and optimizes the policy according to the learned model. In the optimization step, the policy gradient is computed by automatic differentiation through the model. With respect to sample efficiency alone, our approach shows promising results on challenging continuous control benchmarks, with competitive asymptotic performance and significantly lower sample complexity than state-of-the-art baselines.
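To make the idea of computing a policy gradient by differentiating through a learned model concrete, here is a minimal sketch in plain Python. It is not the paper's implementation, and everything in it is an illustrative assumption: uncertainty is represented by a small hand-written ensemble of 1-D linear models (`ENSEMBLE`), the policy is a single-gain linear controller `u = -k * s`, the reward is quadratic, and a tiny forward-mode dual-number class (`Dual`) stands in for a real automatic differentiation library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Dual number val + grad*eps, used for forward-mode automatic differentiation."""
    val: float
    grad: float = 0.0

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.grad + o.grad)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.grad - o.grad)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule: (f*g)' = f'*g + f*g'
        return Dual(self.val * o.val, self.grad * o.val + self.val * o.grad)
    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.grad)


# Hypothetical "learned" ensemble: each member is s' = a*s + b*u.
# Disagreement between members is a crude stand-in for model uncertainty.
ENSEMBLE = [(0.9, 0.5), (1.0, 0.4), (1.1, 0.6)]


def model_return(k, s0=1.0, horizon=5):
    """Average return of the policy u = -k*s over imagined rollouts in every ensemble member."""
    total = Dual.lift(0.0)
    for a, b in ENSEMBLE:
        s = Dual.lift(s0)
        for _ in range(horizon):
            u = -(k * s)                             # policy action
            total = total - (s * s + 0.1 * (u * u))  # reward = -(s^2 + 0.1*u^2)
            s = a * s + b * u                        # step through the model
    return total * (1.0 / len(ENSEMBLE))


def policy_gradient(k):
    """Return (J(k), dJ/dk), obtained by seeding k with derivative 1."""
    out = model_return(Dual(k, 1.0))
    return out.val, out.grad


def optimize(k=0.0, lr=0.01, steps=100):
    """Plain gradient ascent on the model-predicted return."""
    for _ in range(steps):
        _, g = policy_gradient(k)
        k += lr * g
    return k
```

Calling `optimize()` returns a gain whose model-predicted return is higher than that of the initial `k = 0.0`. Averaging over ensemble members means the gradient favors gains that do well under every plausible model rather than under a single, possibly biased, one. Deep learning frameworks typically use reverse-mode differentiation for policies with many parameters; forward mode is used here only because it fits in a few lines.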


