Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

08/08/2017
by Anusha Nagabandi, et al.

Model-free deep reinforcement learning algorithms can learn a wide range of robotic skills, but they typically require a very large number of samples to achieve good performance. Model-based algorithms can, in principle, learn much more efficiently, but they have proven difficult to extend to expressive, high-capacity models such as deep neural networks. In this work, we demonstrate that medium-sized neural network models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits that accomplish various complex locomotion tasks. We also propose using deep neural network dynamics models to initialize a model-free learner, in order to combine the sample efficiency of model-based approaches with the high task-specific performance of model-free methods. We empirically demonstrate on MuJoCo locomotion tasks that our pure model-based approach, trained on just random action data, can follow arbitrary trajectories with excellent sample efficiency, and that our hybrid algorithm can accelerate model-free learning on high-speed benchmark tasks, achieving sample efficiency gains of 3-5x on swimmer, cheetah, hopper, and ant agents. Videos can be found at https://sites.google.com/view/mbmf
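The abstract does not spell out how the learned dynamics model is used inside MPC, so the following is only a minimal sketch under stated assumptions. It assumes a random-shooting planner (sample candidate action sequences, roll each one out through the model, execute the first action of the best sequence, then replan). The names `dynamics_model`, `reward_fn`, `mpc_random_shooting`, and `run_episode` are hypothetical, and a hand-written 1-D function stands in for the trained neural network.

```python
import random


def dynamics_model(state, action):
    """Hypothetical stand-in for a learned model f(s, a) -> s'.

    In the setting described above this would be a trained neural network;
    here it is a 1-D point whose position shifts by 0.1 * action.
    """
    return state + 0.1 * action


def reward_fn(state, action, target=1.0):
    """Assumed task reward: negative squared distance to a target position."""
    return -(state - target) ** 2


def mpc_random_shooting(state, model, reward, horizon=10, n_candidates=200, rng=None):
    """Return the first action of the best randomly sampled action sequence."""
    rng = rng or random.Random(0)
    best_return = float("-inf")
    best_first_action = 0.0
    for _ in range(n_candidates):
        # Sample one candidate action sequence uniformly from [-1, 1].
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        # Roll the sequence out through the model and sum predicted rewards.
        s, total = state, 0.0
        for a in actions:
            s = model(s, a)
            total += reward(s, a)
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action


def run_episode(steps=50, seed=0):
    """Replan at every step, executing only the first planned action."""
    rng = random.Random(seed)
    state = 0.0
    for _ in range(steps):
        action = mpc_random_shooting(state, dynamics_model, reward_fn, rng=rng)
        # Assumption for illustration: the real system matches the model exactly.
        state = dynamics_model(state, action)
    return state


if __name__ == "__main__":
    print(run_episode())
```

Replanning at every step is what lets an imperfect model remain useful: only the first action of each plan is executed, so prediction errors further along the horizon are corrected at the next step rather than accumulated.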


research
07/04/2018

Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

Integrating model-free and model-based approaches in reinforcement learn...
research
10/07/2021

Evaluating model-based planning and planner amortization for continuous control

There is a widespread intuition that model-based control methods should ...
research
03/09/2022

SAGE: Generating Symbolic Goals for Myopic Models in Deep Reinforcement Learning

Model-based reinforcement learning algorithms are typically more sample ...
research
05/29/2023

Perimeter Control Using Deep Reinforcement Learning: A Model-free Approach towards Homogeneous Flow Rate Optimization

Perimeter control maintains high traffic efficiency within protected reg...
research
10/22/2020

Optimising Stochastic Routing for Taxi Fleets with Model Enhanced Reinforcement Learning

The future of mobility-as-a-Service (Maas) should embrace an integrated s...
research
06/08/2020

Maximum Entropy Model Rollouts: Fast Model Based Policy Optimization without Compounding Errors

Model usage is the central challenge of model-based reinforcement learni...
research
05/09/2017

Deep Episodic Value Iteration for Model-based Meta-Reinforcement Learning

We present a new deep meta reinforcement learner, which we call Deep Epi...
