Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future

03/05/2019
by   Nan Rosemary Ke, et al.
14

In model-based reinforcement learning, the agent interleaves between model learning and planning. These two components are inextricably intertwined. If the model is not able to provide sensible long-term prediction, the executed planner would exploit model flaws, which can yield catastrophic failures. This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration. To this end, we build a latent-variable autoregressive model by leveraging recent ideas in variational inference. We argue that forcing latent variables to carry future information through an auxiliary task substantially improves long-term predictions. Moreover, by planning in the latent space, the planner's solution is ensured to be within regions where the model is valid. An exploration strategy can be devised by searching for unlikely trajectories under the model. Our method achieves higher reward faster compared to baselines on a variety of tasks and environments in both the imitation learning and model-based reinforcement learning settings.

READ FULL TEXT
research
12/16/2020

Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning

Accurately predicting the dynamics of robotic systems is crucial for mod...
research
06/24/2021

Model-Based Reinforcement Learning via Latent-Space Collocation

The ability to plan into the future while utilizing only raw high-dimens...
research
06/25/2021

Predictive Control Using Learned State Space Models via Rolling Horizon Evolution

A large part of the interest in model-based reinforcement learning deriv...
research
11/06/2019

MBCAL: A Simple and Efficient Reinforcement Learning Method for Recommendation Systems

It has been widely regarded that only considering the immediate user fee...
research
10/17/2020

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning

Efficient exploration remains a challenging problem in reinforcement lea...
research
10/27/2020

Learning to Plan Optimistically: Uncertainty-Guided Deep Exploration via Latent Model Ensembles

Learning complex behaviors through interaction requires coordinated long...
research
03/19/2020

Learning to Fly via Deep Model-Based Reinforcement Learning

Learning to control robots without requiring models has been a long-term...

Please sign up or login with your details

Forgot password? Click here to reset