Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models

10/09/2019
by   Arunkumar Byravan, et al.

Humans are masters at quickly learning many complex tasks, relying on an approximate understanding of the dynamics of their environments. In much the same way, we would like our learning agents to quickly adapt to new tasks. In this paper, we explore how model-based Reinforcement Learning (RL) can facilitate transfer to new tasks. We develop an algorithm that learns an action-conditional, predictive model of expected future observations, rewards and values from which a policy can be derived by following the gradient of the estimated value along imagined trajectories. We show how robust policy optimization can be achieved in robot manipulation tasks even with approximate models that are learned directly from vision and proprioception. We evaluate the efficacy of our approach in a transfer learning scenario, re-using previously learned models on tasks with different reward structures and visual distractors, and show a significant improvement in learning speed compared to strong off-policy baselines. Videos with results can be found at https://sites.google.com/view/ivg-corl19
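
The core idea in the abstract, following the gradient of an estimated value along imagined trajectories, can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the latent state is a single scalar, the dynamics, reward, and value models are fixed hand-written functions rather than networks learned from vision and proprioception, the policy is a single gain `theta`, and the derivative is propagated by hand through the rollout instead of by automatic differentiation. The names `imagined_value` and `optimize_policy` are hypothetical.

```python
def imagined_value(theta, z0, horizon=5, gamma=0.99,
                   a=0.9, b=0.5, ctrl_cost=0.1, v_coef=2.0):
    """Roll a toy latent model forward and return (value estimate, d value / d theta).

    Assumed toy models (not from the paper):
      policy    u = theta * z
      dynamics  z' = a * z + b * u
      reward    r = -z^2 - ctrl_cost * u^2
      value     V(z) = -v_coef * z^2   (bootstraps the end of the rollout)
    """
    z, dz = z0, 0.0            # latent state and its derivative w.r.t. theta
    value, dvalue = 0.0, 0.0   # running estimate and its derivative
    discount = 1.0
    for _ in range(horizon):
        u = theta * z                      # imagined action
        du = z + theta * dz                # product rule
        r = -(z * z) - ctrl_cost * u * u   # predicted reward
        dr = -2.0 * z * dz - 2.0 * ctrl_cost * u * du
        value += discount * r
        dvalue += discount * dr
        # Step the model; the derivative flows through the dynamics as well.
        z, dz = a * z + b * u, a * dz + b * du
        discount *= gamma
    # Bootstrap with the value model at the final imagined state.
    value += discount * (-v_coef * z * z)
    dvalue += discount * (-2.0 * v_coef * z * dz)
    return value, dvalue


def optimize_policy(theta, z0, lr=0.01, steps=300):
    """Plain gradient ascent on the imagined value estimate."""
    for _ in range(steps):
        _, grad = imagined_value(theta, z0)
        theta += lr * grad
    return theta
```

Calling `optimize_policy(0.0, 1.0)` starts from a policy that takes no action and moves `theta` in the direction that raises the imagined value estimate. In a realistic setting the hand-derived derivatives would be replaced by backpropagation through learned models.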

research
08/28/2018

SOLAR: Deep Structured Latent Representations for Model-Based Reinforcement Learning

Model-based reinforcement learning (RL) methods can be broadly categoriz...
research
02/21/2020

Estimating Q(s,s') with Deep Deterministic Dynamics Gradients

In this paper, we introduce a novel form of value function, Q(s, s'), th...
research
09/25/2018

Floyd-Warshall Reinforcement Learning: Learning from Past Experiences to Reach New Goals

Consider multi-goal tasks that involve static environments and dynamic g...
research
02/14/2020

Learning Functionally Decomposed Hierarchies for Continuous Control Tasks

Solving long-horizon sequential decision making tasks in environments wi...
research
05/21/2020

Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning

Traditional robotic approaches rely on an accurate model of the environm...
research
02/08/2020

Generalized Hidden Parameter MDPs: Transferable Model-based RL in a Handful of Trials

There is broad interest in creating RL agents that can solve many (relat...
research
02/08/2023

Investigating the role of model-based learning in exploration and transfer

State of the art reinforcement learning has enabled training agents on t...
