Model-Based Visual Planning with Self-Supervised Functional Distances

12/30/2020
by   Stephen Tian, et al.

A generalist robot must be able to complete a variety of tasks in its environment. One appealing way to specify each task is in terms of a goal observation. However, learning goal-reaching policies with reinforcement learning remains a challenging problem, particularly when hand-engineered reward functions are not available. Learned dynamics models are a promising approach for learning about the environment without rewards or task-directed data, but planning to reach goals with such a model requires a notion of functional similarity between observations and goal states. We present a self-supervised method for model-based visual goal reaching, which uses both a visual dynamics model and a dynamical distance function learned with model-free reinforcement learning. Our approach learns entirely from offline, unlabeled data, making it practical to scale to large and diverse datasets. In our experiments, we find that our method can successfully learn models that perform a variety of tasks at test time, moving objects amid distractors with a simulated robotic arm and even learning to open and close a drawer using a real-world robot. In comparisons, we find that this approach substantially outperforms both model-free and model-based prior methods. Videos and visualizations are available here: http://sites.google.com/berkeley.edu/mbold.
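The abstract describes planning toward a goal observation by rolling out a learned visual dynamics model and scoring candidate action sequences with a learned dynamical distance to the goal. The sketch below illustrates that general recipe with a cross-entropy-method (CEM) planner; it is not the authors' implementation, and the `dynamics_fn` and `distance_fn` callables, array shapes, and hyperparameters are placeholder assumptions standing in for the learned networks.

```python
# Minimal sketch (not the paper's code): CEM planning with a learned dynamics
# model and a learned goal-conditioned distance function. Both models are
# stand-in callables; in the paper they are trained from offline, unlabeled data.
import numpy as np

def cem_plan(dynamics_fn, distance_fn, state, goal,
             horizon=10, action_dim=4, pop_size=200, elites=20, iters=3):
    """Search for an action sequence whose predicted final state is close to the goal."""
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(iters):
        # Sample candidate action sequences from the current Gaussian.
        actions = mean + std * np.random.randn(pop_size, horizon, action_dim)
        costs = []
        for seq in actions:
            s = state
            for a in seq:                       # roll out the learned dynamics model
                s = dynamics_fn(s, a)
            costs.append(distance_fn(s, goal))  # score with the learned distance
        elite = actions[np.argsort(costs)[:elites]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean[0]  # first action of the optimized sequence

# Dummy stand-ins for the learned networks, just to make the sketch runnable.
dynamics_fn = lambda s, a: s + 0.1 * a
distance_fn = lambda s, g: float(np.linalg.norm(s - g))

state, goal = np.zeros(4), np.ones(4)
print(cem_plan(dynamics_fn, distance_fn, state, goal))
```

In a model-predictive-control loop, only the first action of the optimized sequence would be executed before replanning from the newly observed state.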

Related research

07/18/2019 - Dynamical Distance Learning for Unsupervised and Semi-Supervised Skill Discovery
Reinforcement learning requires manual specification of a reward functio...

07/14/2020 - Goal-Aware Prediction: Learning to Model What Matters
Learned dynamics models combined with both planning and policy learning ...

03/31/2017 - Learning Visual Servoing with Deep Features and Fitted Q-Iteration
Visual servoing involves choosing actions that move a robot in response ...

09/29/2017 - Self-supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation
Enabling robots to autonomously navigate complex environments is essenti...

05/04/2022 - State Representation Learning for Goal-Conditioned Reinforcement Learning
This paper presents a novel state representation for reward-free Markov ...

06/22/2023 - Learning from Visual Observation via Offline Pretrained State-to-Go Transformer
Learning from visual observation (LfVO), aiming at recovering policies f...

09/30/2018 - Few-Shot Goal Inference for Visuomotor Learning and Planning
Reinforcement learning and planning methods require an objective or rewa...
