Discovering and Achieving Goals via World Models

10/18/2021
by Russell Mendonca, et al.

How can artificial agents learn to solve many diverse tasks in complex visual environments in the absence of any supervision? We decompose this question into two problems: discovering new goals and learning to reliably achieve them. We introduce Latent Explorer Achiever (LEXA), a unified solution to both problems that learns a world model from image inputs and uses it to train an explorer policy and an achiever policy from imagined rollouts. Unlike prior methods that explore by reaching previously visited states, the explorer plans to discover unseen, surprising states through foresight, which are then used as diverse targets for the achiever to practice. After the unsupervised phase, LEXA solves tasks specified as goal images zero-shot, without any additional learning. LEXA substantially outperforms previous approaches to unsupervised goal-reaching, both on prior benchmarks and on a new challenging benchmark with a total of 40 test tasks spanning four standard robotic manipulation and locomotion domains. LEXA further achieves goals that require interacting with multiple objects in sequence. Finally, to demonstrate the scalability and generality of LEXA, we train a single general agent across four distinct environments. Code and videos are available at https://orybkin.github.io/lexa/
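The explorer/achiever split described above can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it is hypothetical. It assumes a known 1-D chain in place of the learned world model, a visit-count score in place of the paper's notion of surprise, and a hand-coded achiever in place of a learned policy. It only shows the control flow: the explorer seeks novel states in imagination, those states fill a goal buffer, and the achiever practices reaching goals drawn from that buffer.

```python
import random
from collections import Counter


def imagine(state, action, size):
    # Assumption: a known 1-D chain with walls stands in for a learned world model.
    return min(max(state + action, 0), size - 1)


def novelty(state, visits):
    # Assumption: inverse visit count stands in for "surprise".
    return 1.0 / (1 + visits[state])


def explorer_step(state, visits, size):
    # Explorer: pick the action whose imagined next state is most novel.
    return max((-1, 1), key=lambda a: novelty(imagine(state, a, size), visits))


def achiever_step(state, goal):
    # Achiever: step toward the goal (hand-coded here, a learned policy in LEXA).
    if state == goal:
        return 0
    return 1 if goal > state else -1


def run(size=10, episodes=20, horizon=10, seed=0):
    rng = random.Random(seed)
    visits = Counter()
    goal_buffer = []
    reached = 0
    for _ in range(episodes):
        # Exploration phase: states found here become candidate goals.
        state = 0
        for _ in range(horizon):
            visits[state] += 1
            state = imagine(state, explorer_step(state, visits, size), size)
            goal_buffer.append(state)
        # Practice phase: the achiever tries to reach a previously discovered state.
        goal = rng.choice(goal_buffer)
        state = 0
        for _ in range(horizon):
            state = imagine(state, achiever_step(state, goal), size)
        reached += state == goal
    return visits, goal_buffer, reached / episodes


if __name__ == "__main__":
    visits, goals, success = run()
    print(f"distinct goals discovered: {len(set(goals))}, success rate: {success:.2f}")
```

In this toy setting the achiever never fails, because the dynamics are known and every goal is within reach of the horizon. The sketch illustrates only how the two policies feed each other, not the difficulty of the real problem.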


