LEAF: Latent Exploration Along the Frontier

05/21/2020
by Homanga Bharadhwaj, et al.

Self-supervised goal proposal and reaching are key components of exploration and of efficient policy learning algorithms. Without access to an oracle goal-sampling distribution, such a self-supervised approach requires deep exploration and commitment so that long-horizon plans can be discovered efficiently. In this paper, we propose an exploration framework that learns a dynamics-aware manifold of reachable states. For a given goal, the proposed method first deterministically visits a state at the current frontier of reachable states (commitment/reaching) and then explores stochastically to reach the goal (exploration). This allocates the exploration budget near the frontier of the reachable region rather than in its interior. We target the challenging problem of policy learning from initial and goal states specified as images, and we do not assume any access to the underlying ground-truth states of the robot and the environment. To keep track of reachable latent states, we propose a distance-conditioned reachability network that is trained to infer whether one state is reachable from another within a specified latent-space distance. Given an initial state, this network yields a frontier of states reachable from it. By incorporating a curriculum that samples easier goals (closer to the start state) before more difficult ones, we demonstrate that the proposed self-supervised exploration algorithm achieves 20% better performance on average than existing baselines on a set of challenging robotic environments, including a real robot manipulation task.
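
The ideas of a distance-conditioned reachability check, a frontier, and an easy-to-hard goal curriculum can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the `step_size` parameter, and the use of plain Euclidean distance in place of a learned reachability network are all assumptions made for illustration, and latent states are represented as simple coordinate tuples.

```python
import math

def reachable(z_from, z_to, k, step_size=1.0):
    """Hypothetical stand-in for the learned reachability network.

    Assumption: z_to counts as reachable from z_from within k steps when
    their Euclidean distance is at most k * step_size. The paper trains a
    network for this decision; a distance threshold is used here only to
    keep the sketch runnable.
    """
    return math.dist(z_from, z_to) <= k * step_size

def frontier(z_start, candidates, k):
    """Candidates reachable within k steps but not within k - 1 steps."""
    return [
        z for z in candidates
        if reachable(z_start, z, k) and not reachable(z_start, z, k - 1)
    ]

def pick_frontier_state(z_start, z_goal, candidates, k):
    """Choose the frontier state closest to the goal (illustrative rule).

    Returns None when the frontier is empty.
    """
    front = frontier(z_start, candidates, k)
    if not front:
        return None
    return min(front, key=lambda z: math.dist(z, z_goal))

def curriculum(z_start, goals):
    """Order goals from nearest to farthest from the start state."""
    return sorted(goals, key=lambda g: math.dist(z_start, g))
```

In a full agent, the state returned by `pick_frontier_state` would be the target of the deterministic commitment phase, after which a stochastic policy would explore toward the goal; both of those policies are outside the scope of this sketch.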

