Goal-Conditioned Reinforcement Learning with Disentanglement-based Reachability Planning

07/20/2023
by Zhifeng Qian, et al.

Goal-Conditioned Reinforcement Learning (GCRL) enables agents to set diverse goals on their own and thereby learn a set of skills. Despite considerable progress across many domains, reaching distant goals in temporally extended tasks remains a challenge for GCRL. Existing work tackles this problem by using planning algorithms to generate intermediate subgoals that augment GCRL. These methods have two crucial requirements: (i) a state representation space in which to search for valid subgoals, and (ii) a distance function to measure the reachability of subgoals. However, they struggle to scale to high-dimensional state spaces because their representations are not compact. Moreover, they cannot collect high-quality training data with standard goal-conditioned (GC) policies, which results in an inaccurate distance function. Both issues reduce the efficiency and performance of planning and policy learning. In this paper, we propose a goal-conditioned RL algorithm combined with Disentanglement-based Reachability Planning (REPlan) to solve temporally extended tasks. In REPlan, a Disentangled Representation Module (DRM) learns compact representations that disentangle robot poses and object positions from high-dimensional observations in a self-supervised manner. A simple REachability discrimination Module (REM) is also designed to determine the temporal distance of subgoals. In addition, REM computes intrinsic bonuses to encourage the collection of novel states for training. We evaluate REPlan on three vision-based simulation tasks and one real-world task. The experiments demonstrate that REPlan significantly outperforms prior state-of-the-art methods in solving temporally extended tasks.
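
To make the role of the two requirements concrete, the following is a minimal sketch of subgoal selection in a compact latent space, guided by a reachability score. It is not the authors' implementation. The function names (`reachability`, `select_subgoal`, `plan`), the threshold, and the exponential-of-distance score are all assumptions made for illustration, and the score stands in for a learned module such as REM.

```python
import math


def reachability(a, b, scale=1.0):
    """Hypothetical stand-in for a learned reachability score in (0, 1].

    Assumption: the score decays with Euclidean distance between two
    latent vectors. A learned module would replace this function.
    """
    return math.exp(-math.dist(a, b) / scale)


def select_subgoal(state, goal, candidates, threshold=0.5):
    """Pick the candidate reachable from `state` that scores highest toward `goal`."""
    if reachability(state, goal) >= threshold:
        return goal  # the goal itself is already judged reachable
    reachable = [c for c in candidates if reachability(state, c) >= threshold]
    if not reachable:
        return goal  # no valid subgoal found: fall back to the final goal
    return max(reachable, key=lambda c: reachability(c, goal))


def plan(state, goal, sample_candidates, max_steps=20, threshold=0.5):
    """Chain subgoals until the goal is selected or `max_steps` is exhausted.

    `sample_candidates` is a hypothetical callable that proposes candidate
    latent subgoals around the current latent state.
    """
    path = []
    current = state
    for _ in range(max_steps):
        nxt = select_subgoal(current, goal, sample_candidates(current), threshold)
        path.append(nxt)
        if nxt == goal:
            break
        current = nxt
    return path


if __name__ == "__main__":
    # Toy 2-D latent space: candidates are proposed half a unit to the right.
    print(plan((0, 0), (2, 0), lambda s: [(s[0] + 0.5, s[1])]))
```

In this sketch a candidate qualifies as a subgoal only if its score from the current state clears the threshold, which is why an inaccurate distance function directly degrades the plan.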

research
10/22/2021

C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks

Goal-conditioned reinforcement learning (RL) can solve tasks in a wide r...
research
07/05/2019

Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning

Goal-conditioned policies are used in order to break down complex reinfo...
research
05/21/2020

Dynamics-Aware Latent Space Reachability for Exploration in Temporally-Extended Tasks

Self-supervised goal proposal and reaching is a key component of efficie...
research
02/28/2022

Weakly Supervised Disentangled Representation for Goal-conditioned Reinforcement Learning

Goal-conditioned reinforcement learning is a crucial yet challenging alg...
research
11/19/2019

Planning with Goal-Conditioned Policies

Planning methods can solve temporally extended sequential decision makin...
research
04/21/2022

Planning for Temporally Extended Goals in Pure-Past Linear Temporal Logic: A Polynomial Reduction to Standard Planning

We study temporally extended goals expressed in Pure-Past LTL (PPLTL). P...
research
10/12/2022

Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks

The utilization of broad datasets has proven to be crucial for generaliz...