State Representation Learning for Goal-Conditioned Reinforcement Learning

05/04/2022
by Lorenzo Steccanella, et al.

This paper presents a novel state representation for reward-free Markov decision processes. The idea is to learn, in a self-supervised manner, an embedding space where the distance between a pair of embedded states corresponds to the minimum number of actions needed to transition between them. Unlike previous methods, our approach requires no domain knowledge and learns from offline, unlabeled data. We show how this representation can be leveraged to learn goal-conditioned policies: it provides a notion of similarity between states and goals, and a useful heuristic distance to guide planning and reinforcement learning algorithms. Finally, we empirically validate our method in classic control domains and multi-goal environments, showing that it can learn representations in large and/or continuous domains.
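To make the target quantity concrete, here is a minimal tabular sketch in Python. It is not the method of the paper: instead of learning an embedding, it computes the minimum number of actions between states exactly, by breadth-first search over transitions seen in offline trajectories, and then uses that distance as a heuristic for a greedy goal-reaching rollout. The five-state chain, the trajectories, and the helper names (`build_graph`, `min_action_distance`, `greedy_next`, `rollout`) are all assumptions made up for illustration.

```python
from collections import deque

def build_graph(trajectories):
    """Collect observed directed transitions: state -> set of successor states."""
    graph = {}
    for traj in trajectories:
        for s, s_next in zip(traj, traj[1:]):
            graph.setdefault(s, set()).add(s_next)
            graph.setdefault(s_next, set())
    return graph

def min_action_distance(graph, start):
    """BFS from `start`: fewest observed transitions needed to reach each state."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        s = queue.popleft()
        for s_next in graph[s]:
            if s_next not in dist:
                dist[s_next] = dist[s] + 1
                queue.append(s_next)
    return dist

def greedy_next(graph, dist, state, goal):
    """Pick the observed successor of `state` that is closest to `goal`."""
    return min(graph[state], key=lambda s: dist[s].get(goal, float("inf")))

def rollout(graph, dist, state, goal, max_steps=20):
    """Follow the greedy rule until the goal is reached or the step budget runs out."""
    path = [state]
    while state != goal and len(path) <= max_steps:
        state = greedy_next(graph, dist, state, goal)
        path.append(state)
    return path

# Hypothetical offline data on a 5-state chain (states 0..4), no rewards or labels.
trajectories = [
    [0, 1, 2, 3, 4],
    [4, 3, 2, 1, 0],
    [0, 1, 0, 1, 2],  # meanders: state 2 is 4 steps after state 0 here, but only 2 actions away
]

graph = build_graph(trajectories)
dist = {s: min_action_distance(graph, s) for s in graph}  # all-pairs table
path = rollout(graph, dist, 0, 4)
```

Exact search of this kind only works when states can be enumerated and matched exactly. In large or continuous domains that is not possible, which is the setting where a learned embedding whose distances approximate the same quantity becomes useful.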

research
07/05/2019

Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning

Goal-conditioned policies are used in order to break down complex reinfo...
research
12/30/2020

Model-Based Visual Planning with Self-Supervised Functional Distances

A generalist robot must be able to complete a variety of tasks in its en...
research
05/07/2020

Plan2Vec: Unsupervised Representation Learning by Latent Plans

In this paper we introduce plan2vec, an unsupervised representation lear...
research
07/22/2023

HIQL: Offline Goal-Conditioned RL with Latent States as Actions

Unsupervised pre-training has recently become the bedrock for computer v...
research
04/30/2020

Plan-Space State Embeddings for Improved Reinforcement Learning

Robot control problems are often structured with a policy function that ...
research
02/28/2022

Weakly Supervised Disentangled Representation for Goal-conditioned Reinforcement Learning

Goal-conditioned reinforcement learning is a crucial yet challenging alg...
research
01/31/2020

Domain-Adversarial and -Conditional State Space Model for Imitation Learning

State representation learning (SRL) in partially observable Markov decis...
