C-Learning: Learning to Achieve Goals via Recursive Classification

11/17/2020
by Benjamin Eysenbach, et al.

We study the problem of predicting and controlling the future state distribution of an autonomous agent. This problem, which can be viewed as a reframing of goal-conditioned reinforcement learning (RL), is centered around learning a conditional probability density function over future states. Instead of estimating this density function directly, we estimate it indirectly by training a classifier to predict whether an observation comes from the future. Via Bayes' rule, predictions from our classifier can be transformed into predictions over future states. Importantly, an off-policy variant of our algorithm allows us to predict the future state distribution of a new policy without collecting new experience. This variant allows us to optimize functionals of a policy's future state distribution, such as the density of reaching a particular goal state. While conceptually similar to Q-learning, our work lays a principled foundation for goal-conditioned RL as density estimation, providing justification for goal-conditioned methods used in prior work. This foundation makes hypotheses about Q-learning, including the optimal goal-sampling ratio, which we confirm experimentally. Moreover, our proposed method is competitive with prior goal-conditioned RL methods.
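
The Bayes' rule step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional toy problem, equal sampling of "future" and "marginal" examples, and a Bayes-optimal classifier computed in closed form rather than a trained network. All function names and the two Gaussian densities are hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bayes_optimal_classifier(x, p_future, p_marginal):
    """P(label = 'future' | x), assuming both classes are sampled equally often."""
    pf, pm = p_future(x), p_marginal(x)
    return pf / (pf + pm)


def density_from_classifier(c, marginal_density):
    """Invert Bayes' rule: p_future(x) = c / (1 - c) * p_marginal(x)."""
    return c / (1.0 - c) * marginal_density


# Hypothetical toy densities: "future" states vs. the marginal over states.
def p_future(x):
    return gaussian_pdf(x, 1.0, 0.5)


def p_marginal(x):
    return gaussian_pdf(x, 0.0, 2.0)


x = 0.8
c = bayes_optimal_classifier(x, p_future, p_marginal)
recovered = density_from_classifier(c, p_marginal(x))
print(f"classifier output: {c:.4f}")
print(f"recovered density: {recovered:.6f}")
print(f"true density:      {p_future(x):.6f}")
```

In this toy setting the recovered density matches the true one up to floating-point error, because the classifier is exact. With a learned classifier, the quality of the recovered density would depend on how well the classifier is fit.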


