Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning

02/15/2020
by   Yannick Schroecker, et al.

This work considers two distinct settings: imitation learning and goal-conditioned reinforcement learning. In both cases, effective solutions require the agent to reliably reach a specified state (a goal) or set of states (a demonstration). Drawing a connection between probabilistic long-term dynamics and the desired value function, this work introduces an approach that uses recent advances in density estimation to learn to reach a given state. As our first contribution, we use this approach for goal-conditioned reinforcement learning and show that it is both efficient and does not suffer from hindsight bias in stochastic domains. As our second contribution, we extend the approach to imitation learning and show that it achieves state-of-the-art demonstration sample-efficiency on standard benchmark tasks.
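The connection between long-term dynamics and the value function mentioned above can be illustrated in a small tabular setting. The sketch below is not the paper's method, which works with learned density estimators in continuous spaces. It is a minimal discrete analogue under stated assumptions: the 3-state transition matrix `P`, the discount `GAMMA`, and both function names are hypothetical and chosen only for illustration. It checks that the value of a policy under the reward (1 - gamma) * 1[next state = goal] equals the discounted probability of visiting the goal in the future.

```python
# Hypothetical 3-state Markov chain induced by some fixed policy.
# Each row is a distribution over next states (rows sum to 1).
P = [
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.2, 0.8],
]
GAMMA = 0.9  # assumed discount factor


def discounted_visitation(P, start, goal, gamma, horizon=500):
    """(1 - gamma) * sum_{t>=0} gamma^t * Pr(s_{t+1} = goal | s_0 = start).

    The infinite sum is truncated at `horizon`; gamma**500 is negligible.
    """
    n = len(P)
    dist = [1.0 if i == start else 0.0 for i in range(n)]
    total = 0.0
    for t in range(horizon):
        # Push the state distribution one step forward: dist <- dist @ P.
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        total += (gamma ** t) * dist[goal]
    return (1.0 - gamma) * total


def goal_value(P, goal, gamma, sweeps=500):
    """Policy evaluation with reward (1 - gamma) * 1[next state == goal]."""
    n = len(P)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [
            sum(
                P[s][s2] * ((1.0 - gamma) * (1.0 if s2 == goal else 0.0) + gamma * v[s2])
                for s2 in range(n)
            )
            for s in range(n)
        ]
    return v


if __name__ == "__main__":
    for goal in range(len(P)):
        values = goal_value(P, goal, GAMMA)
        for start in range(len(P)):
            density = discounted_visitation(P, start, goal, GAMMA)
            print(f"start={start} goal={goal} value={values[start]:.6f} density={density:.6f}")
```

In this toy case the two quantities agree up to numerical error, which is the sense in which estimating a long-term visitation density can stand in for estimating a goal-conditioned value function.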


