DeepAI AI Chat
Log In Sign Up

Goal-conditioned Imitation Learning

06/13/2019
by   Yiming Ding, et al.
berkeley college
Intel
11

Designing rewards for Reinforcement Learning (RL) is challenging because it needs to convey the desired task, be efficient to optimize, and be easy to compute. The latter is particularly problematic when applying RL to robotics, where detecting whether the desired configuration is reached might require considerable supervision and instrumentation. Furthermore, we are often interested in being able to reach a wide range of configurations, hence setting up a different reward every time might be unpractical. Methods like Hindsight Experience Replay (HER) have recently shown promise to learn policies able to reach many goals, without the need of a reward. Unfortunately, without tricks like resetting to points along the trajectory, HER might take a very long time to discover how to reach certain areas of the state-space. In this work we investigate different approaches to incorporate demonstrations to drastically speed up the convergence to a policy able to reach any goal, also surpassing the performance of an agent trained with other Imitation Learning algorithms. Furthermore, our method can be used when only trajectories without expert actions are available, which can leverage kinestetic or third person demonstration. The code is available at https://sites.google.com/view/goalconditioned-il/ .

READ FULL TEXT

page 4

page 6

page 8

12/12/2019

Learning To Reach Goals Without Reinforcement Learning

Imitation learning algorithms provide a simple and straightforward appro...
03/06/2017

Third-Person Imitation Learning

Reinforcement learning (RL) makes it possible to train agents capable of...
02/15/2020

Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning

This work considers two distinct settings: imitation learning and goal-c...
09/26/2022

Understanding Hindsight Goal Relabeling Requires Rethinking Divergence Minimization

Hindsight goal relabeling has become a foundational technique for multi-...
09/19/2023

Guide Your Agent with Adaptive Multimodal Rewards

Developing an agent capable of adapting to unseen environments remains a...
09/05/2023

A Survey of Imitation Learning: Algorithms, Recent Developments, and Challenges

In recent years, the development of robotics and artificial intelligence...