Wasserstein Adversarial Imitation Learning

06/19/2019
by Huang Xiao, et al.

Imitation learning is the problem of recovering an expert policy from demonstrations. While inverse reinforcement learning approaches are known to be very sample-efficient in terms of expert demonstrations, they usually require problem-dependent reward functions or a (task-)specific reward-function regularization. In this paper, we show a natural connection between inverse reinforcement learning approaches and optimal transport that enables more general reward functions with desirable properties (e.g., smoothness). Based on this observation, we propose a novel approach called Wasserstein Adversarial Imitation Learning. Our approach uses the Kantorovich potentials as a reward function and further leverages regularized optimal transport to enable large-scale applications. In several robotic experiments, our approach outperforms the baselines in terms of average cumulative reward and shows a significant improvement in sample efficiency, requiring just one expert demonstration.
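As a rough illustration of what "Kantorovich potentials" from a regularized optimal transport problem look like in practice, the following is a minimal sketch, not the authors' implementation. It assumes two finite sample sets with uniform weights, a Euclidean ground cost, and entropic regularization solved by log-domain Sinkhorn iterations; the abstract does not say which regularizer the paper uses. The function name `sinkhorn_potentials` and the parameters `eps` and `n_iters` are hypothetical.

```python
import numpy as np


def _logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def sinkhorn_potentials(x, y, eps=0.1, n_iters=200):
    """Entropic-OT dual potentials between two point clouds (illustrative only).

    x: (n, d) array, e.g. expert state-action samples (assumption).
    y: (m, d) array, e.g. agent state-action samples (assumption).
    Returns potentials f (n,), g (m,) and the transport plan (n, m).
    """
    n, m = len(x), len(y)
    # Uniform weights on both sample sets (assumption), stored as logs.
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    # Euclidean ground cost between every pair of samples (assumption).
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)

    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iters):
        # Alternating dual updates of the entropic OT problem.
        f = -eps * _logsumexp(log_b[None, :] + (g[None, :] - cost) / eps, axis=1)
        g = -eps * _logsumexp(log_a[:, None] + (f[:, None] - cost) / eps, axis=0)

    # Primal plan recovered from the potentials.
    plan = np.exp(
        log_a[:, None] + log_b[None, :] + (f[:, None] + g[None, :] - cost) / eps
    )
    return f, g, plan
```

Here `f` holds one potential value per sample in `x` and `g` one per sample in `y`. How such potentials are turned into a reward signal for policy optimization is specified in the full paper, not in this sketch.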


