f-IRL: Inverse Reinforcement Learning via State Marginal Matching

11/09/2020
by   Tianwei Ni, et al.
0

Imitation learning is well-suited for robotic tasks where it is difficult to directly program the behavior or specify a cost for optimal control. In this work, we propose a method for learning the reward function (and the corresponding policy) to match the expert state density. Our main result is the analytic gradient of any f-divergence between the agent and expert state distribution w.r.t. reward parameters. Based on the derived gradient, we present an algorithm, f-IRL, that recovers a stationary reward function from the expert density by gradient descent. We show that f-IRL can learn behaviors from a hand-designed target state density or implicitly through expert observations. Our method outperforms adversarial imitation learning methods in terms of sample efficiency and the required number of expert trajectories on IRL benchmarks. Moreover, we show that the recovered reward function can be used to quickly solve downstream tasks, and empirically demonstrate its utility on hard-to-explore tasks and for behavior transfer across changes in dynamics.

READ FULL TEXT

page 6

page 8

page 21

research
05/03/2020

Off-Policy Adversarial Inverse Reinforcement Learning

Adversarial Imitation Learning (AIL) is a class of algorithms in Reinfor...
research
09/02/2022

TarGF: Learning Target Gradient Field for Object Rearrangement

Object Rearrangement is to move objects from an initial state to a goal ...
research
10/20/2022

Robust Imitation via Mirror Descent Inverse Reinforcement Learning

Recently, adversarial imitation learning has shown a scalable reward acq...
research
06/08/2020

Primal Wasserstein Imitation Learning

Imitation Learning (IL) methods seek to match the behavior of an agent w...
research
03/01/2023

LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning

Recent methods for imitation learning directly learn a Q-function using ...
research
06/20/2012

Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods

In this paper we propose a novel gradient algorithm to learn a policy fr...
research
12/30/2019

A New Framework for Query Efficient Active Imitation Learning

We seek to align agent policy with human expert behavior in a reinforcem...

Please sign up or login with your details

Forgot password? Click here to reset