Off-Policy Adversarial Inverse Reinforcement Learning

05/03/2020
by   Samin Yeasar Arnob, et al.
0

Adversarial Imitation Learning (AIL) is a class of algorithms in Reinforcement learning (RL), which tries to imitate an expert without taking any reward from the environment and does not provide expert behavior directly to the policy training. Rather, an agent learns a policy distribution that minimizes the difference from expert behavior in an adversarial setting. Adversarial Inverse Reinforcement Learning (AIRL) leverages the idea of AIL, integrates a reward function approximation along with learning the policy, and shows the utility of IRL in the transfer learning setting. But the reward function approximator that enables transfer learning does not perform well in imitation tasks. We propose an Off-Policy Adversarial Inverse Reinforcement Learning (Off-policy-AIRL) algorithm which is sample efficient as well as gives good imitation performance compared to the state-of-the-art AIL algorithm in the continuous control tasks. For the same reward function approximator, we show the utility of learning our algorithm over AIL by using the learned reward function to retrain the policy over a task under significant variation where expert demonstrations are absent.

READ FULL TEXT
research
05/16/2019

Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation

We consider the problem of imitation learning from a finite set of exper...
research
09/09/2018

Addressing Sample Inefficiency and Reward Bias in Inverse Reinforcement Learning

The Generative Adversarial Imitation Learning (GAIL) framework from Ho &...
research
02/04/2021

Hybrid Adversarial Inverse Reinforcement Learning

In this paper, we investigate the problem of the inverse reinforcement l...
research
06/02/2023

PAGAR: Imitation Learning with Protagonist Antagonist Guided Adversarial Reward

Imitation learning (IL) algorithms often rely on inverse reinforcement l...
research
11/09/2020

f-IRL: Inverse Reinforcement Learning via State Marginal Matching

Imitation learning is well-suited for robotic tasks where it is difficul...
research
06/19/2019

Wasserstein Adversarial Imitation Learning

Imitation Learning describes the problem of recovering an expert policy ...
research
02/12/2022

Robust Learning from Observation with Model Misspecification

Imitation learning (IL) is a popular paradigm for training policies in r...

Please sign up or login with your details

Forgot password? Click here to reset