oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions

02/20/2020
by David Venuto, et al.

Explicit engineering of reward functions for given environments has been a major hindrance to reinforcement learning methods. Inverse Reinforcement Learning (IRL) can recover reward functions from demonstrations alone, but the learned rewards are generally heavily entangled with the dynamics of the environment and are therefore neither portable nor robust to changing environments. Modern adversarial methods have had some success in reducing reward entanglement in the IRL setting. In this work, we leverage one such method, Adversarial Inverse Reinforcement Learning (AIRL), to propose an algorithm that learns hierarchical disentangled rewards with a policy over options. We show that this method can learn generalizable policies and reward functions in complex transfer learning tasks, while yielding results on continuous control benchmarks that are comparable to those of state-of-the-art methods.


