CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem

10/02/2019
by Arpan Kusari, et al.

Inverse reinforcement learning (IRL) is used to infer the reward function from the actions of an expert acting in a Markov Decision Process (MDP). This research proposes a novel approach that uses variational inference to learn the reward function. With this technique, the intractable posterior distribution of the continuous latent variable (here, the reward function) is analytically approximated so that it stays as close as possible to the prior belief while the model tries to reconstruct the future state conditioned on the current state and action. The reward function is derived using a well-known deep generative model, the Conditional Variational Auto-encoder (CVAE), with a Wasserstein loss function; the resulting method is referred to as Conditional Wasserstein Auto-encoder-IRL (CWAE-IRL) and can be analyzed as a combination of backward and forward inference. It can form an efficient alternative to previous approaches to IRL while requiring no knowledge of the agent's system dynamics. Experimental results on standard benchmarks such as objectworld and pendulum show that the proposed algorithm can effectively learn the latent reward function in complex, high-dimensional environments.
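
The objective described in the abstract (reconstruct the next state from the current state, action, and a latent reward, while keeping the latent close to a prior) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: scalar states and actions, linear `encode` and `decode` maps, a kernel-based maximum mean discrepancy (MMD) term standing in for the Wasserstein penalty, and the weight `lam`. The abstract does not say which penalty or network architecture the authors use.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two scalar samples."""
    k_xx = sum(rbf_kernel(a, b, bandwidth) for a in xs for b in xs) / (len(xs) ** 2)
    k_yy = sum(rbf_kernel(a, b, bandwidth) for a in ys for b in ys) / (len(ys) ** 2)
    k_xy = sum(rbf_kernel(a, b, bandwidth) for a in xs for b in ys) / (len(xs) * len(ys))
    return k_xx + k_yy - 2.0 * k_xy


def encode(state, action, next_state, w):
    """Hypothetical linear encoder: (s, a, s') -> scalar latent reward."""
    return w[0] * state + w[1] * action + w[2] * next_state


def decode(state, action, reward, v):
    """Hypothetical linear decoder: (s, a, r) -> predicted next state."""
    return v[0] * state + v[1] * action + v[2] * reward


def cwae_loss(transitions, w, v, prior_samples, lam=1.0):
    """Reconstruction error plus a penalty pulling encoded rewards toward the prior.

    transitions: list of (state, action, next_state) tuples from expert demonstrations.
    prior_samples: scalar draws from the assumed prior over rewards.
    """
    rewards = [encode(s, a, s2, w) for s, a, s2 in transitions]
    recon = sum(
        (decode(s, a, r, v) - s2) ** 2
        for (s, a, s2), r in zip(transitions, rewards)
    ) / len(transitions)
    return recon + lam * mmd_squared(rewards, prior_samples)
```

In a real setting the two linear maps would be neural networks trained by gradient descent on this loss; the sketch only shows how the reconstruction term and the prior-matching term combine, and why no explicit model of the system dynamics is needed as input.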

research
06/07/2021

Identifiability in inverse reinforcement learning

Inverse reinforcement learning attempts to reconstruct the reward functi...
research
06/18/2012

Continuous Inverse Optimal Control with Locally Optimal Examples

Inverse optimal control, also known as inverse reinforcement learning, i...
research
04/13/2016

Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics

Inverse Reinforcement Learning (IRL) describes the problem of learning a...
research
12/10/2019

Deep Bayesian Reward Learning from Preferences

Bayesian inverse reinforcement learning (IRL) methods are ideal for safe...
research
07/13/2023

Safe Reinforcement Learning as Wasserstein Variational Inference: Formal Methods for Interpretability

Reinforcement Learning or optimal control can provide effective reasonin...
research
02/22/2017

Counterfactual Control for Free from Generative Models

We introduce a method by which a generative model learning the joint dis...
research
10/11/2017

Specification Inference from Demonstrations

Learning from expert demonstrations has received a lot of attention in a...
