Identifiability in inverse reinforcement learning

06/07/2021
by Haoyang Cao, et al.

Inverse reinforcement learning attempts to reconstruct the reward function in a Markov decision problem, using observations of agent actions. As already observed by Russell, the problem is ill-posed, and the reward function is not identifiable, even in the presence of perfect information about optimal behavior. We provide a resolution to this non-identifiability for problems with entropy regularization. For a given environment, we fully characterize the reward functions leading to a given policy and demonstrate that, given demonstrations of actions for the same reward under two distinct discount factors, or under sufficiently different environments, the unobserved reward can be recovered up to a constant. Through a simple numerical experiment, we demonstrate the accurate reconstruction of the reward function through our proposed resolution.
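The non-identifiability described above can be illustrated with a small numerical sketch. It assumes the standard entropy-regularized (soft) Bellman equation with temperature `lam`, under which the optimal policy is a softmax of the soft Q-function. Under that assumption, for any state function `v`, the reward `lam * log pi(a|s) + v(s) - gamma * E[v(s')]` reproduces the same policy `pi`. The toy transition kernel, rewards, and all function names below are hypothetical and are not taken from the paper.

```python
import math

def soft_value_iteration(reward, P, gamma, lam, iters=500):
    """Solve the entropy-regularized Bellman equation by fixed-point iteration.

    reward[s][a] is the reward; P[s][a][t] is the probability of moving s -> t
    under action a. Returns the softmax policy pi[s][a] and soft value V[s].
    """
    n_s, n_a = len(reward), len(reward[0])
    V = [0.0] * n_s
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(iters):
        Q = [[reward[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        # V(s) = lam * log sum_a exp(Q(s,a) / lam), shifted by max(Q) for stability
        V = [max(q) + lam * math.log(sum(math.exp((x - max(q)) / lam) for x in q))
             for q in Q]
    pi = [[math.exp((Q[s][a] - V[s]) / lam) for a in range(n_a)]
          for s in range(n_s)]
    return pi, V

def reward_from_policy(pi, P, gamma, lam, v):
    """Build a reward that yields policy pi, with v as its soft value function."""
    n_s, n_a = len(pi), len(pi[0])
    return [[lam * math.log(pi[s][a]) + v[s]
             - gamma * sum(P[s][a][t] * v[t] for t in range(n_s))
             for a in range(n_a)] for s in range(n_s)]

# Toy problem (illustrative values only): 3 states, 2 actions.
P = [
    [[0.8, 0.2, 0.0], [0.1, 0.6, 0.3]],
    [[0.0, 0.5, 0.5], [0.7, 0.0, 0.3]],
    [[0.3, 0.3, 0.4], [0.0, 0.1, 0.9]],
]
reward = [[1.0, 0.0], [0.0, 2.0], [-1.0, 0.5]]
gamma, lam = 0.9, 1.0

# Policy induced by the original reward.
pi_a, _ = soft_value_iteration(reward, P, gamma, lam)

# A second reward built from an arbitrary choice of v.
v = [5.0, -3.0, 0.7]
other_reward = reward_from_policy(pi_a, P, gamma, lam, v)
pi_b, _ = soft_value_iteration(other_reward, P, gamma, lam)

# Largest difference between the two policies; it should be at rounding level.
gap = max(abs(pi_a[s][a] - pi_b[s][a]) for s in range(3) for a in range(2))
print("max policy difference:", gap)
```

Because `v` is arbitrary, observing the policy in a single environment with a single discount factor cannot distinguish `reward` from `other_reward`. This is consistent with the abstract's point that additional observations, such as a second discount factor, are needed to pin the reward down up to a constant.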

research
09/22/2022

Identifiability and generalizability from multiple experts in Inverse Reinforcement Learning

While Reinforcement Learning (RL) aims to train an agent from a reward f...
research
11/17/2020

Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

The problem of inverse reinforcement learning (IRL) is relevant to a var...
research
10/02/2019

CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem

Inverse reinforcement learning (IRL) is used to infer the reward functio...
research
12/15/2017

Inverse Reinforce Learning with Nonparametric Behavior Clustering

Inverse Reinforcement Learning (IRL) is the task of learning a single re...
research
03/13/2023

Kernel Density Bayesian Inverse Reinforcement Learning

Inverse reinforcement learning (IRL) is a powerful framework to infer an...
research
05/25/2021

Trajectory Modeling via Random Utility Inverse Reinforcement Learning

We consider the problem of modeling trajectories of drivers in a road ne...
research
09/22/2017

Inverse Reinforcement Learning with Conditional Choice Probabilities

We make an important connection to existing results in econometrics to d...
