Inverse Reinforcement Learning with Conditional Choice Probabilities

by Mohit Sharma, et al.

We connect inverse reinforcement learning (IRL) to existing results in econometrics and use them to describe an alternative formulation of the problem. In particular, we describe an algorithm that solves the IRL problem using Conditional Choice Probabilities (CCPs), which are maximum likelihood estimates of the policy obtained from expert demonstrations. Using the language of structural econometrics, we re-frame the optimal decision problem and introduce an alternative representation of value functions due to Hotz and Miller (1993). In addition to presenting the theoretical connections that bridge the IRL literature in Economics and Robotics, the use of CCPs has the practical benefit of reducing the computational cost of solving the IRL problem. Specifically, under the CCP representation, we show how one can avoid repeated calls to the dynamic programming subroutine typically used in IRL. We show via extensive experimentation on standard IRL benchmarks that CCP-IRL outperforms MaxEnt-IRL, with as much as a 5x speedup and without compromising the quality of the recovered reward function.
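To make the two ingredients named in the abstract concrete, here is a minimal tabular sketch in Python. It is not the authors' implementation. It assumes a small discrete MDP, and it uses the textbook form of the Hotz and Miller representation under i.i.d. type-1 extreme value (Gumbel) shocks, which is an assumption because the abstract does not state the shock distribution. The function names `estimate_ccp` and `ccp_value`, the array layouts, and the `smoothing` parameter are all hypothetical.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def estimate_ccp(demos, n_states, n_actions, smoothing=1e-3):
    """Estimate P(a | s) by counting (state, action) pairs in the demonstrations.

    With smoothing=0 and every state visited, this is the plain frequency
    (maximum likelihood) estimate. The smoothing term is an added assumption
    that keeps probabilities strictly positive so their logs are finite.
    """
    counts = np.full((n_states, n_actions), float(smoothing))
    for s, a in demos:
        counts[s, a] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)


def ccp_value(ccp, reward, transition, discount):
    """Value function implied by the CCPs, obtained from one linear solve.

    ccp:        (S, A) array of strictly positive choice probabilities
    reward:     (S, A) array of per-step rewards
    transition: (A, S, S) array, transition[a, s, t] = P(t | s, a)

    Assumes Gumbel shocks, so the expected shock given that action a is
    chosen in state s equals EULER_GAMMA - log P(a | s).
    """
    n_states = ccp.shape[0]
    # Expected one-step payoff under the estimated policy, shocks included.
    expected_flow = (ccp * (reward + EULER_GAMMA - np.log(ccp))).sum(axis=1)
    # State-to-state transition matrix induced by the estimated policy.
    policy_transition = np.einsum("sa,ast->st", ccp, transition)
    # Solve (I - discount * P_pi) V = expected_flow instead of iterating.
    return np.linalg.solve(
        np.eye(n_states) - discount * policy_transition, expected_flow
    )


# Toy usage on a hypothetical 2-state, 2-action problem.
demos = [(0, 0), (0, 0), (0, 1), (1, 1)]
ccp = estimate_ccp(demos, n_states=2, n_actions=2)
reward = np.zeros((2, 2))
transition = np.full((2, 2, 2), 0.5)
values = ccp_value(ccp, reward, transition, discount=0.9)
```

The CCPs are estimated once from the data, so each candidate reward needs only a linear solve rather than a fresh dynamic programming pass. This is one way to read the computational saving the abstract describes.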

