Marginal MAP Estimation for Inverse RL under Occlusion with Observer Noise

09/16/2021
by   Prasanth Sengadu Suresh, et al.

We consider the problem of learning the behavioral preferences of an expert engaged in a task from noisy and partially observable demonstrations. This is motivated by real-world applications such as a sorting-line robot learning by observing a human worker, where some observations are occluded by environmental objects that cannot be removed. Furthermore, robotic perception tends to be imperfect and noisy. Previous techniques for inverse reinforcement learning (IRL) either omit the missing portions of the demonstration or infer them as part of expectation-maximization, which tends to be slow and prone to local optima. We present a new method that generalizes the well-known Bayesian maximum-a-posteriori (MAP) IRL method by marginalizing over the occluded portions of the trajectory. We additionally extend this with an observation model to account for perception noise. We show that the marginal MAP (MMAP) approach significantly improves on the previous IRL technique under occlusion, both in formative evaluations on a toy problem and in a summative evaluation on an onion-sorting line task performed by a robot.
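To make the marginalization idea concrete, here is a minimal toy sketch (our own construction, not the paper's implementation): the log marginal likelihood of a partially occluded trajectory under a softmax expert policy, computed with an HMM-style forward pass so that occluded timesteps are summed out rather than imputed as in EM-based IRL. The feature map `PHI`, transition model `T`, noise rate `EPS`, and state-only observations are all simplifying assumptions for illustration.

```python
import numpy as np

# Toy sketch: marginal likelihood for IRL under occlusion and observer
# noise. All quantities below (PHI, T, EPS, state-only observations) are
# illustrative assumptions, not the authors' model.
N_S, N_A = 3, 2
rng = np.random.default_rng(1)
PHI = rng.random((N_S, N_A, 2))                    # hypothetical features phi(s, a)
T = rng.dirichlet(np.ones(N_S), size=(N_S, N_A))   # dynamics P(s' | s, a)
EPS = 0.1                                          # observer noise rate
OBS = np.full((N_S, N_S), EPS / (N_S - 1))         # OBS[o, s] = P(observe o | state s)
np.fill_diagonal(OBS, 1.0 - EPS)

def policy(theta):
    """Softmax expert policy pi(a | s) over the linear reward phi(s, a) . theta."""
    r = PHI @ theta                                # (N_S, N_A) rewards
    e = np.exp(r - r.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def log_marginal_likelihood(theta, obs):
    """log P(obs | theta) for a trajectory where each entry is an observed
    state (int) or None for an occluded step.  Forward recursion keeps
    alpha[s] proportional to P(o_1..t, s_t = s); occluded steps are
    marginalized by propagating alpha without an observation update."""
    pi = policy(theta)
    alpha = np.full(N_S, 1.0 / N_S)                # uniform start-state prior
    ll = 0.0
    for o in obs:
        if o is not None:                          # noisy observation update
            alpha = alpha * OBS[o]
        z = alpha.sum()
        ll += np.log(z)
        alpha /= z
        # propagate belief through the expert policy and the dynamics
        alpha = np.einsum('s,sa,sab->b', alpha, pi, T)
    return ll

def map_objective(theta, obs, sigma=1.0):
    """Marginal MAP objective: marginal log-likelihood + Gaussian log-prior."""
    return log_marginal_likelihood(theta, obs) - theta @ theta / (2.0 * sigma**2)
```

Maximizing `map_objective` over `theta` (e.g. by gradient ascent) would recover a reward estimate without ever imputing the occluded states, which is the structural difference from the EM-style alternatives mentioned above.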

Related research

A Hierarchical Bayesian model for Inverse RL in Partially-Controlled Environments (07/13/2021)
Robots learning from observations in the real world using inverse reinfo...

Inverse Reinforcement Learning Under Noisy Observations (10/27/2017)
We consider the problem of performing inverse reinforcement learning whe...

IRL with Partial Observations using the Principle of Uncertain Maximum Entropy (08/15/2022)
The principle of maximum entropy is a broadly applicable technique for c...

A Framework and Method for Online Inverse Reinforcement Learning (05/21/2018)
Inverse reinforcement learning (IRL) is the problem of learning the pref...

Maximum Entropy Multi-Task Inverse RL (04/27/2020)
Multi-task IRL allows for the possibility that the expert could be switc...

Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance (11/16/2019)
In this paper, we study Reinforcement Learning from Demonstrations (RLfD...
