A Bayesian Approach to Policy Recognition and State Representation Learning

05/04/2016
by Adrian Šošić, et al.

Learning from demonstration (LfD) is the process of building behavioral models of a task from demonstrations provided by an expert. These models can be used, for example, for system control by generalizing the expert demonstrations to previously unencountered situations. Most LfD methods, however, make strong assumptions about the expert behavior: they assume the existence of a deterministic optimal ground-truth policy, for instance, or require direct monitoring of the expert's controls. Such assumptions limit their practical use as part of a general system identification framework. In this work, we consider the LfD problem in a more general setting where we allow for arbitrary stochastic expert policies, without reasoning about the optimality of the demonstrations. Following a Bayesian methodology, we model the full posterior distribution of possible expert controllers that explain the provided demonstration data. Moreover, we show that our methodology can be applied in a nonparametric context to infer the complexity of the state representation used by the expert, and to learn task-appropriate partitions of the system state space.
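To make the idea of a posterior over stochastic expert policies concrete, here is a minimal sketch of a much simpler setting than the one the abstract describes. It assumes a small discrete state and action space, a symmetric Dirichlet prior on each state's action distribution, and, unlike the setting above, that the expert's actions are directly observed. The function names, the toy `demos` data, and the prior strength `alpha` are illustrative assumptions and are not taken from the paper.

```python
import random
from collections import defaultdict


def posterior_params(demos, n_actions, alpha=1.0):
    """Dirichlet posterior parameters per state (assumed conjugate model).

    demos: iterable of (state, action) pairs with actions in range(n_actions).
    alpha: symmetric Dirichlet prior concentration (an assumption).
    """
    params = defaultdict(lambda: [alpha] * n_actions)
    for state, action in demos:
        params[state][action] += 1.0  # prior pseudo-count plus observed count
    return dict(params)


def posterior_mean(params):
    """Posterior mean action probabilities for one state."""
    total = sum(params)
    return [p / total for p in params]


def sample_policy(params, rng):
    """Draw one action distribution from Dirichlet(params).

    Uses the standard construction: normalize independent Gamma draws.
    """
    draws = [rng.gammavariate(p, 1.0) for p in params]
    total = sum(draws)
    return [d / total for d in draws]


# Hypothetical demonstrations: (state, action) pairs.
demos = [(0, 1), (0, 1), (0, 0), (1, 2)]
post = posterior_params(demos, n_actions=3)

mean_s0 = posterior_mean(post[0])          # point summary for state 0
rng = random.Random(0)
sample_s0 = sample_policy(post[0], rng)    # one plausible expert policy at state 0
```

Each call to `sample_policy` yields a different candidate action distribution, so the spread of the samples reflects how much the demonstrations constrain the policy at that state. States with few demonstrations stay close to the prior.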

