Specification Inference from Demonstrations

Learning from expert demonstrations has received considerable attention in artificial intelligence and machine learning. The goal is to infer the underlying reward function that an agent is optimizing, given a set of observations of the agent's behavior over time in a variety of circumstances, the system state trajectories, and a plant model specifying how the system state evolves under the agent's different actions. The system is often modeled as a Markov decision process: the next state depends only on the current state and the agent's action, and the agent's choice of action depends only on the current state. While the former is a Markovian assumption on the evolution of the system state, the latter assumes that the target reward function is itself Markovian. In this work, we explore learning a class of non-Markovian reward functions, known in the formal methods literature as specifications. These specifications offer better composition, transferability, and interpretability. We then show that inferring the specification can be done efficiently without unrolling the transition system. We demonstrate the approach on a 2-d grid world example.
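To make the distinction between a Markovian reward and a specification concrete, the following is a minimal sketch in which a specification is treated as a Boolean predicate over an entire state trajectory in a 2-d grid world. The grid layout, the cell names (GOAL, RECHARGE, HAZARDS), and the helper functions are hypothetical and chosen only for illustration; this is not the inference procedure of the paper, only an example of the kind of object being inferred.

```python
from typing import Callable, List, Tuple

State = Tuple[int, int]            # (row, col) cell in a 2-d grid world
Trajectory = List[State]
Spec = Callable[[Trajectory], bool]

# Hypothetical layout, assumed purely for illustration.
GOAL: State = (0, 3)
RECHARGE: State = (2, 0)
HAZARDS = {(1, 1), (1, 2)}


def avoid_hazards(traj: Trajectory) -> bool:
    """Safety property: no state of the trajectory is a hazard cell."""
    return all(state not in HAZARDS for state in traj)


def recharge_before_goal(traj: Trajectory) -> bool:
    """History-dependent property: RECHARGE is visited before GOAL is first reached.

    Whether arriving at GOAL counts as success depends on the past,
    so a reward defined on the current grid cell alone cannot express it.
    """
    recharged = False
    for state in traj:
        if state == RECHARGE:
            recharged = True
        if state == GOAL:
            return recharged
    return False  # GOAL was never reached


def conjunction(*specs: Spec) -> Spec:
    """Compose specifications: the result holds iff every component holds."""
    return lambda traj: all(spec(traj) for spec in specs)


task = conjunction(avoid_hazards, recharge_before_goal)

good = [(2, 1), (2, 0), (1, 0), (0, 0), (0, 1), (0, 2), (0, 3)]
wrong_order = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3), (2, 2), (2, 1), (2, 0)]
through_hazard = [(2, 0), (1, 0), (1, 1), (0, 1), (0, 2), (0, 3)]

for name, traj in [("good", good), ("wrong_order", wrong_order), ("through_hazard", through_hazard)]:
    print(name, task(traj))
```

The `conjunction` helper hints at why composition is natural here: two Boolean trajectory properties combine with a logical "and", whereas summing two reward functions does not in general yield a reward whose optimum satisfies both tasks.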


