Accounting for Human Learning when Inferring Human Preferences

11/11/2020
by Harry Giles, et al.

Inverse reinforcement learning (IRL) is a common technique for inferring human preferences from data. Standard IRL techniques tend to assume that the human demonstrator is stationary, that is, that their policy π does not change over time. In practice, humans interacting with a novel environment or performing well on a novel task will change their demonstrations as they learn more about the environment or task. We investigate the consequences of relaxing this assumption of stationarity, in particular by modelling the human as learning. Surprisingly, we find in some small examples that this can lead to better inference than if the human were stationary. That is, by observing a demonstrator who is themselves learning, a machine can infer more than by observing a demonstrator who is noisily rational. In addition, we find evidence that misspecifying the human model can lead to poor inference, suggesting that modelling human learning is important, especially when the human is facing an unfamiliar environment.
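
To make the stationarity assumption concrete, the following is a minimal sketch of reward inference under a stationary, noisily rational demonstrator in a two-armed bandit. It is not the method of the paper: the Boltzmann choice model, the rationality parameter `beta`, the uniform prior, and the names `boltzmann_policy` and `posterior_over_hypotheses` are all assumptions made for illustration.

```python
import math


def boltzmann_policy(rewards, beta):
    """Choice probabilities for a noisily rational (Boltzmann) demonstrator.

    Assumption: the demonstrator picks arm a with probability
    proportional to exp(beta * reward[a]).
    """
    weights = [math.exp(beta * r) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_over_hypotheses(hypotheses, choices, beta=2.0):
    """Posterior over candidate reward vectors, with a uniform prior.

    Stationarity assumption: the same policy is used to score every
    observed choice, regardless of when the choice was made.
    """
    log_likelihoods = []
    for rewards in hypotheses:
        policy = boltzmann_policy(rewards, beta)
        log_likelihoods.append(sum(math.log(policy[a]) for a in choices))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_likelihoods)
    unnormalised = [math.exp(ll - peak) for ll in log_likelihoods]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Two hypothetical reward vectors: arm 0 is better, or arm 1 is better.
hypotheses = [(1.0, 0.0), (0.0, 1.0)]
# Hypothetical demonstration: the human mostly pulls arm 0.
choices = [0, 0, 1, 0]
posterior = posterior_over_hypotheses(hypotheses, choices)
```

A model of a learning demonstrator would instead let the policy depend on the time step, for example through the demonstrator's evolving estimates of each arm, so that early and late choices are scored differently.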

research
08/09/2022

Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience

This paper addresses the problem of inverse reinforcement learning (IRL)...
research
12/06/2022

Misspecification in Inverse Reinforcement Learning

The aim of Inverse Reinforcement Learning (IRL) is to infer a reward fun...
research
12/15/2017

Impossibility of deducing preferences and rationality from human policy

Inverse reinforcement learning (IRL) attempts to infer human rewards or ...
research
06/09/2022

Pragmatically Learning from Pedagogical Demonstrations in Multi-Goal Environments

Learning from demonstration methods usually leverage close to optimal de...
research
06/19/2021

Learning the Preferences of Uncertain Humans with Inverse Decision Theory

Existing observational approaches for learning human preferences, such a...
research
04/19/2023

Applying Learning-from-observation to household service robots: three common-sense formulation

Utilizing a robot in a new application requires the robot to be programm...
research
09/07/2021

Forward and Inverse models in HCI: Physical simulation and deep learning for inferring 3D finger pose

We outline the role of forward and inverse modelling approaches in the d...
