Risk Averse Bayesian Reward Learning for Autonomous Navigation from Human Demonstration

07/31/2021
by Christian Ellis, et al.

Traditional imitation learning provides a set of methods and algorithms for learning a reward function or policy from expert demonstrations. Learning from demonstration has been shown to be advantageous for navigation tasks because it allows people who are not machine learning experts to quickly provide the information needed to learn complex traversal behaviors. However, a minimal set of demonstrations is unlikely to capture all of the information needed to achieve the desired behavior in every possible future operational environment. Because of distributional shift between environments, a robot may encounter features that were rarely or never observed during training and for which the appropriate reward value is uncertain, leading to undesired outcomes. This paper proposes a Bayesian technique that quantifies uncertainty over the weights of a linear reward function, given a dataset of minimal human demonstrations, so that the robot can operate safely in dynamic environments. The quantified uncertainty is incorporated into a risk-averse set of weights used to generate cost maps for planning. Experiments in a 3-D environment with a simulated robot show that the proposed algorithm enables the robot to avoid dangerous terrain completely in two out of three test scenarios and to accumulate less risk than related approaches in all scenarios, without requiring any additional demonstrations.
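
To make the idea of risk-averse weights for a linear reward concrete, the following is a minimal sketch and not the paper's implementation. It assumes that posterior samples of the reward weights are already available, that terrain features are non-negative indicators, and that risk aversion is expressed as a per-weight lower-tail average (a CVaR-style rule). The function names `risk_averse_weights` and `cost_map`, the two terrain features, and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def risk_averse_weights(weight_samples, alpha=0.1):
    """Average the worst alpha-fraction of posterior samples for each weight.

    Assumption: features are non-negative, so a lower weight always means a
    lower (more pessimistic) reward. This rule is illustrative only.
    """
    samples = np.sort(np.asarray(weight_samples, dtype=float), axis=0)
    k = max(1, int(np.ceil(alpha * samples.shape[0])))
    return samples[:k].mean(axis=0)


def cost_map(feature_map, weights):
    """Turn a linear reward into a planning cost: cost = -(features . weights).

    feature_map has shape (height, width, n_features).
    """
    return -(feature_map @ weights)


rng = np.random.default_rng(0)

# Hypothetical posterior over two reward weights:
# feature 0 was seen often in demonstrations (narrow posterior),
# feature 1 was rarely or never seen (wide posterior).
samples = np.column_stack([
    rng.normal(-0.1, 0.01, size=1000),
    rng.normal(-0.1, 1.0, size=1000),
])

w_mean = samples.mean(axis=0)
w_risk = risk_averse_weights(samples, alpha=0.1)

# A 2x2 grid: every cell is the familiar terrain except the bottom-right
# cell, which contains only the unfamiliar terrain.
features = np.zeros((2, 2, 2))
features[..., 0] = 1.0
features[1, 1] = [0.0, 1.0]

costs = cost_map(features, w_risk)
print("mean weights:       ", w_mean)
print("risk-averse weights:", w_risk)
print("cost map:\n", costs)
```

Under these assumptions both terrain types have nearly the same mean weight, but the wide posterior on the unfamiliar feature pulls its risk-averse weight far lower, so a planner using this cost map would route around the unfamiliar cell.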


research
07/24/2020

Bayesian Robust Optimization for Imitation Learning

One of the main challenges in imitation learning is determining what act...
research
06/11/2021

Policy Gradient Bayesian Robust Optimization for Imitation Learning

The difficulty in specifying rewards for many real-world problems has le...
research
11/23/2021

Sample Efficient Imitation Learning via Reward Function Trained in Advance

Imitation learning (IL) is a framework that learns to imitate expert beh...
research
05/06/2022

Robot navigation from human demonstration: learning control behaviors with environment feature maps

When working alongside human collaborators in dynamic and unstructured e...
research
11/18/2021

Assisted Robust Reward Design

Real-world robotic tasks require complex reward functions. When we defin...
research
11/08/2018

Learning from Demonstration in the Wild

Learning from demonstration (LfD) is useful in settings where hand-codin...
research
08/05/2020

Learning from Sparse Demonstrations

This paper proposes an approach which enables a robot to learn an object...
