RadGrad: Active learning with loss gradients

06/18/2019
by Paul Budnarain, et al.

Solving sequential decision prediction problems, including those in imitation learning settings, requires mitigating covariate shift. The standard approach, DAgger, relies on capturing expert behaviour in every state that the agent reaches. In real-world settings, however, querying an expert is costly. We propose a new active learning algorithm that selectively queries the expert based on both a prediction of agent error and a proxy for agent risk. The algorithm maintains the performance of systems that query the expert without restriction while substantially reducing the number of expert queries made. We show that our approach, RadGrad, has the potential to improve upon existing safety-aware algorithms, and that it matches or exceeds the performance of DAgger and variants (i.e., SafeDAgger) in one simulated environment. However, we also find that a more complex environment poses challenges not only to our proposed method but also to existing safety-aware algorithms, which do not match the performance of DAgger in our experiments.
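To make the idea of selective querying concrete, the following is a minimal sketch of a query gate that consults the expert only when a predicted error or a risk proxy is high. It is not the paper's implementation: the names `QueryGate`, `should_query`, and `select_queries`, the two thresholds, and the choice to combine the two signals with a logical OR are all assumptions made for illustration. The abstract states only that both signals inform the decision.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class QueryGate:
    """Hypothetical gate deciding whether to ask the expert in a given state."""
    error_threshold: float  # assumed cutoff on the predicted agent error
    risk_threshold: float   # assumed cutoff on the risk proxy (e.g. a loss-gradient norm)

    def should_query(self, predicted_error: float, risk_proxy: float) -> bool:
        # Assumption: query when EITHER signal exceeds its threshold.
        return (predicted_error > self.error_threshold
                or risk_proxy > self.risk_threshold)


def select_queries(
    states: Sequence[float],
    predict_error: Callable[[float], float],
    risk_proxy: Callable[[float], float],
    gate: QueryGate,
) -> List[float]:
    """Return the subset of visited states for which the expert would be queried."""
    return [s for s in states
            if gate.should_query(predict_error(s), risk_proxy(s))]


# Toy usage with made-up signals: error grows with |state|, risk is constant.
gate = QueryGate(error_threshold=0.5, risk_threshold=1.0)
visited = [0.1, 0.4, 0.9, -0.7]
queried = select_queries(visited, lambda s: abs(s), lambda s: 0.0, gate)
```

In this toy run only the states whose predicted error exceeds 0.5 are sent to the expert, whereas DAgger-style training would label every visited state.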

research
10/16/2012

Active Imitation Learning via Reduction to I.I.D. Active Learning

In standard passive imitation learning, the goal is to learn a target po...
research
05/26/2020

Active Imitation Learning with Noisy Guidance

Imitation learning algorithms provide state-of-the-art results on many s...
research
07/01/2019

Active Learning within Constrained Environments through Imitation of an Expert Questioner

Active learning agents typically employ a query selection algorithm whic...
research
06/08/2020

Primal Wasserstein Imitation Learning

Imitation Learning (IL) methods seek to match the behavior of an agent w...
research
03/08/2023

Embodied Active Learning of Relational State Abstractions for Bilevel Planning

State abstraction is an effective technique for planning in robotics env...
research
09/24/2019

Avoidance Learning Using Observational Reinforcement Learning

Imitation learning seeks to learn an expert policy from sampled demonstr...
research
05/23/2022

Data augmentation for efficient learning from parametric experts

We present a simple, yet powerful data-augmentation technique to enable ...
