Training an Interactive Helper

06/24/2019
by Mark Woodward, et al.

Developing agents that can quickly adapt their behavior to new tasks remains a challenge. Meta-learning has been applied to this problem, but previous methods require either specifying a reward function, which can be tedious, or providing demonstrations, which can be inefficient. In this paper, we investigate whether, and how, a "helper" agent can be trained to interactively adapt its behavior to maximize the reward of another agent, which we call the "prime" agent, without observing that agent's reward or receiving explicit demonstrations. To this end, we propose to meta-learn a helper agent jointly with a prime agent that, during training, observes the reward function and serves as a surrogate for a human prime. We introduce a distribution of multi-agent cooperative foraging tasks in which only the prime agent knows which objects should be collected. We demonstrate that the trained helper rapidly infers and collects the correct objects from the physical communication that emerges between the two agents.
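
To make the task setup concrete, here is a minimal Python sketch of a toy foraging episode in which only the prime knows the rewarded object type. It is an illustration under stated assumptions, not the environment or method from the paper: the 1-D world, the function names (`sample_task`, `run_episode`, and so on), and the hand-coded rule "copy the type the prime picks up first" are all hypothetical. In particular, the helper here follows a fixed inference rule, whereas the paper meta-learns the helper's behavior.

```python
import random


def sample_task(rng, n_types=2, n_objects=6, length=20):
    """Place objects on a line and pick the rewarded type (hidden from the helper)."""
    cells = rng.sample(range(1, length), n_objects)  # cell 0 is the start, kept empty
    objects = {cell: i % n_types for i, cell in enumerate(cells)}  # position -> type
    target_type = rng.randrange(n_types)
    return objects, target_type


def step_toward(pos, goal):
    """Move one cell toward goal (or stay put if already there)."""
    return pos + (goal > pos) - (goal < pos)


def nearest(objects, wanted_type, pos):
    """Position of the closest object of wanted_type, or None if none remain."""
    candidates = [c for c, t in objects.items() if t == wanted_type]
    return min(candidates, key=lambda c: abs(c - pos)) if candidates else None


def run_episode(objects, target_type, max_steps=100):
    objects = dict(objects)  # do not mutate the caller's task
    prime_pos = helper_pos = 0
    inferred_type = None  # the helper's belief; it never reads target_type
    collected = {"prime": [], "helper": []}
    for _ in range(max_steps):
        # Prime: knows the reward, so it heads for the nearest rewarded object.
        goal = nearest(objects, target_type, prime_pos)
        if goal is not None:
            prime_pos = step_toward(prime_pos, goal)
            if prime_pos == goal:
                picked = objects.pop(goal)
                collected["prime"].append(picked)
                inferred_type = picked  # helper observes the pickup (assumed rule)
        # Helper: acts only on what it has observed the prime do.
        if inferred_type is not None:
            goal = nearest(objects, inferred_type, helper_pos)
            if goal is not None:
                helper_pos = step_toward(helper_pos, goal)
                if helper_pos == goal:
                    collected["helper"].append(objects.pop(goal))
        # Environment-side termination: stop once no rewarded objects remain.
        if nearest(objects, target_type, 0) is None:
            break
    return collected


rng = random.Random(0)
objects, target_type = sample_task(rng)
collected = run_episode(objects, target_type)
print("target type:", target_type, "collected:", collected)
```

The helper stays idle until the prime's first pickup reveals the rewarded type, which mirrors, in a very reduced form, the idea that the helper must read the task from the prime's physical behavior rather than from a reward signal.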
