Learning to Guide and to Be Guided in the Architect-Builder Problem

12/14/2021
by Paul Barde, et al.

We are interested in interactive agents that learn to coordinate: a builder, which performs actions but is unaware of the goal of the task (it has no access to rewards), and an architect, which guides the builder towards that goal. We define and explore a formal setting where artificial agents are equipped with mechanisms that allow them to simultaneously learn a task and evolve a shared communication protocol. Ideally, such learning should rely only on high-level communication priors, handle a large variety of tasks and meanings, and derive communication protocols that can be reused across tasks.

We present the Architect-Builder Problem (ABP): an asymmetric setting in which an architect must learn to guide a builder towards constructing a specific structure. The architect knows the target structure but cannot act in the environment; it can only send arbitrary messages to the builder. The builder, on the other hand, can act in the environment but receives no reward and has no knowledge of the task; it must learn to solve it relying only on the messages sent by the architect. Crucially, the meaning of the messages is initially neither defined nor shared between the agents, and must be negotiated throughout learning.

Under these constraints, we propose Architect-Builder Iterated Guiding (ABIG), a solution to ABP in which the architect leverages a learned model of the builder to guide it, while the builder uses self-imitation learning to reinforce its guided behavior. We analyze the key learning mechanisms of ABIG and test it on 2D tasks involving grasping cubes, placing them at a given location, or building various shapes. ABIG yields a low-level, high-frequency guiding communication protocol that not only enables an architect-builder pair to solve the task at hand, but also generalizes to unseen tasks.
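The core ABIG loop described in the abstract, where the architect guides via a learned model of the builder and the builder self-imitates its guided behavior, can be illustrated with a toy tabular sketch. This is not the authors' implementation (the paper uses learned neural policies and multi-step construction tasks); the count-based tables, the three-message/three-action setup, and all names here are illustrative assumptions. The architect keeps counts of which action each message has elicited and greedily sends the message its model says is most likely to produce the action it privately wants; the builder, who never sees a reward, samples an action from a message-conditioned policy and reinforces whatever it just did for that message.

```python
import random

random.seed(0)
MESSAGES = [0, 1, 2]
ACTIONS = [0, 1, 2]

# Builder: stochastic policy pi(a | m) stored as preference counts.
# "Self-imitation" here means incrementing the count of the action taken.
builder_counts = {m: {a: 1.0 for a in ACTIONS} for m in MESSAGES}

def builder_act(m):
    weights = [builder_counts[m][a] for a in ACTIONS]
    return random.choices(ACTIONS, weights=weights)[0]

# Architect: count-based model N[m][a] of how the builder reacts to messages.
architect_model = {m: {a: 1.0 for a in ACTIONS} for m in MESSAGES}

def architect_message(target_action):
    # Greedily pick the message the model predicts is most likely
    # to make the builder perform the target action.
    def p_target(m):
        return architect_model[m][target_action] / sum(architect_model[m].values())
    return max(MESSAGES, key=p_target)

for step in range(2000):
    target = random.choice(ACTIONS)      # architect's private goal
    m = architect_message(target)        # architect guides via a message
    a = builder_act(m)                   # builder acts; it never sees a reward
    architect_model[m][a] += 1.0         # architect refines its builder model
    builder_counts[m][a] += 1.0          # builder self-imitates its behavior

# Evaluate the emergent protocol: for each target action, does the
# builder's most-preferred response to the chosen message match it?
def builder_best(m):
    return max(ACTIONS, key=lambda a: builder_counts[m][a])

success = sum(builder_best(architect_message(t)) == t for t in ACTIONS)
print(f"{success}/3 target actions reliably elicited")
```

The mutual reinforcement is the point: whichever message-action associations arise early are amplified by both agents, so a usable protocol is negotiated without any predefined meanings. Purely greedy guiding can, however, leave two targets competing for the same message, which is one reason the full method pairs the learned builder model with richer, iterated training.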


Related research

- TarMAC: Targeted Multi-Agent Communication (10/26/2018)
- Mastering emergent language: learning to guide in simulated navigation (08/14/2019)
- Self-Imitation Learning in Sparse Reward Settings (10/14/2020)
- Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks (02/08/2016)
- Simulation of robot swarms for learning communication-aware coordination (02/25/2023)
- Compositional Imitation Learning: Explaining and executing one task at a time (12/04/2018)
- Learning to Request Guidance in Emergent Communication (12/11/2019)
