Learning Multi-Object Positional Relationships via Emergent Communication

02/16/2023
by   Yicheng Feng, et al.
0

The study of emergent communication has been dedicated to interactive artificial intelligence. While existing work focuses on communication about single objects or complex image scenes, we argue that communicating relationships between multiple objects is important in more realistic tasks, but understudied. In this paper, we try to fill this gap and focus on emergent communication about positional relationships between two objects. We train agents in the referential game where observations contain two objects, and find that generalization is the major problem when the positional relationship is involved. The key factor affecting the generalization ability of the emergent language is the input variation between Speaker and Listener, which is realized by a random image generator in our work. Further, we find that the learned language can generalize well in a new multi-step MDP task where the positional relationship describes the goal, and performs better than raw-pixel images as well as pre-trained image features, verifying the strong generalization ability of discrete sequences. We also show that language transfer from the referential game performs better in the new task than learning language directly in this task, implying the potential benefits of pre-training in referential games. All in all, our experiments demonstrate the viability and merit of having agents learn to communicate positional relationships between multiple objects through emergent communication.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/02/2021

PatchGame: Learning to Signal Mid-level Patches in Referential Games

We study a referential game (a type of signaling game) where two agents ...
research
02/21/2020

Learning Dynamic Knowledge Graphs to Generalize on Text-Based Games

Playing text-based games requires skill in processing natural language a...
research
05/30/2022

VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models

Recent advances in vision-language pre-training (VLP) have demonstrated ...
research
07/07/2023

SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks

The existing internet-scale image and video datasets cover a wide range ...
research
07/20/2023

Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection

Camouflaged object detection (COD), aiming to segment camouflaged object...
research
11/30/2020

Facilitating the Communication of Politeness through Fine-Grained Paraphrasing

Aided by technology, people are increasingly able to communicate across ...
research
05/23/2016

Towards Multi-Agent Communication-Based Language Learning

We propose an interactive multimodal framework for language learning. In...

Please sign up or login with your details

Forgot password? Click here to reset