Learning Multi-agent Implicit Communication Through Actions: A Case Study in Contract Bridge, a Collaborative Imperfect-Information Game

10/10/2018
by Zheng Tian, et al.

In situations where explicit communication is limited, a human collaborator is typically able to learn to (i) infer the meaning behind their partner's actions and (ii) balance actions that exploit their current understanding of the state against actions that convey private information about the state to their partner. The first component of this learning process has been well studied in multi-agent systems, whereas the second, which is equally crucial for successful collaboration, has not. In this work, we complete the learning process and introduce a novel algorithm, Policy-Belief-Iteration ("P-BIT"), which mimics both components. A belief module models the other agent's private information by observing their actions, while a policy module uses the inferred private information to return a distribution over actions. The two modules are mutually reinforced with an EM-like algorithm. We use a novel auxiliary reward to encourage information exchange through actions. We evaluate our approach on the non-competitive bidding problem from contract bridge and show that, through self-play, agents are able to collaborate effectively with implicit communication, and that P-BIT outperforms several meaningful baselines.
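To make the belief-module idea concrete, the following is a minimal sketch of how observing a partner's action can sharpen a belief about their private information. It uses a plain tabular Bayes update and is not the authors' P-BIT implementation; whether P-BIT's belief module works this way is not stated in the abstract. The function name `update_belief`, the hidden states `"strong"` and `"weak"`, and all probabilities are hypothetical values chosen only for illustration.

```python
def update_belief(prior, policy, action):
    """Bayes update of a belief over the partner's hidden state.

    prior:  dict mapping hidden state -> prior probability (assumed toy values)
    policy: dict mapping hidden state -> {action: probability}, i.e. the
            partner's assumed policy conditioned on their private information
    action: the action actually observed from the partner
    """
    # Unnormalized posterior: prior(h) * P(action | h)
    unnorm = {h: prior[h] * policy[h].get(action, 0.0) for h in prior}
    total = sum(unnorm.values())
    if total == 0.0:
        # The observed action is impossible under the assumed policy;
        # fall back to the prior rather than dividing by zero.
        return dict(prior)
    return {h: p / total for h, p in unnorm.items()}


# Hypothetical example: the partner holds a "strong" or "weak" hand.
prior = {"strong": 0.5, "weak": 0.5}
policy = {
    "strong": {"bid": 0.9, "pass": 0.1},
    "weak": {"bid": 0.2, "pass": 0.8},
}
posterior = update_belief(prior, policy, "bid")
```

Under these toy numbers, observing a bid moves the belief in a strong hand from 0.5 to roughly 0.82. The more an action's probability differs across hidden states, the more information it carries, which is the kind of signalling the auxiliary reward described above is meant to encourage.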


research
08/14/2020

Joint Policy Search for Multi-agent Collaboration with Imperfect Information

To learn good joint policies for multi-agent collaboration with imperfec...
research
06/11/2020

Learning Individually Inferred Communication for Multi-Agent Cooperation

Communication lays the foundation for human cooperation. It is also cruc...
research
11/04/2018

Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning

When observing the actions of others, humans carry out inferences about ...
research
07/17/2021

Implicit Communication as Minimum Entropy Coupling

In many common-payoff games, achieving good performance requires players...
research
07/09/2020

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

Autonomous agents must learn to collaborate. It is not scalable to devel...
research
06/14/2022

Universally Expressive Communication in Multi-Agent Reinforcement Learning

Allowing agents to share information through communication is crucial fo...
research
10/24/2022

Interactive inference: a multi-agent model of cooperative joint actions

We advance a novel computational model of multi-agent, cooperative joint...
