ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward

10/09/2022
by Zixian Ma, et al.

Modern multi-agent reinforcement learning frameworks rely on centralized training and reward shaping to perform well. However, centralized training and dense rewards are not readily available in the real world. Current multi-agent algorithms struggle to learn in the alternative setup of decentralized training or sparse rewards. To address these issues, we propose ELIGN (expectation alignment), a self-supervised intrinsic reward inspired by the self-organization principle in zoology. Just as animals collaborate in a decentralized manner with those in their vicinity, agents trained with expectation alignment learn behaviors that match their neighbors' expectations. This allows agents to learn collaborative behaviors without any external reward or centralized training. We demonstrate the efficacy of our approach across six tasks in the multi-agent particle and the complex Google Research Football environments, comparing ELIGN to sparse and curiosity-based intrinsic rewards. As the number of agents increases, ELIGN scales well in all multi-agent tasks except one in which agents have different capabilities. We show that agent coordination improves through expectation alignment because agents learn to divide tasks amongst themselves, break coordination symmetries, and confuse adversaries. These results identify tasks where expectation alignment is a more useful strategy than curiosity-driven exploration for multi-agent coordination, enabling agents to achieve zero-shot coordination.
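The core idea of an expectation-alignment reward can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each neighbor maintains a forward model that predicts this agent's next observation, and the agent is rewarded for matching those predictions (the function name and signature here are hypothetical).

```python
import numpy as np

def expectation_alignment_reward(next_obs, neighbor_predictions):
    """Hypothetical sketch of an expectation-alignment intrinsic reward.

    next_obs: the agent's actual next observation, shape (d,).
    neighbor_predictions: list of each nearby neighbor's predicted next
        observation for this agent, each of shape (d,).

    Returns the negative mean squared prediction error, so the agent is
    rewarded (less penalized) for behaving as its neighbors expect.
    """
    if not neighbor_predictions:
        return 0.0  # no neighbors in the vicinity: no alignment signal
    errors = [np.mean((next_obs - pred) ** 2) for pred in neighbor_predictions]
    return -float(np.mean(errors))
```

Because the signal depends only on nearby neighbors' predictions, it requires no external reward and no centralized trainer, which is the decentralized, sparse-reward setting the abstract describes.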


