Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement Learning

01/12/2022
by   Baicen Xiao, et al.

This paper considers multi-agent reinforcement learning (MARL) tasks where agents receive a shared global reward at the end of an episode. Because this reward is delayed, it is harder for the agents to assess the quality of their actions at intermediate time-steps. We focus on developing methods to learn a temporal redistribution of the episodic reward to obtain a dense reward signal. Solving such MARL problems requires addressing two challenges: identifying (1) the relative importance of states along the length of an episode (along time), and (2) the relative importance of individual agents' states at any single time-step (among agents). We introduce Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement Learning (AREL) to address these two challenges. AREL uses attention mechanisms to characterize the influence of actions on state transitions along trajectories (temporal attention), and how each agent is affected by other agents at each time-step (agent attention). The redistributed rewards predicted by AREL are dense, and can be integrated with any given MARL algorithm. We evaluate AREL on challenging tasks from the Particle World environment and the StarCraft Multi-Agent Challenge. AREL achieves higher rewards in Particle World and improved win rates in StarCraft compared to three state-of-the-art reward redistribution methods. Our code is available at https://github.com/baicenxiao/AREL.
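To make the two attention directions concrete, the following is a minimal sketch in plain Python, not the authors' implementation (see the linked repository for that). It assumes each agent's state-action pair has already been embedded as a small feature vector, applies causal self-attention along time for each agent, then self-attention among agents at each time-step, and finally splits the episodic reward across time-steps. The function names `attend` and `redistribute`, the untrained placeholder scoring head (a mean over features), and the softmax normalisation that forces the dense rewards to sum to the episodic reward are all assumptions made for illustration; a real model would learn projections and a reward head from data.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of vectors.

    With causal=True, position i only attends to positions <= i.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for i, q in enumerate(queries):
        limit = i + 1 if causal else len(keys)
        weights = softmax([dot(q, k) / scale for k in keys[:limit]])
        # zip stops at len(weights), so masked positions are ignored.
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def redistribute(traj, episodic_reward):
    """Split one episodic reward into per-time-step rewards.

    traj[t][n] is the (hypothetical) feature vector of agent n at step t.
    """
    T, N = len(traj), len(traj[0])
    # Temporal attention: each agent attends over its own past steps.
    temporal = []  # temporal[n][t]
    for n in range(N):
        seq = [traj[t][n] for t in range(T)]
        temporal.append(attend(seq, seq, seq, causal=True))
    # Agent attention: at each step, agents attend to one another.
    mixed = []  # mixed[t][n]
    for t in range(T):
        step = [temporal[n][t] for n in range(N)]
        mixed.append(attend(step, step, step))
    # Placeholder scoring head (assumption): mean feature, summed over agents.
    scores = [sum(sum(vec) / len(vec) for vec in mixed[t]) for t in range(T)]
    # Assumption: normalise so the dense rewards sum to the episodic reward.
    return [w * episodic_reward for w in softmax(scores)]


if __name__ == "__main__":
    random.seed(0)
    T, N, D = 5, 3, 4  # time-steps, agents, feature size (illustrative)
    traj = [[[random.gauss(0.0, 1.0) for _ in range(D)] for _ in range(N)]
            for _ in range(T)]
    dense = redistribute(traj, episodic_reward=10.0)
    print([round(r, 3) for r in dense])
```

The resulting list has one entry per time-step, so it could stand in for the environment reward when training any MARL algorithm that expects per-step rewards, which is the integration point the abstract describes.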

