Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

02/28/2017
by   Jakob Foerster, et al.

Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly with problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.
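To make the second method more concrete, the following is a minimal sketch of a replay memory that attaches a fingerprint to each stored observation, so that a value function trained on sampled data can tell how old each sample is. The abstract does not say what the fingerprint contains; the choice here (training iteration and exploration rate), the class name `FingerprintReplayBuffer`, and all method and parameter names are assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import deque


class FingerprintReplayBuffer:
    """Replay memory that tags each stored observation with a fingerprint.

    Hypothetical sketch: the fingerprint is assumed to be the pair
    (training iteration, exploration rate) at the time of storage.
    """

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.memory = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done, iteration, epsilon):
        # Assumed fingerprint contents; the abstract only requires that the
        # fingerprint disambiguates the age of the data.
        fingerprint = (float(iteration), float(epsilon))
        # Simplification: the same fingerprint is appended to both the
        # observation and the next observation.
        self.memory.append((
            tuple(obs) + fingerprint,
            action,
            reward,
            tuple(next_obs) + fingerprint,
            done,
        ))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement, capped at the buffer size.
        k = min(batch_size, len(self.memory))
        return rng.sample(list(self.memory), k)


if __name__ == "__main__":
    buf = FingerprintReplayBuffer(capacity=1000)
    buf.add([0.1, 0.2], 1, 0.0, [0.3, 0.4], False, iteration=10, epsilon=0.9)
    print(buf.sample(1))
```

Each agent's Q-network would then take the augmented observation as input, so that transitions collected under older policies of the other agents are distinguishable from recent ones rather than appearing as contradictory targets.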


