
Lenient Multi-Agent Deep Reinforcement Learning

by   Gregory Palmer, et al.
University of Liverpool
Centrum Wiskunde & Informatica

A significant amount of research in recent years has been devoted to single-agent deep reinforcement learning. Much of the success of deep reinforcement learning can be attributed to the use of experience replay memories, within which state transitions are stored. Function approximators such as convolutional neural networks (referred to as deep Q-Networks, or DQNs, in this context) can subsequently be trained by sampling the stored transitions. However, care is required when using experience replay memories within multi-agent systems, as stored transitions can become outdated when agents update their respective policies in parallel [1]. In this work we apply leniency [2] to multi-agent deep reinforcement learning (MA-DRL), where it acts as a control mechanism that determines which sampled state transitions are allowed to update the DQN. The resulting Lenient-DQN (LDQN) is evaluated on variations of the Coordinated Multi-Agent Object Transportation Problem (CMOTP) outlined by Busoniu et al. [3]. The LDQN significantly outperforms the existing hysteretic DQN (HDQN) [4] in environments that yield stochastic rewards. Based on results from experiments conducted with vanilla and double Q-learning versions of the lenient and hysteretic algorithms, we advocate a hybrid approach in which learners initially use vanilla Q-learning and then transition to double Q-learning once they have converged on a cooperative joint policy.
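To make the idea of leniency as an update gate concrete, the following is a minimal tabular sketch, not the LDQN itself. It assumes the leniency formulation commonly used in lenient Q-learning, where each state-action pair has a temperature that decays as the pair is visited and leniency is computed as 1 - exp(-k * temperature). The names `leniency`, `lenient_update`, `k`, `decay`, and the default values are illustrative assumptions and do not come from the abstract.

```python
import math
import random


def leniency(temperature, k=2.0):
    """Map a temperature to a leniency value in [0, 1).

    Assumed form: high temperature (early training) gives high leniency.
    """
    return 1.0 - math.exp(-k * temperature)


def lenient_update(q, temp, key, target, alpha=0.1, k=2.0, decay=0.9, rng=random):
    """Apply one leniency-gated update to the Q-value stored under `key`.

    q, temp: dicts mapping a (state, action) key to a Q-value / temperature.
    target:  the bootstrapped target, e.g. r + gamma * max_a' Q(s', a').
    Returns True if the Q-value was changed, False if the update was forgiven.
    """
    current = q.get(key, 0.0)
    t = temp.get(key, 1.0)  # hypothetical initial temperature of 1.0
    delta = target - current

    # Positive updates always pass. A negative update passes only when a
    # uniform draw exceeds the current leniency, so it is mostly ignored
    # while the temperature is still high.
    accept = delta > 0 or rng.random() > leniency(t, k)
    if accept:
        q[key] = current + alpha * delta

    # Cool the temperature so the learner becomes less forgiving over time.
    temp[key] = decay * t
    return accept
```

Under these assumptions, a learner forgives most low returns early on, which may be caused by teammates still exploring, and gradually starts accepting them as the temperature of the visited state-action pair decays.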



