Greedy UnMixing for Q-Learning in Multi-Agent Reinforcement Learning

09/19/2021
by   Chapman Siu, et al.
0

This paper introduces Greedy UnMix (GUM) for cooperative multi-agent reinforcement learning (MARL). Greedy UnMix aims to avoid scenarios where MARL methods fail due to overestimation of values as part of the large joint state-action space. It aims to address this through a conservative Q-learning approach through restricting the state-marginal in the dataset to avoid unobserved joint state action spaces, whilst concurrently attempting to unmix or simplify the problem space under the centralized training with decentralized execution paradigm. We demonstrate the adherence to Q-function lower bounds in the Q-learning for MARL scenarios, and demonstrate superior performance to existing Q-learning MARL approaches as well as more general MARL algorithms over a set of benchmark MARL tasks, despite its relative simplicity compared with state-of-the-art approaches.

READ FULL TEXT

page 8

page 10

research
08/15/2022

Transformer-based Value Function Decomposition for Cooperative Multi-agent Reinforcement Learning in StarCraft

The StarCraft II Multi-Agent Challenge (SMAC) was created to be a challe...
research
05/14/2019

QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning

We explore value-based solutions for multi-agent reinforcement learning ...
research
03/28/2022

UNMAS: Multi-Agent Reinforcement Learning for Unshaped Cooperative Scenarios

Multi-agent reinforcement learning methods such as VDN, QMIX, and QTRAN ...
research
11/11/2019

SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning

Learning a stable and generalizable centralized value function (CVF) is ...
research
12/22/2020

QVMix and QVMix-Max: Extending the Deep Quality-Value Family of Algorithms to Cooperative Multi-Agent Reinforcement Learning

This paper introduces four new algorithms that can be used for tackling ...
research
01/26/2022

Exploiting Semantic Epsilon Greedy Exploration Strategy in Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) can model many real world appl...
research
05/19/2022

Beyond Greedy Search: Tracking by Multi-Agent Reinforcement Learning-based Beam Search

Existing trackers usually select a location or proposal with the maximum...

Please sign up or login with your details

Forgot password? Click here to reset