Exploiting Semantic Epsilon Greedy Exploration Strategy in Multi-Agent Reinforcement Learning

01/26/2022
by Hon Tik Tse, et al.

Multi-agent reinforcement learning (MARL) can model many real-world applications. However, many MARL approaches rely on epsilon greedy for exploration, which may discourage visiting advantageous states in hard scenarios. In this paper, we propose a new approach, QMIX(SEG), for tackling MARL. It combines the value function factorization method QMIX, used to train per-agent policies, with a novel Semantic Epsilon Greedy (SEG) exploration strategy. SEG is a simple extension to the conventional epsilon greedy exploration strategy, yet it is experimentally shown to greatly improve the performance of MARL. We first cluster actions into groups of actions with similar effects and then use the groups in a bi-level epsilon greedy exploration hierarchy for action selection. We argue that SEG facilitates semantic exploration by exploring in the space of groups of actions, which have richer semantic meanings than atomic actions. Experiments show that QMIX(SEG) substantially outperforms QMIX and achieves performance competitive with current state-of-the-art MARL approaches on the StarCraft Multi-Agent Challenge (SMAC) benchmark.
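
The bi-level action selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how the two levels are wired together, so the sketch assumes one plausible reading in which the upper level picks a group (randomly with probability `eps_group`, otherwise the group containing the highest-valued action) and the lower level picks an action inside that group (randomly with probability `eps_action`, otherwise the highest-valued one). The function name, the two epsilon parameters, and the hand-written `groups` mapping are all hypothetical; the clustering step that would produce the groups is not shown.

```python
import random
from typing import Dict, List, Optional, Sequence


def semantic_epsilon_greedy(
    q_values: Sequence[float],
    groups: Dict[str, List[int]],
    eps_group: float,
    eps_action: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick one action index using an assumed two-level epsilon greedy rule.

    q_values[a] is one agent's estimated value of action a.
    groups maps a group label to the action indices in that group
    (assumed to be given, e.g. by a prior clustering step).
    """
    rng = rng or random  # fall back to the module-level generator

    # Upper level: explore in the space of action groups.
    if rng.random() < eps_group:
        group = rng.choice(list(groups.values()))
    else:
        # Greedy group: the one that contains the highest-valued action.
        group = max(groups.values(), key=lambda g: max(q_values[a] for a in g))

    # Lower level: explore among the actions inside the chosen group.
    if rng.random() < eps_action:
        return rng.choice(group)
    return max(group, key=lambda a: q_values[a])


# Hypothetical example: four actions clustered into two groups.
q = [0.1, 0.9, 0.3, 0.2]
action_groups = {"move": [0, 1], "attack": [2, 3]}
greedy_action = semantic_epsilon_greedy(q, action_groups, 0.0, 0.0)
```

With both epsilons set to zero the rule reduces to a plain argmax, so `greedy_action` is 1 here. With a nonzero `eps_group`, a random draw at the upper level moves the agent to a different kind of behaviour (for example, from moving to attacking) rather than to an arbitrary atomic action, which is one way to read the abstract's notion of semantic exploration.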


