Strangeness-driven Exploration in Multi-Agent Reinforcement Learning

12/27/2022
by Ju-Bong Kim, et al.

Efficient exploration is one of the essential issues in cooperative multi-agent reinforcement learning (MARL) algorithms that require complex coordination. In this study, we introduce a new exploration method based on strangeness that can be easily incorporated into any centralized training and decentralized execution (CTDE)-based MARL algorithm. Strangeness refers to the degree of unfamiliarity of the observations an agent visits. To give the observation strangeness a global perspective, it is also augmented with the degree of unfamiliarity of the visited entire state. The exploration bonus is obtained from this strangeness, and the proposed exploration method is largely unaffected by the stochastic transitions commonly observed in MARL tasks. To prevent a high exploration bonus from making MARL training insensitive to extrinsic rewards, we also propose a separate action-value function trained with both the extrinsic reward and the exploration bonus, on which the behavioral policy that generates transitions is based. This makes CTDE-based MARL algorithms more stable when they are combined with an exploration method. Through a comparative evaluation on didactic examples and the StarCraft Multi-Agent Challenge, we show that the proposed exploration method achieves significant performance improvements in CTDE-based MARL algorithms.
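The abstract does not specify how strangeness is measured, so the sketch below is only an illustration of the overall idea: it assumes strangeness is estimated as the reconstruction error of an observation autoencoder, augments it with a global state strangeness term, and feeds the resulting bonus into a separate behavioral value target alongside the extrinsic reward. All names here (ObsAutoencoder, exploration_bonus, behavioral_td_target, beta, w_state) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a strangeness-style exploration bonus (assumed estimator:
# autoencoder reconstruction error; not taken from the paper itself).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ObsAutoencoder(nn.Module):
    """Autoencoder whose reconstruction error serves as a 'strangeness' signal."""

    def __init__(self, dim, hidden=64, latent=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, latent))
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return self.dec(self.enc(x))

    def strangeness(self, x):
        # Higher reconstruction error -> less familiar (stranger) input.
        with torch.no_grad():
            return F.mse_loss(self(x), x, reduction="none").mean(dim=-1)


def exploration_bonus(obs_ae, state_ae, obs, state, w_state=0.5):
    """Per-agent observation strangeness augmented with global state strangeness."""
    return obs_ae.strangeness(obs) + w_state * state_ae.strangeness(state)


def behavioral_td_target(r_ext, bonus, q_next_max, gamma=0.99, beta=0.1):
    """Target for a separate behavioral action-value function that mixes the
    extrinsic reward with the scaled exploration bonus; the main CTDE value
    function would still be trained on the extrinsic reward alone."""
    return r_ext + beta * bonus + gamma * q_next_max
```

In this sketch, the behavioral action-value function drives data collection while the extrinsic-only value function remains the learning target, which is one way to keep a large bonus from washing out the task reward, consistent with the motivation stated in the abstract.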


