UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning

10/06/2020
by   Tarun Gupta, et al.

This paper focuses on cooperative value-based multi-agent reinforcement learning (MARL) in the paradigm of centralized training with decentralized execution (CTDE). Current state-of-the-art value-based MARL methods leverage CTDE to learn a centralized joint-action value function as a monotonic mixing of each agent's utility function, which enables easy decentralization. However, this monotonic restriction leads to inefficient exploration in tasks with nonmonotonic returns, because the values of joint actions are approximated suboptimally. To address this, we present a novel MARL approach called Universal Value Exploration (UneVEn), which uses universal successor features (USFs) to learn, in a sample-efficient manner, policies for tasks that are related to the target task but have simpler reward functions. UneVEn uses novel action-selection schemes between randomly sampled related tasks during exploration, which enables the monotonic joint-action value function of the target task to place more importance on useful joint actions. Empirical results on a challenging cooperative predator-prey task requiring significant coordination among agents show that UneVEn significantly outperforms state-of-the-art baselines.
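
To illustrate why a monotonic mixing "enables easy decentralization", here is a minimal sketch. It is not the authors' implementation: the linear mixer, the function names, and the toy utility values are all assumptions made for illustration. It shows that when the joint value is non-decreasing in every agent's utility, the joint greedy action can be recovered by letting each agent maximize its own utility independently.

```python
import itertools
import numpy as np

def monotonic_mix(utilities, weights):
    """Hypothetical linear mixer: non-negative weights keep the joint
    value non-decreasing in every agent's utility."""
    return float(np.dot(np.abs(weights), utilities))

def decentralized_greedy(per_agent_q):
    """Each agent picks the argmax of its own utility, with no communication."""
    return tuple(int(np.argmax(q)) for q in per_agent_q)

def centralized_greedy(per_agent_q, weights):
    """Brute-force search over all joint actions of the mixed value."""
    action_ranges = [range(len(q)) for q in per_agent_q]
    return max(
        itertools.product(*action_ranges),
        key=lambda joint: monotonic_mix(
            [q[a] for q, a in zip(per_agent_q, joint)], weights
        ),
    )

# Toy values (assumed, not from the paper): two agents, three actions each.
per_agent_q = [np.array([1.0, 3.0, 2.0]), np.array([0.5, -1.0, 4.0])]
weights = np.array([0.7, -1.3])  # the sign is discarded by np.abs in the mixer

joint_from_agents = decentralized_greedy(per_agent_q)
joint_from_search = centralized_greedy(per_agent_q, weights)
```

In this sketch both procedures select the same joint action, which is the property that makes decentralized execution cheap. The flip side, as the abstract notes, is that a monotonic mixer cannot represent payoffs where one agent's best action depends on what the others do.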


research
06/22/2020

QOPT: Optimistic Value Function Decentralization for Cooperative Multi-Agent Reinforcement Learning

We propose a novel value-based algorithm for cooperative multi-agent rei...
research
08/03/2020

QPLEX: Duplex Dueling Multi-Agent Q-Learning

We explore value-based multi-agent reinforcement learning (MARL) in the ...
research
06/18/2020

Weighted QMIX: Expanding Monotonic Value Function Factorisation

QMIX is a popular Q-learning algorithm for cooperative MARL in the centr...
research
03/16/2023

Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning

Efficient exploration is critical in cooperative deep Multi-Agent Reinfo...
research
12/23/2021

Local Advantage Networks for Cooperative Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) enables us to create adaptive ...
research
12/27/2022

Strangeness-driven Exploration in Multi-Agent Reinforcement Learning

Efficient exploration strategy is one of essential issues in cooperative...
research
03/30/2018

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

In many real-world settings, a team of agents must coordinate their beha...
