Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning

06/05/2019
by Wendelin Böhmer, et al.

This paper investigates the use of intrinsic reward to guide exploration in multi-agent reinforcement learning. We discuss the challenges in applying intrinsic reward to multiple collaborative agents and demonstrate how unreliable reward can prevent decentralized agents from learning the optimal policy. We address this problem with a novel framework, Independent Centrally-assisted Q-learning (ICQL), in which decentralized agents share control and an experience replay buffer with a centralized agent. Only the centralized agent is intrinsically rewarded, but the decentralized agents still benefit from improved exploration, without the distraction of unreliable incentives.
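The division of labour described in the abstract can be illustrated with a small tabular sketch. It is not the authors' implementation. Everything concrete here is an assumption made for illustration: the toy chain environment (`step`), the count-based bonus (`intrinsic_bonus`), the rule that a coin flip per episode decides who controls the agents, full state observability for the decentralized agents, and all hyperparameters. The only points taken from the abstract are that the centralized and decentralized agents share control and one replay buffer, and that only the centralized agent's learning target includes the intrinsic reward.

```python
import itertools
import random
from collections import defaultdict, deque

# All constants below are illustrative assumptions, not values from the paper.
N_AGENTS, N_ACTIONS = 2, 2
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1
GOAL, MAX_STEPS = 3, 10
JOINT_ACTIONS = list(itertools.product(range(N_ACTIONS), repeat=N_AGENTS))


def step(state, joint_action):
    """Hypothetical toy chain: the state advances only if both agents pick 1."""
    if joint_action == (1, 1):
        state += 1
    done = state == GOAL
    return state, (1.0 if done else 0.0), done


def intrinsic_bonus(counts, state, joint_action):
    """Count-based novelty bonus (an assumed stand-in for any intrinsic reward)."""
    counts[(state, joint_action)] += 1
    return counts[(state, joint_action)] ** -0.5


def greedy_or_random(q, state, actions, rng):
    """Epsilon-greedy choice over `actions` using the table `q`."""
    if rng.random() < EPSILON:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def update_central(q_c, transition):
    """Centralized Q-learning step: extrinsic plus intrinsic reward."""
    s, a, r_ext, r_int, s2, done = transition
    target = r_ext + r_int
    if not done:
        target += GAMMA * max(q_c[(s2, b)] for b in JOINT_ACTIONS)
    q_c[(s, a)] += ALPHA * (target - q_c[(s, a)])


def update_decentralized(q_i, i, transition):
    """Independent Q-learning step for agent i: the intrinsic reward is ignored."""
    s, a, r_ext, _r_int, s2, done = transition
    target = r_ext
    if not done:
        target += GAMMA * max(q_i[(s2, b)] for b in range(N_ACTIONS))
    q_i[(s, a[i])] += ALPHA * (target - q_i[(s, a[i])])


def run(episodes=200, batch=8, seed=0):
    rng = random.Random(seed)
    q_c = defaultdict(float)                               # central, joint actions
    q_d = [defaultdict(float) for _ in range(N_AGENTS)]    # one table per agent
    counts = defaultdict(int)
    buffer = deque(maxlen=10_000)                          # shared replay buffer
    for _ in range(episodes):
        # Assumed control-sharing rule: a coin flip per episode.
        central_in_control = rng.random() < 0.5
        s = 0
        for _ in range(MAX_STEPS):
            if central_in_control:
                a = greedy_or_random(q_c, s, JOINT_ACTIONS, rng)
            else:
                a = tuple(
                    greedy_or_random(q_d[i], s, list(range(N_ACTIONS)), rng)
                    for i in range(N_AGENTS)
                )
            s2, r_ext, done = step(s, a)
            r_int = intrinsic_bonus(counts, s, a)
            buffer.append((s, a, r_ext, r_int, s2, done))
            # Every learner trains on the same shared experience.
            for t in (rng.choice(buffer) for _ in range(batch)):
                update_central(q_c, t)
                for i in range(N_AGENTS):
                    update_decentralized(q_d[i], i, t)
            s = s2
            if done:
                break
    return q_c, q_d, buffer


if __name__ == "__main__":
    q_c, q_d, buffer = run()
    print("transitions stored:", len(buffer))
```

In this sketch the novelty bonus shapes only the central table, so the decentralized tables estimate returns from the environment reward alone while still training on transitions gathered during the central agent's exploratory episodes.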


research
06/20/2022

MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer

In this paper, we consider cooperative multi-agent reinforcement learnin...
research
10/09/2022

ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward

Modern multi-agent reinforcement learning frameworks rely on centralized...
research
07/18/2019

Prioritized Guidance for Efficient Multi-Agent Reinforcement Learning Exploration

Exploration efficiency is a challenging problem in multi-agent reinforce...
research
02/06/2019

CESMA: Centralized Expert Supervises Multi-Agents

We consider the reinforcement learning problem of training multiple agen...
research
10/12/2019

Influence-Based Multi-Agent Exploration

Intrinsically motivated reinforcement learning aims to address the explo...
research
02/28/2023

On Learning Intrinsic Rewards for Faster Multi-Agent Reinforcement Learning based MAC Protocol Design in 6G Wireless Networks

In this paper, we propose a novel framework for designing a fast converg...
research
10/14/2021

Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication

A challenge in reinforcement learning (RL) is minimizing the cost of sam...
