Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

05/31/2021
by   Anuj Mahajan, et al.

Reinforcement learning in large action spaces is a challenging problem. Cooperative multi-agent reinforcement learning (MARL) exacerbates matters by imposing various constraints on communication and observability. In this work, we consider the fundamental hurdle affecting both value-based and policy-gradient approaches: an exponential blowup of the action space with the number of agents. For value-based methods, this blowup makes it hard to represent the optimal value function accurately. For policy-gradient methods, it makes training the critic difficult and exacerbates the problem of the lagging critic. We show that, from a learning-theory perspective, both problems can be addressed by accurately representing the associated action-value function with a low-complexity hypothesis class. This requires accurately modelling the agent interactions in a sample-efficient way. To this end, we propose a novel tensorised formulation of the Bellman equation. This gives rise to our method Tesseract, which views the Q-function as a tensor whose modes correspond to the action spaces of different agents. Algorithms derived from Tesseract decompose the Q-tensor across agents and utilise low-rank tensor approximations to model agent interactions relevant to the task. We provide PAC analysis for Tesseract-based algorithms and highlight their relevance to the class of rich-observation MDPs. Empirical results in different domains confirm Tesseract's gains in sample efficiency predicted by the theory.
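
To make the tensor view concrete, here is a minimal sketch of a Q-tensor for one fixed state, stored in low-rank form. The abstract does not say which decomposition is used, so this sketch assumes a CP (CANDECOMP/PARAFAC) form, Q[a_1, ..., a_n] = sum_r w_r * prod_i U_i[a_i][r]. The names `cp_q_value`, `num_parameters`, `factors`, and `weights`, and the toy sizes, are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random


def cp_q_value(factors, weights, joint_action):
    """Q-value of one joint action, read off the CP factors.

    factors[i][a][r] is agent i's factor entry for action a and component r.
    The full tensor with n_actions ** n_agents entries is never built.
    """
    total = 0.0
    for r, w in enumerate(weights):
        total += w * math.prod(f[a][r] for f, a in zip(factors, joint_action))
    return total


def num_parameters(n_agents, n_actions, rank):
    """Return (entries in the full Q-tensor, parameters in the rank-`rank` CP form)."""
    return n_actions ** n_agents, n_agents * n_actions * rank + rank


# Hypothetical toy problem: 3 agents, 4 actions each, CP rank 2 (assumed sizes).
n_agents, n_actions, rank = 3, 4, 2
rng = random.Random(0)
factors = [
    [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_actions)]
    for _ in range(n_agents)
]
weights = [rng.gauss(0.0, 1.0) for _ in range(rank)]

# Greedy joint action by brute-force enumeration; feasible only at toy scale.
best = max(
    itertools.product(range(n_actions), repeat=n_agents),
    key=lambda a: cp_q_value(factors, weights, a),
)

# Storage comparison for 10 agents with 5 actions each at rank 3.
full, low_rank = num_parameters(10, 5, 3)
print(best, full, low_rank)
```

In this sketch the full tensor grows exponentially with the number of agents, while the CP form grows linearly for a fixed rank. The rank then controls how much interaction between agents the representation can capture.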

research
04/05/2021

NQMIX: Non-monotonic Value Function Factorization for Deep Multi-Agent Reinforcement Learning

Multi-agent value-based approaches recently make great progress, especia...
research
01/18/2021

Cooperative and Competitive Biases for Multi-Agent Reinforcement Learning

Training a multi-agent reinforcement learning (MARL) algorithm is more c...
research
10/27/2021

Model based Multi-agent Reinforcement Learning with Tensor Decompositions

A challenge in multi-agent reinforcement learning is to be able to gener...
research
10/27/2021

Reinforcement Learning in Factored Action Spaces using Tensor Decompositions

We present an extended abstract for the previously published work TESSER...
research
07/20/2021

Reinforcement learning autonomously identifying the source of errors for agents in a group mission

When agents are swarmed to carry out a mission, there is often a sudden ...
research
02/08/2023

Efficient Planning in Combinatorial Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning

A practical challenge in reinforcement learning are combinatorial action...
research
02/23/2023

Revisiting the Gumbel-Softmax in MADDPG

MADDPG is an algorithm in multi-agent reinforcement learning (MARL) that...
