Decentralised Learning in Systems with Many, Many Strategic Agents

03/13/2018
by David Mguni, et al.

Although multi-agent reinforcement learning can tackle systems of strategically interacting entities, it currently scales poorly and lacks rigorous convergence guarantees. Crucially, learning in multi-agent systems can become intractable because the state-action space explodes in size as the number of agents increases. In this paper, we propose a method for computing closed-loop optimal policies in multi-agent systems that scales independently of the number of agents. This allows us to show, for the first time, successful convergence to optimal behaviour in systems with an unbounded number of interacting adaptive learners. Studying the asymptotic regime of N-player stochastic games, we devise a learning protocol that is guaranteed to converge to equilibrium policies even when the number of agents is extremely large. Our method is model-free and completely decentralised, so each agent need only observe its local state information and its realised rewards. We validate these theoretical results by showing convergence to Nash-equilibrium policies in applications from economics and control theory with thousands of strategically interacting agents.

research
10/21/2020

On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality

In this work, we study the system of interacting non-cooperative two Q-l...
research
10/01/2020

D3C: Reducing the Price of Anarchy in Multi-Agent Learning

Even in simple multi-agent systems, fixed incentives can lead to outcome...
research
02/02/2022

Data-Driven Behaviour Estimation in Parametric Games

A central question in multi-agent strategic games deals with learning th...
research
06/01/2021

Gradient Play in Multi-Agent Markov Stochastic Games: Stationary Points and Convergence

We study the performance of the gradient play algorithm for multi-agent ...
research
07/20/2017

Consistent Tomography under Partial Observations over Adaptive Networks

This work studies the problem of inferring whether an agent is directly ...
research
02/03/2018

Learning Parametric Closed-Loop Policies for Markov Potential Games

Multiagent systems where the agents interact among themselves and with a...
research
09/16/2016

A Formal Solution to the Grain of Truth Problem

A Bayesian agent acting in a multi-agent environment learns to predict t...
