Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus

06/01/2022
by Qiwen Cui, et al.

This paper considers offline multi-agent reinforcement learning. We propose the strategy-wise concentration principle, which directly builds a confidence interval for the joint strategy, in contrast to the point-wise concentration principle, which builds a confidence interval for each point in the joint action space. For two-player zero-sum Markov games, by exploiting the convexity of the strategy-wise bonus, we propose a computationally efficient algorithm whose sample complexity has a better dependency on the number of actions than prior methods based on the point-wise bonus. Furthermore, for offline multi-agent general-sum Markov games, based on the strategy-wise bonus and a novel surrogate function, we give the first algorithm whose sample complexity scales only with $\sum_{i=1}^{m} A_i$, where $A_i$ is the action size of the $i$-th player and $m$ is the number of players. In sharp contrast, the sample complexity of methods based on the point-wise bonus would scale with the size of the joint action space, $\prod_{i=1}^{m} A_i$, due to the curse of multiagents. Lastly, all of our algorithms can naturally take a pre-specified strategy class $\Pi$ as input and output a strategy that is close to the best strategy in $\Pi$. In this setting, the sample complexity scales only with $\log |\Pi|$ instead of $\sum_{i=1}^{m} A_i$.
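To make the gap between the two scalings concrete, the following minimal sketch compares the sum of the per-player action sizes with the size of the joint action space for a few hypothetical games. The function name `action_scaling` and the example action sizes are illustrative assumptions, not taken from the paper, and the numbers are only the action-dependent factors named in the abstract, not full sample-complexity bounds.

```python
import math

def action_scaling(action_sizes):
    """Return (sum of A_i, product of A_i) for a list of per-player action sizes.

    The sum is the action-dependent factor the abstract attributes to the
    strategy-wise bonus; the product is the joint action space size that
    point-wise-bonus methods would scale with. All other factors of any
    actual bound are deliberately omitted here.
    """
    return sum(action_sizes), math.prod(action_sizes)

# Hypothetical games: m players, each with 5 actions.
for m in (2, 4, 8):
    total, joint = action_scaling([5] * m)
    print(f"m={m}: sum A_i = {total}, prod A_i = {joint}")
```

With 5 actions per player, the sum grows linearly in the number of players while the product grows exponentially, which is the contrast the abstract refers to as the curse of multiagents.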


