Offline Learning in Markov Games with General Function Approximation

02/06/2023
by Yuheng Zhang, et al.

We study offline multi-agent reinforcement learning (RL) in Markov games, where the goal is to learn an approximate equilibrium – such as Nash equilibrium and (Coarse) Correlated Equilibrium – from an offline dataset pre-collected from the game. Existing works consider relatively restricted tabular or linear models and handle each equilibrium separately. In this work, we provide the first framework for sample-efficient offline learning in Markov games under general function approximation, handling all three equilibria in a unified manner. By using Bellman-consistent pessimism, we obtain interval estimates of policies' returns, and use both the upper and the lower bounds to obtain a relaxation of the gap of a candidate policy, which becomes our optimization objective. Our results generalize prior works and provide several additional insights. Importantly, we require a data coverage condition that improves over the recently proposed "unilateral concentrability". Our condition allows selective coverage of deviation policies that optimally trade off their greediness (as approximate best responses) against their coverage, and we show scenarios where this leads to significantly better guarantees. As a new connection, we also show how our algorithmic framework can subsume seemingly different solution concepts designed for the special case of two-player zero-sum games.
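To make the "relaxation of the gap" idea concrete, here is a minimal sketch in a heavily simplified setting: a one-step, two-player matrix game in which each player's payoff for every joint action is only known to lie in an interval. This is an illustration under stated assumptions, not the paper's algorithm. The paper obtains its intervals from Bellman-consistent pessimism over general function classes, whereas the per-entry intervals, the function name `relaxed_gap`, and its arguments are hypothetical choices made for this example. The sketch upper-bounds each player's best deviation return with the upper bounds and lower-bounds the candidate policy's return with the lower bounds; the difference is at least the true Nash gap whenever the intervals contain the true payoffs.

```python
import numpy as np


def relaxed_gap(lower, upper, p1, p2):
    """Upper bound on the Nash gap of the product policy (p1, p2).

    Assumptions (illustrative only, not from the paper):
    - lower[i] and upper[i] are (A1, A2) arrays with
      lower[i] <= true payoff of player i <= upper[i] entrywise;
    - p1 and p2 are probability vectors over the two players' actions.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    gaps = []
    for i in range(2):
        lo = np.asarray(lower[i], dtype=float)
        hi = np.asarray(upper[i], dtype=float)
        # Pessimistic (lower-bound) return of the candidate policy for player i.
        pessimistic_return = p1 @ lo @ p2
        # Optimistic (upper-bound) return of player i's best unilateral deviation.
        if i == 0:
            optimistic_deviation = np.max(hi @ p2)  # player 0 picks a row
        else:
            optimistic_deviation = np.max(p1 @ hi)  # player 1 picks a column
        gaps.append(optimistic_deviation - pessimistic_return)
    # The relaxed gap is the worst case over players.
    return float(max(gaps))
```

With zero-width intervals the function reduces to the exact Nash gap of the matrix game; wider intervals (less data coverage, in the offline setting) inflate the bound, so a candidate policy scores well only if its own return is well estimated and no deviation looks attractive even optimistically.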


research
05/23/2023

Zero-sum Polymatrix Markov Games: Equilibrium Collapse and Efficient Computation of Nash Equilibria

The works of (Daskalakis et al., 2009, 2022; Jin et al., 2022; Deng et a...
research
12/27/2021

Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?

We study multi-player general-sum Markov games with one of the players d...
research
10/24/2022

Offline congestion games: How feedback type affects data coverage requirement

This paper investigates when one can efficiently recover an approximate ...
research
10/08/2021

When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently?

Multi-agent reinforcement learning has made substantial empirical progre...
research
06/25/2018

Sum-of-Squares meets Nash: Optimal Lower Bounds for Finding any Equilibrium

Several works have shown unconditional hardness (via integrality gaps) o...
research
04/08/2022

The Complexity of Markov Equilibrium in Stochastic Games

We show that computing approximate stationary Markov coarse correlated e...
research
03/14/2022

Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits

An ideal strategy in zero-sum games should not only grant the player an ...
