Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits

03/14/2022
by Qinghua Liu, et al.

An ideal strategy in zero-sum games should not only guarantee the player an average reward no less than the value of the Nash equilibrium, but also exploit (adaptive) opponents when they are suboptimal. While most existing work on Markov games focuses exclusively on the former objective, it remains open whether both objectives can be achieved simultaneously. To address this problem, this work studies no-regret learning in Markov games with adversarial opponents, where the learner competes against the best fixed policy in hindsight. Along this direction, we present a new, complete set of positive and negative results: When the policies of the opponents are revealed at the end of each episode, we propose new efficient algorithms achieving √(K)-regret bounds when either (1) the baseline policy class is small or (2) the opponent's policy class is small. This is complemented with an exponential lower bound when neither condition holds. When the policies of the opponents are not revealed, we prove a statistical hardness result even in the most favorable scenario where both of the above conditions hold. Our hardness result is much stronger than existing hardness results, which either involve only computational hardness or require further restrictions on the algorithms.
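To make the benchmark concrete: "competing against the best fixed policy in hindsight" is the classic no-regret criterion. The sketch below is not the paper's algorithm; it is a minimal illustration of the idea in the simplest possible setting (a single-state game with a small action set), using the standard Hedge / exponential-weights method, which attains O(√K) regret over K rounds against any loss sequence.

```python
import math
import random

def hedge_regret(K, n_actions=2, seed=0):
    """Run Hedge (exponential weights) against an adversarial loss
    sequence and return the regret relative to the best fixed action
    in hindsight. Illustrative toy only: the paper's setting is Markov
    games with policy classes, not a single-state game."""
    rng = random.Random(seed)
    eta = math.sqrt(math.log(n_actions) / K)  # standard learning rate
    weights = [1.0] * n_actions
    cum_loss = [0.0] * n_actions  # cumulative loss of each fixed action
    learner_loss = 0.0
    for _ in range(K):
        total = sum(weights)
        probs = [w / total for w in weights]
        # the opponent picks this round's losses (here: random;
        # the guarantee holds even for adaptive choices)
        losses = [rng.random() for _ in range(n_actions)]
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        for a in range(n_actions):
            cum_loss[a] += losses[a]
            weights[a] *= math.exp(-eta * losses[a])
    # regret against the best fixed action in hindsight
    return learner_loss - min(cum_loss)

print(hedge_regret(10000))
```

The paper's question is when such √K guarantees extend from this trivial setting to Markov games, where the "actions" are full policies and the opponent influences the state dynamics.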



Related research

- Near-Optimal Reinforcement Learning with Self-Play (06/22/2020)
- Off-Policy Exploitability-Evaluation and Equilibrium-Learning in Two-Player Zero-Sum Markov Games (07/04/2020)
- Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games (06/03/2022)
- Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games (03/22/2023)
- Multi-Player Zero-Sum Markov Games with Networked Separable Interactions (07/13/2023)
- Offline Learning in Markov Games with General Function Approximation (02/06/2023)
- Tree Polymatrix Games are PPAD-hard (02/27/2020)
