Minimax Sample Complexity for Turn-based Stochastic Game

11/29/2020
by   Qiwen Cui, et al.

The empirical success of multi-agent reinforcement learning is encouraging, but few theoretical guarantees are known. In this work, we prove that the plug-in solver approach, probably the most natural reinforcement learning algorithm, achieves minimax sample complexity for turn-based stochastic games (TBSGs). Specifically, we plan in an empirical TBSG built with a "simulator" that allows sampling from any state-action pair. We show that the empirical Nash equilibrium strategy is an approximate Nash equilibrium strategy in the true TBSG, and we give both problem-dependent and problem-independent bounds. We develop absorbing TBSG and reward perturbation techniques to tackle the complex statistical dependence. The key idea is to artificially introduce a suboptimality gap in the TBSG, so that the Nash equilibrium strategy lies in a finite set.
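The plug-in approach described in the abstract can be illustrated with a minimal sketch: draw a fixed number of next-state samples from the simulator for every state-action pair, form the empirical transition model, and plan in that empirical game. The sketch assumes a two-player zero-sum, discounted TBSG with known rewards, and it uses plain value iteration as the planner; the abstract does not specify the planner, and the names `build_empirical_model`, `solve_tbsg`, `sample_next_state`, and `is_max_state` are hypothetical, introduced only for illustration.

```python
import numpy as np


def build_empirical_model(sample_next_state, n_states, n_actions, n_samples, rng):
    """Estimate P_hat[s, a, s'] from n_samples simulator draws per (s, a)."""
    p_hat = np.zeros((n_states, n_actions, n_states))
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(n_samples):
                p_hat[s, a, sample_next_state(s, a, rng)] += 1.0
    return p_hat / n_samples


def solve_tbsg(p, reward, is_max_state, gamma, tol=1e-10, max_iter=100_000):
    """Value iteration for a zero-sum TBSG (assumed planner, not the paper's).

    Each state is controlled by exactly one player: the maximizer where
    is_max_state[s] is True, the minimizer elsewhere.
    """
    v = np.zeros(p.shape[0])
    for _ in range(max_iter):
        q = reward + gamma * (p @ v)  # shape (n_states, n_actions)
        v_new = np.where(is_max_state, q.max(axis=1), q.min(axis=1))
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    q = reward + gamma * (p @ v)
    policy = np.where(is_max_state, q.argmax(axis=1), q.argmin(axis=1))
    return v, policy


# Toy instance (all numbers below are illustrative assumptions).
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 3, 2, 0.5
true_p = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
reward = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
is_max_state = np.array([True, False, True])


def sample_next_state(s, a, rng):
    """Simulator: one next-state draw from the true (hidden) transition law."""
    return rng.choice(n_states, p=true_p[s, a])


p_hat = build_empirical_model(sample_next_state, n_states, n_actions, 2000, rng)
v_hat, pi_hat = solve_tbsg(p_hat, reward, is_max_state, gamma)   # empirical game
v_true, pi_true = solve_tbsg(true_p, reward, is_max_state, gamma)  # true game
print("max value error:", np.max(np.abs(v_hat - v_true)))
```

The strategy `pi_hat` is computed only from sampled data, which is what makes the approach "plug-in": any planner for the empirical game could replace the value iteration used here.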

research
06/07/2019

Mixed Strategy Game Model Against Data Poisoning Attacks

In this paper we use game theory to model poisoning attack scenarios. We...
research
05/27/2023

Reinforcement Learning With Reward Machines in Stochastic Games

We investigate multi-agent reinforcement learning for stochastic games w...
research
03/28/2019

A Stay-in-a-Set Game without a Stationary Equilibrium

We give an example of a finite-state two-player turn-based stochastic ga...
research
01/01/2019

A Theoretical Analysis of Deep Q-Learning

Despite the great empirical success of deep reinforcement learning, its ...
research
03/01/2023

Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation

Nash Q-learning may be considered one of the first and most known algori...
research
04/01/2021

Back to Square One: Superhuman Performance in Chutes and Ladders Through Deep Neural Networks and Tree Search

We present AlphaChute: a state-of-the-art algorithm that achieves superh...
research
11/17/2020

Allocating marketing resources over social networks: A long-term analysis

In this paper, we consider a network of consumers who are under the comb...
