Learning Stationary Nash Equilibrium Policies in n-Player Stochastic Games with Independent Chains via Dual Mirror Descent

01/28/2022
by S. Rasoul Etesami, et al.

We consider a subclass of n-player stochastic games, in which players have their own internal state/action spaces while they are coupled through their payoff functions. It is assumed that players' internal chains are driven by independent transition probabilities. Moreover, players can receive only realizations of their payoffs, not the actual functions, and cannot observe each other's states/actions. Under some assumptions on the structure of the payoff functions, we develop efficient learning algorithms based on dual averaging and dual mirror descent, which provably converge almost surely or in expectation to the set of ϵ-Nash equilibrium policies. In particular, we derive upper bounds on the number of iterates that scale polynomially in terms of the game parameters to achieve an ϵ-Nash equilibrium policy. In addition to Markov potential games and linear-quadratic stochastic games, this work provides another subclass of n-player stochastic games that provably admit polynomial-time learning algorithms for finding their ϵ-Nash equilibrium policies.
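As a rough illustration of the dual averaging idea mentioned in the abstract, the following minimal sketch runs generic entropic dual averaging for a single player over a probability simplex. It is not the authors' algorithm: it ignores the internal state chains and noisy payoff realizations, and the names `softmax`, `dual_averaging`, `payoff_gradient`, and the `rewards` vector are hypothetical choices made only for this example.

```python
import math


def softmax(scores):
    """Entropic mirror map: send a dual score vector to a point on the simplex."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dual_averaging(payoff_gradient, dim, steps, step_size):
    """Accumulate payoff gradients in the dual space, then map back to the simplex."""
    dual = [0.0] * dim          # running sum of scaled gradients
    policy = softmax(dual)      # starts at the uniform distribution
    for _ in range(steps):
        grad = payoff_gradient(policy)
        dual = [y + step_size * g for y, g in zip(dual, grad)]
        policy = softmax(dual)
    return policy


# Hypothetical linear payoff whose gradient is constant: action 1 pays the most.
rewards = [0.2, 1.0, 0.5]
policy = dual_averaging(lambda p: rewards, dim=3, steps=200, step_size=0.1)
```

With a constant gradient, the accumulated dual scores grow linearly, so the resulting policy concentrates on the highest-paying action. In a game, each player's gradient would instead depend on the other players' policies and would only be available through sampled payoffs.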

research
05/27/2020

Stochastic Potential Games

Computing the Nash equilibrium (NE) for N-player non-zerosum stochastic ...
research
10/18/2021

Empirical Policy Optimization for n-Player Markov Games

In single-agent Markov decision processes, an agent can optimize its pol...
research
11/06/2017

Performance Analysis of Trial and Error Algorithms

Model-free decentralized optimizations and learning are receiving increa...
research
03/31/2023

Soft-Bellman Equilibrium in Affine Markov Games: Forward Solutions and Inverse Learning

Markov games model interactions among multiple players in a stochastic, ...
research
06/02/2021

Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent

Nash equilibrium is a central concept in game theory. Several Nash solve...
research
02/24/2021

Using Inverse Optimization to Learn Cost Functions in Generalized Nash Games

As demonstrated by Ratliff et al. (2014), inverse optimization can be us...
research
06/30/2021

On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging

We study the stochastic bilinear minimax optimization problem, presentin...
