Toward Solving 2-TBSG Efficiently

by Zeyu Jia, et al.
Peking University
Stanford University

The two-player turn-based stochastic game (2-TBSG) is a two-player game model in which the goal is to find Nash equilibria; it is widely used in reinforcement learning and AI. The simplex method for solving deterministic discounted Markov decision processes (MDPs) is known to be strongly polynomial, independent of the discount factor. Inspired by this fact, we address the open problem of whether a similar algorithm exists for 2-TBSG. We develop a simplex strategy iteration, in which one player updates its strategy with a simplex step while the other player in turn finds an optimal counterstrategy, as well as a modified simplex strategy iteration. Both belong to a class of geometrically converging algorithms. We establish the strongly polynomial property of these algorithms by considering a strategy that combines the current strategy with the equilibrium strategy. Moreover, we present a method to transform general 2-TBSGs into special 2-TBSGs in which each state has exactly two actions.
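The alternation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's exact algorithm: the convention that player 1 maximizes and player 2 minimizes, the largest-gain (Dantzig-style) pivoting rule, the use of policy iteration for the counterstrategy, and all function names and the toy game are assumptions made for illustration.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def evaluate(P, r, gamma, policy):
    """Value of a joint strategy: solve (I - gamma * P_pi) v = r_pi."""
    n = len(policy)
    A = [[(1.0 if s == t else 0.0) - gamma * P[s][policy[s]][t] for t in range(n)]
         for s in range(n)]
    return solve(A, [r[s][policy[s]] for s in range(n)])

def q_values(P, r, gamma, v, s):
    """One-step lookahead value of every action available in state s."""
    return [r[s][a] + gamma * sum(p * x for p, x in zip(P[s][a], v))
            for a in range(len(r[s]))]

def best_response(P, r, gamma, policy, min_states, tol=1e-10):
    """Assumed counterstrategy step: player 2 (minimizer) runs policy
    iteration while player 1's actions stay fixed."""
    policy = list(policy)
    while True:
        v = evaluate(P, r, gamma, policy)
        changed = False
        for s in min_states:
            q = q_values(P, r, gamma, v, s)
            a = min(range(len(q)), key=q.__getitem__)
            if q[a] < v[s] - tol:
                policy[s], changed = a, True
        if not changed:
            return policy, v

def simplex_strategy_iteration(P, r, gamma, max_states, min_states, tol=1e-10):
    """Player 1 (maximizer) switches ONE action per round, the one with the
    largest gain; player 2 then recomputes an optimal counterstrategy."""
    policy = [0] * len(P)
    while True:
        policy, v = best_response(P, r, gamma, policy, min_states, tol)
        best_gain, best_switch = tol, None
        for s in max_states:
            q = q_values(P, r, gamma, v, s)
            a = max(range(len(q)), key=q.__getitem__)
            if q[a] - v[s] > best_gain:
                best_gain, best_switch = q[a] - v[s], (s, a)
        if best_switch is None:  # no improving switch left for player 1
            return policy, v
        policy[best_switch[0]] = best_switch[1]

# Hypothetical toy game: P[s][a] is a transition distribution, r[s][a] a reward.
P = [[[0, 1, 0], [0, 0, 1]],   # state 0, owned by player 1
     [[1, 0, 0], [0, 0, 1]],   # state 1, owned by player 2
     [[0, 0, 1]]]              # state 2, absorbing
r = [[1.0, 0.5], [2.0, 0.0], [0.0]]
policy, v = simplex_strategy_iteration(P, r, 0.9, max_states=[0, 2], min_states=[1])
```

In this toy game player 2 escapes to the absorbing state, after which player 1 has no improving switch, so the loop stops after a single round. The sketch only illustrates the alternation of a single simplex step and a counterstrategy computation; it makes no claim about the strongly polynomial bound established in the paper.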




Efficient Strategy Iteration for Mean Payoff in Markov Decision Processes

Markov decision processes (MDPs) are standard models for probabilistic s...

Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity

In this paper, we settle the sampling complexity of solving discounted t...

Fast Planning in Stochastic Games

Stochastic games generalize Markov decision processes (MDPs) to a multia...

When is Offline Two-Player Zero-Sum Markov Game Solvable?

We study what dataset assumption permits solving offline two-player zero...

Qualitative Multi-Objective Reachability for Ordered Branching MDPs

We study qualitative multi-objective reachability problems for Ordered B...

A Lattice-Theoretical View of Strategy Iteration

Strategy iteration is a technique frequently used for two-player games i...
