Toward Solving 2-TBSG Efficiently

06/09/2019
by Zeyu Jia, et al.

The two-player turn-based stochastic game (2-TBSG) is a two-player game model in which the goal is to find Nash equilibria; it is widely used in reinforcement learning and AI. Inspired by the fact that the simplex method for solving deterministic discounted Markov decision processes (MDPs) is strongly polynomial independent of the discount factor, we address the open problem of whether a similar algorithm exists for 2-TBSGs. We develop a simplex strategy iteration, in which one player updates its strategy with a simplex step and the other player then finds an optimal counterstrategy, as well as a modified simplex strategy iteration. Both belong to a class of geometrically converging algorithms. We establish the strongly polynomial property of these algorithms by considering a strategy that combines the current strategy with the equilibrium strategy. Moreover, we present a method for transforming general 2-TBSGs into special 2-TBSGs in which each state has exactly two actions.


research
04/16/2019

Method for Constructing Artificial Intelligence Player with Abstraction to Markov Decision Processes in Multiplayer Game of Mahjong

We propose a method for constructing artificial intelligence (AI) of mah...
research
07/06/2017

Efficient Strategy Iteration for Mean Payoff in Markov Decision Processes

Markov decision processes (MDPs) are standard models for probabilistic s...
research
08/29/2019

Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity

In this paper, we settle the sampling complexity of solving discounted t...
research
01/16/2013

Fast Planning in Stochastic Games

Stochastic games generalize Markov decision processes (MDPs) to a multia...
research
01/10/2022

When is Offline Two-Player Zero-Sum Markov Game Solvable?

We study what dataset assumption permits solving offline two-player zero...
research
08/24/2020

Qualitative Multi-Objective Reachability for Ordered Branching MDPs

We study qualitative multi-objective reachability problems for Ordered B...
research
07/20/2022

A Lattice-Theoretical View of Strategy Iteration

Strategy iteration is a technique frequently used for two-player games i...
