HSVI can solve zero-sum Partially Observable Stochastic Games

10/26/2022
by   Aurélien Delage, et al.

State-of-the-art methods for solving 2-player zero-sum imperfect information games rely on linear programming or regret minimization, though not on dynamic programming (DP) or heuristic search (HS), while the latter are often at the core of state-of-the-art solvers for other sequential decision-making problems. In partially observable or collaborative settings (e.g., POMDPs and Dec-POMDPs), DP and HS require introducing an appropriate statistic that induces a fully observable problem as well as bounding (convex) approximators of the optimal value function. This approach has succeeded in some subclasses of 2-player zero-sum partially observable stochastic games (zs-POSGs) as well, but how to apply it in the general case still remains an open question. We answer it by (i) rigorously defining an equivalent game to work with, (ii) proving mathematical properties of the optimal value function that allow deriving bounds that come with solution strategies, (iii) proposing for the first time an HSVI-like solver that provably converges to an ϵ-optimal solution in finite time, and (iv) empirically analyzing it. This opens the door to a novel family of promising approaches complementing those relying on linear programming or iterative methods.

research
10/25/2021

HSVI for zs-POSGs using Concavity, Convexity and Lipschitz Properties

Dynamic programming and heuristic search are at the core of state-of-the...
research
06/22/2016

Structure in the Value Function of Two-Player Zero-Sum Games of Incomplete Information

Zero-sum stochastic games provide a rich model for competitive decision ...
research
06/29/2020

On Bellman's Optimality Principle for zs-POSGs

Many non-trivial sequential decision-making problems are efficiently sol...
research
10/21/2020

Solving Zero-Sum One-Sided Partially Observable Stochastic Games

Many security and other real-world situations are dynamic in nature and ...
research
03/13/2019

Compact Representation of Value Function in Partially Observable Stochastic Games

Value methods for solving stochastic games with partial observability mo...
research
02/05/2020

Partially Observable Games for Secure Autonomy

Technology development efforts in autonomy and cyber-defense have been e...
research
09/28/2018

The Partially Observable Games We Play for Cyber Deception

Progressively intricate cyber infiltration mechanisms have made conventi...
