Solving Zero-Sum One-Sided Partially Observable Stochastic Games

by Karel Horák, et al.

Many security and other real-world situations are dynamic in nature and can be modelled as strictly competitive (or zero-sum) dynamic games. In these domains, agents perform actions to affect the environment and receive observations – possibly imperfect – about the situation and the effects of the opponent's actions. Moreover, there is no limitation on the total number of actions an agent can perform – that is, there is no fixed horizon. These settings can be modelled as partially observable stochastic games (POSGs). However, solving general POSGs is computationally intractable, so we focus on a broad subclass of POSGs called one-sided POSGs. In these games, only one agent has imperfect information while their opponent has full knowledge of the current situation. We provide a full picture for solving one-sided POSGs: we (1) give a theoretical analysis of one-sided POSGs and their value functions, (2) show that a variant of a value-iteration algorithm converges in this setting, (3) adapt the heuristic search value-iteration algorithm for solving one-sided POSGs, (4) describe how to use approximate value functions to derive strategies in the game, and (5) demonstrate that our algorithm can solve one-sided POSGs of non-trivial sizes and analyze the scalability of our algorithm in three different domains: pursuit-evasion, patrolling, and search games.


HSVI can solve zero-sum Partially Observable Stochastic Games

State-of-the-art methods for solving 2-player zero-sum imperfect informa...

Adapting to game trees in zero-sum imperfect information games

Imperfect information games (IIG) are games in which each player only pa...

On Bellman's Optimality Principle for zs-POSGs

Many non-trivial sequential decision-making problems are efficiently sol...

Combining Prediction of Human Decisions with ISMCTS in Imperfect Information Games

Monte Carlo Tree Search (MCTS) has been extended to many imperfect infor...

Compact Representation of Value Function in Partially Observable Stochastic Games

Value methods for solving stochastic games with partial observability mo...

Value Functions for Depth-Limited Solving in Zero-Sum Imperfect-Information Games

Depth-limited look-ahead search is an essential tool for agents playing ...

Search in Imperfect Information Games

From the very dawn of the field, search with value functions was a funda...
