Continual Depth-limited Responses for Computing Counter-strategies in Extensive-form Games

12/23/2021
by David Milec, et al.

In real-world applications, game-theoretic algorithms often interact with imperfect opponents, and incorporating opponent models into the algorithms can significantly improve performance. Opponent exploitation approaches often use the best response or a robust response, either to compute a counter-strategy to an opponent model that is updated during gameplay or to build a portfolio of exploitative strategies beforehand. However, in massive games with imperfect information, computing exact responses is intractable. Existing approaches for best response approximation are either domain-specific or require extensive computation for every opponent model. Furthermore, there is no approach that can compute robust responses in massive games. We propose using depth-limited solving with an optimal value function to approximate the best response and the restricted Nash response. Both approaches require computing the value function beforehand, but then allow computing the responses quickly, even against previously unseen opponents. Furthermore, we provide a utility lower bound for both approaches and a safety guarantee for the robust response. Our best response approach can also be used to evaluate the quality of strategies computed by novel algorithms by approximating their exploitability. We empirically evaluate the approaches in terms of gain and exploitability, compare the depth-limited responses with the poker-specific local best response, and show that the robust response indeed has an upper bound on exploitability.
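To make the terms gain and exploitability concrete, the following is a minimal sketch on a toy normal-form game (rock-paper-scissors). It is not the depth-limited method described in the abstract, which targets extensive-form games; the payoff matrix, the opponent model, and the function names `best_response_value` and `exploitability` are assumptions chosen only for illustration. The game value is assumed to be 0, which holds for this symmetric zero-sum game.

```python
from fractions import Fraction

# Row player's payoffs in rock-paper-scissors (zero-sum, so the column
# player receives the negative). Actions are ordered rock, paper, scissors.
# This toy game is an illustrative assumption, not taken from the paper.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response_value(payoff, opponent):
    """Best pure row action and its expected payoff against a fixed
    column mixed strategy (the opponent model)."""
    values = [sum(p * q for p, q in zip(row, opponent)) for row in payoff]
    best = max(range(len(values)), key=values.__getitem__)
    return best, values[best]


def exploitability(payoff, strategy):
    """Expected payoff a best-responding column player obtains against the
    row player's mixed strategy. The game value is 0 here, so this is also
    the loss relative to equilibrium play."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    col_values = [
        -sum(strategy[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]
    return max(col_values)


# Hypothetical opponent model: plays rock too often.
opponent_model = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
action, gain = best_response_value(PAYOFF, opponent_model)

pure_best_response = [1 if i == action else 0 for i in range(3)]
uniform = [Fraction(1, 3)] * 3

print(action, gain)                                # 1 1/4
print(exploitability(PAYOFF, pure_best_response))  # 1
print(exploitability(PAYOFF, uniform))             # 0
```

The pure best response (paper) gains 1/4 against this model but can itself be exploited for 1, whereas the uniform equilibrium strategy gains nothing and cannot be exploited. A robust response trades off between these two extremes.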


research
05/31/2019

Value Functions for Depth-Limited Solving in Zero-Sum Imperfect-Information Games

Depth-limited look-ahead search is an essential tool for agents playing ...
research
05/21/2018

Depth-Limited Solving for Imperfect-Information Games

A fundamental challenge in imperfect-information games is that states do...
research
04/20/2020

Approximate exploitability: Learning a best response in large games

A common metric in games of imperfect information is exploitability, i.e...
research
10/13/2022

Competition among Parallel Contests

We investigate the model of multiple contests held in parallel, where ea...
research
03/10/2016

Bayesian Opponent Exploitation in Imperfect-Information Games

Two fundamental problems in computational game theory are computing a Na...
research
02/15/2022

NeuPL: Neural Population Learning

Learning in strategy games (e.g. StarCraft, poker) requires the discover...
research
09/30/2020

Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games

Solution concepts of traditional game theory assume entirely rational pl...
