Continual Depth-limited Responses for Computing Counter-strategies in Extensive-form Games

12/23/2021
by David Milec, et al.

In real-world applications, game-theoretic algorithms often interact with imperfect opponents, and incorporating opponent models into the algorithms can significantly improve performance. Opponent exploitation approaches often use a best response or a robust response, either to compute a counter-strategy to an opponent model that is updated during game-play or to build a portfolio of exploitative strategies beforehand. However, in massive games with imperfect information, computing exact responses is intractable. Existing approaches for best-response approximation are either domain-specific or require extensive computation for every opponent model. Furthermore, no existing approach can compute robust responses in massive games. We propose using depth-limited solving with an optimal value function to approximate the best response and the restricted Nash response. Both approaches require computing the value function beforehand, but then allow the responses to be computed quickly, even against previously unseen opponents. Furthermore, we provide a utility lower bound for both approaches and a safety guarantee for the robust response. Our best-response approach can also be used to evaluate the quality of strategies computed by novel algorithms by approximating their exploitability. We empirically evaluate the approaches in terms of gain and exploitability, compare the depth-limited responses with the poker-specific local best response, and show that the robust response indeed has an upper bound on exploitability.
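
To make the terms "best response", "gain", and "exploitability" concrete, the following is a minimal sketch on a toy zero-sum matrix game (rock-paper-scissors). It is not the depth-limited method proposed in the paper, which targets large extensive-form games; the payoff matrix, the opponent model, and all function names are illustrative assumptions.

```python
# Toy illustration only: exact best response in a small zero-sum matrix game.
# This is NOT the paper's depth-limited algorithm; names and numbers are hypothetical.

# Row player's payoffs in rock-paper-scissors (zero-sum, game value 0).
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def action_values(payoff, opponent):
    """Expected payoff of each row action against a fixed column strategy."""
    return [sum(u * q for u, q in zip(row, opponent)) for row in payoff]


def best_response(payoff, opponent):
    """Return (action index, expected payoff) of a best response to the model."""
    values = action_values(payoff, opponent)
    best = max(range(len(values)), key=values.__getitem__)
    return best, values[best]


def exploitability(payoff, strategy):
    """Payoff a best-responding column player obtains against a row strategy.

    The column player's payoff is the negation of the row player's payoff.
    Because the game value is 0 here, this is also the loss relative to equilibrium.
    """
    n_cols = len(payoff[0])
    col_values = [
        -sum(s * payoff[i][j] for i, s in enumerate(strategy))
        for j in range(n_cols)
    ]
    return max(col_values)


# Hypothetical opponent model: an opponent that overplays rock.
opponent_model = [0.5, 0.25, 0.25]
br_action, br_value = best_response(PAYOFF, opponent_model)

# Gain: what the best response earns above the game value (0 in this game).
gain = br_value - 0.0
# The pure best response is itself highly exploitable if the model is wrong.
br_strategy = [1.0 if i == br_action else 0.0 for i in range(3)]
br_exploitability = exploitability(PAYOFF, br_strategy)
```

In this toy setting the best response exploits the modelled opponent but can lose the maximum amount if the model is wrong. That trade-off is what a robust response, such as the restricted Nash response mentioned in the abstract, is meant to control.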
