Approximate exploitability: Learning a best response in large games

04/20/2020
by Finbarr Timbers, et al.

A common metric in games of imperfect information is exploitability, i.e., how a policy performs against its worst-case opponent. This metric has many desirable properties, but it is intractable to compute in large games because calculating a best response to the given policy requires a full search of the game tree. We introduce a new metric, approximate exploitability, which computes an analogue of exploitability using an approximate best response. The method scales to large games with tractable belief spaces. We focus only on the two-player, zero-sum case. We also provide empirical results for a specific instance of the method. In small games, it converges to exploitability in both the tabular and the function approximation settings; in large games, it consistently finds exploits for weak policies, with results on Chess, Go, Heads-up No Limit Texas Hold'em, and other games.
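To make the distinction between exploitability and its approximation concrete, here is a minimal sketch on rock-paper-scissors, a toy matrix game chosen purely for illustration. It is not the method from the paper: the sampling-based response search, the function names, and the payoff table are all assumptions introduced here. The exact version enumerates every opponent action; the approximate version picks a response from noisy sampled estimates and then scores that response exactly, so in this sketch it can never exceed the exact value.

```python
import random

# Row player's payoffs in rock-paper-scissors (illustrative toy game).
# Rows and columns are ordered: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def response_value(policy, column, payoff=PAYOFF):
    """Expected payoff to the row policy when the opponent plays `column`."""
    return sum(p * row[column] for p, row in zip(policy, payoff))


def exact_exploitability(policy, payoff=PAYOFF, game_value=0.0):
    """Game value minus the policy's payoff against an exact best response."""
    n_cols = len(payoff[0])
    worst = min(response_value(policy, j, payoff) for j in range(n_cols))
    return game_value - worst


def approximate_exploitability(policy, payoff=PAYOFF, game_value=0.0,
                               n_samples=2000, seed=0):
    """Same quantity, but the response is chosen from sampled estimates.

    Hypothetical stand-in for a learned best response: each opponent
    action is scored by sampling the policy, and the action that looks
    best for the opponent is then evaluated exactly.
    """
    rng = random.Random(seed)
    actions = list(range(len(policy)))
    n_cols = len(payoff[0])
    estimates = []
    for j in range(n_cols):
        draws = rng.choices(actions, weights=policy, k=n_samples)
        estimates.append(sum(payoff[a][j] for a in draws) / n_samples)
    chosen = min(range(n_cols), key=lambda j: estimates[j])
    return game_value - response_value(policy, chosen, payoff)


if __name__ == "__main__":
    biased = [0.5, 0.3, 0.2]  # plays rock too often
    print("exact:      ", exact_exploitability(biased))
    print("approximate:", approximate_exploitability(biased))
```

In a game this small the two values will usually coincide; the gap only matters when enumerating the opponent's responses is too expensive, which is the large-game setting the abstract describes.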


