Unlocking the Potential of Deep Counterfactual Value Networks

07/20/2020
by Ryan Zarick, et al.

Deep counterfactual value networks combined with continual resolving provide a way to conduct depth-limited search in imperfect-information games. However, since their introduction in the DeepStack poker AI, deep counterfactual value networks have not seen widespread adoption. In this paper, we introduce several improvements to deep counterfactual value networks, as well as to counterfactual regret minimization, and analyze the effect of each change. We combine these improvements to create the poker AI Supremus. We show that while a reimplementation of DeepStack loses head-to-head against the strong benchmark agent Slumbot, Supremus beats Slumbot by an extremely large margin and also achieves lower exploitability than DeepStack against a local best response. Together, these results show that, with our key improvements, deep counterfactual value networks can achieve state-of-the-art performance.

