Optimize Neural Fictitious Self-Play in Regret Minimization Thinking

04/22/2021
by   Yuxuan Chen, et al.

Optimizing deep learning algorithms to approach a Nash equilibrium remains a significant problem in imperfect-information games such as StarCraft and poker. Neural Fictitious Self-Play (NFSP) provides an effective way to learn an approximate Nash equilibrium in imperfect-information games without prior domain knowledge. However, NFSP leaves its optimality gap as an open optimization problem, and closing this gap could improve NFSP's performance. In this study, focusing on the optimality gap of NFSP, we propose a new method that replaces NFSP's best-response computation with regret matching. The new algorithm drives the optimality gap to zero as it iterates and therefore converges faster than the original NFSP. We conducted experiments in three typical OpenSpiel environments covering perfect-information and imperfect-information games, and in all of them our new algorithm performed better than the original NFSP.
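The abstract does not spell out the algorithm, so the following is only a minimal sketch of plain tabular regret matching, the building block the abstract names, and not the authors' NFSP variant. It assumes a hypothetical 2x2 zero-sum matrix game; the payoff values and the function names `regret_matching_policy`, `self_play`, and `best_response_gap` are illustrative and do not come from the paper or from OpenSpiel.

```python
import numpy as np


def regret_matching_policy(cumulative_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(cumulative_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No action has positive regret yet: fall back to the uniform policy.
    return np.full(len(cumulative_regret), 1.0 / len(cumulative_regret))


def self_play(payoff, iterations=10000):
    """Run regret matching for both players of a zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player gets -payoff[i, j].
    Returns the time-averaged policies of the row and column player.
    """
    n_rows, n_cols = payoff.shape
    regret_row, regret_col = np.zeros(n_rows), np.zeros(n_cols)
    avg_row, avg_col = np.zeros(n_rows), np.zeros(n_cols)
    for _ in range(iterations):
        p_row = regret_matching_policy(regret_row)
        p_col = regret_matching_policy(regret_col)
        avg_row += p_row
        avg_col += p_col
        # Expected value of each pure action against the opponent's current policy.
        u_row = payoff @ p_col
        u_col = -(p_row @ payoff)
        # Regret: how much better each action would have done than the current policy.
        regret_row += u_row - p_row @ u_row
        regret_col += u_col - p_col @ u_col
    return avg_row / iterations, avg_col / iterations


def best_response_gap(payoff, p_row, p_col):
    """Sum of what each player could gain by best-responding (zero at equilibrium)."""
    best_row = (payoff @ p_col).max()
    best_col = (-(p_row @ payoff)).max()
    return best_row + best_col


# Hypothetical asymmetric zero-sum game, chosen only for illustration.
PAYOFF = np.array([[2.0, -1.0], [-1.0, 1.0]])
```

Calling `self_play(PAYOFF)` and passing the two averaged policies to `best_response_gap` gives a value that shrinks toward zero as `iterations` grows, which is the kind of gap-closing behaviour the abstract attributes to regret matching.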

research
03/22/2019

Monte Carlo Neural Fictitious Self-Play: Approach to Approximate Nash equilibrium of Imperfect-Information Games

Researchers on artificial intelligence have achieved human-level intelli...
research
03/22/2019

Monte Carlo Neural Fictitious Self-Play: Achieve Approximate Nash equilibrium of Imperfect-Information Games

Researchers on artificial intelligence have achieved human-level intelli...
research
03/13/2019

Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent

In this paper, we present exploitability descent, a new algorithm to com...
research
09/12/2016

Reduced Space and Faster Convergence in Imperfect-Information Games via Regret-Based Pruning

Counterfactual Regret Minimization (CFR) is the most popular iterative a...
research
06/18/2020

DREAM: Deep Regret minimization with Advantage baselines and Model-free learning

We introduce DREAM, a deep reinforcement learning algorithm that finds o...
research
11/28/2014

Solving Games with Functional Regret Estimation

We propose a novel online learning method for minimizing regret in large...
research
10/15/2021

Combining Counterfactual Regret Minimization with Information Gain to Solve Extensive Games with Imperfect Information

Counterfactual regret Minimization (CFR) is an effective algorithm for s...
