Hierarchical Deep Counterfactual Regret Minimization

05/27/2023
by   Jiayu Chen, et al.

Imperfect Information Games (IIGs) offer robust models for scenarios where decision-makers face uncertainty or lack complete information. Counterfactual Regret Minimization (CFR) has been one of the most successful families of algorithms for tackling IIGs. Integrating skill-based strategy learning with CFR could enhance learning performance for complex IIGs. To this end, a hierarchical strategy needs to be learnt, wherein low-level components represent specific skills and the high-level component manages the transition between skills. This hierarchical approach also enhances interpretability, helping humans pinpoint scenarios where the agent is struggling and intervene with targeted expertise. This paper introduces the first hierarchical version of Deep CFR (HDCFR), a method that boosts learning efficiency in tasks involving extremely large state spaces and deep game trees. A notable advantage of HDCFR over previous research in this field is its ability to facilitate learning with predefined (human) expertise and to foster the acquisition of transferable skills that can be applied to similar tasks. To achieve this, we first develop our algorithm in a tabular setting, encompassing hierarchical CFR updating rules and a variance-reduced Monte-Carlo sampling extension, and provide its essential theoretical guarantees. Then, to adapt our algorithm for large-scale applications, we employ neural networks as function approximators and propose deep learning objectives that align with those in the tabular setting while preserving the theoretical guarantees.
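To make the two-level structure concrete, the following is a minimal Python sketch and not the paper's HDCFR update rules. It assumes that standard regret matching (the usual CFR rule that plays each choice in proportion to its positive cumulative regret) is applied separately to a high-level choice over skills and to a low-level choice over actions within each skill. The function names and the simplification that the high-level policy ignores the previously active skill are assumptions made for illustration.

```python
def regret_matching(cum_regrets):
    """Standard regret matching: probabilities proportional to positive regret."""
    positives = [max(r, 0.0) for r in cum_regrets]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    # No positive regret anywhere: fall back to the uniform distribution.
    return [1.0 / len(cum_regrets)] * len(cum_regrets)


def hierarchical_action_probs(skill_regrets, action_regrets_per_skill):
    """Marginal action distribution: sum over skills z of pi_high(z) * pi_low(a | z).

    Hypothetical helper for illustration; assumes every skill shares the
    same action set and that skills are chosen independently of history.
    """
    skill_probs = regret_matching(skill_regrets)
    n_actions = len(action_regrets_per_skill[0])
    marginal = [0.0] * n_actions
    for z, p_z in enumerate(skill_probs):
        action_probs = regret_matching(action_regrets_per_skill[z])
        for a, p_a in enumerate(action_probs):
            marginal[a] += p_z * p_a
    return marginal
```

In this sketch the high-level distribution weights each skill's own action distribution, so the skill that contributes most to a poor action choice can be read directly from `skill_probs`, which is one way to picture the interpretability benefit described above.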
