Learning Policies from Human Data for Skat

05/27/2019
by Douglas Rebstock, et al.

Decision-making in large imperfect information games is difficult. Thanks to recent success in poker, Counterfactual Regret Minimization (CFR) methods have been at the forefront of research in these games. However, most of the success in large games depends on a forward model and powerful state abstractions. In trick-taking card games like Bridge or Skat, large information sets and an inability to advance the simulation without fully determinizing the state make forward search problematic. Furthermore, state abstractions can be especially difficult to construct because the precise holdings of each player directly impact move values. In this paper we explore learning model-free policies for Skat from human game data using deep neural networks (DNNs). We produce a new state-of-the-art system for bidding and game declaration by introducing methods to a) directly vary the aggressiveness of the bidder and b) declare games based on expected value while mitigating issues with rarely observed state-action pairs. Although cardplay policies learned through imitation are slightly weaker than the current best search-based method, they run orders of magnitude faster. We also explore how these policies could be learned directly from experience in a reinforcement learning setting and discuss the value of incorporating human data for this task.
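
The two bidding and declaration ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `aggressiveness` and `min_prob` parameters, and the specific thresholding rule are all assumptions made for illustration. The sketch assumes a learned imitation policy that outputs action probabilities and a separate value estimate per action; value estimates for actions that humans rarely chose are treated as unreliable and are filtered out before taking the maximum.

```python
from typing import Sequence


def should_bid(bid_prob: float, aggressiveness: float) -> bool:
    """Hypothetical rule: bid when the imitation policy's bid probability
    clears a threshold that drops as `aggressiveness` (in [0, 1]) rises."""
    return bid_prob >= 1.0 - aggressiveness


def declare_by_expected_value(
    values: Sequence[float],
    probs: Sequence[float],
    min_prob: float = 0.05,
) -> int:
    """Return the index of the declaration with the highest estimated value,
    restricted to actions the imitation policy considers plausible.

    `values[i]` is an assumed value estimate for action i, and `probs[i]` is
    the imitation policy's probability for it. Actions with probability below
    `min_prob` are skipped because their value estimates rest on few observed
    examples. If nothing passes the filter, fall back to the most probable
    action.
    """
    if len(values) != len(probs) or len(values) == 0:
        raise ValueError("values and probs must be non-empty and equal length")
    candidates = [i for i, p in enumerate(probs) if p >= min_prob]
    if not candidates:
        return max(range(len(probs)), key=lambda i: probs[i])
    return max(candidates, key=lambda i: values[i])
```

Raising `aggressiveness` makes the bidder accept hands with a lower predicted bid probability, and raising `min_prob` makes the declarer rely more on the imitation policy and less on value estimates for rarely seen declarations.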

