Improving Bidding and Playing Strategies in the Trick-Taking game Wizard using Deep Q-Networks

05/27/2022
by Jonas Schumacher, et al.

In this work, the trick-taking game Wizard, which has separate bidding and playing phases, is modeled as two interleaved partially observable Markov decision processes (POMDPs). Deep Q-Networks (DQN) are used to train self-improving agents that can cope with the challenges of a highly non-stationary environment. To compare algorithms with one another, the accuracy between bid and trick count is monitored; it correlates strongly with the actual rewards and provides well-defined upper and lower performance bounds. The trained DQN agents achieve accuracies between 66 self-play, outperforming both a random baseline and a rule-based heuristic. The analysis also reveals a strong information asymmetry between player positions during bidding. To overcome the missing Markov property of imperfect-information games, a long short-term memory (LSTM) network is implemented to integrate historic information into the decision-making process. Additionally, a forward-directed tree search is conducted by sampling a state of the environment, thereby turning the game into a perfect-information setting. To our surprise, neither approach surpasses the performance of the basic DQN agent.
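The accuracy metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes that accuracy means the fraction of rounds in which a player's bid exactly equals the number of tricks won, which bounds it between 0 and 1. The reward function follows the standard Wizard scoring rules (20 points plus 10 per trick for an exact bid, otherwise minus 10 per trick of deviation); whether the paper uses exactly this reward is an assumption. The names `wizard_round_score` and `bid_accuracy` are hypothetical and do not come from the paper.

```python
from typing import Sequence


def wizard_round_score(bid: int, tricks: int) -> int:
    """Score one round under the standard Wizard rules (assumed reward).

    An exact bid earns 20 points plus 10 per trick won; a missed bid
    costs 10 points per trick of deviation.
    """
    if bid == tricks:
        return 20 + 10 * tricks
    return -10 * abs(bid - tricks)


def bid_accuracy(bids: Sequence[int], tricks: Sequence[int]) -> float:
    """Fraction of rounds in which the bid equals the tricks won.

    Assumed definition of the accuracy metric: 0.0 means no bid was
    ever met, 1.0 means every bid was met exactly.
    """
    if len(bids) != len(tricks):
        raise ValueError("bids and tricks must have the same length")
    if not bids:
        raise ValueError("at least one round is required")
    hits = sum(b == t for b, t in zip(bids, tricks))
    return hits / len(bids)


# Hypothetical example data: four rounds played by one agent.
example_bids = [1, 0, 2, 3]
example_tricks = [1, 1, 2, 0]
example_accuracy = bid_accuracy(example_bids, example_tricks)
example_scores = [
    wizard_round_score(b, t) for b, t in zip(example_bids, example_tricks)
]
```

Because a round scores positively only when the bid is met exactly, a metric of this form would be expected to track the reward closely, in line with the correlation the abstract reports.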

