AlphaZero-Inspired General Board Game Learning and Playing

by Johannes Scheiermann, et al.

Recently, the seminal algorithms AlphaGo and AlphaZero have started a new era in game learning and deep reinforcement learning. While the achievements of AlphaGo and AlphaZero - playing Go and other complex games at superhuman level - are truly impressive, these architectures have the drawback that they are very complex and require high computational resources. Many researchers are looking for methods that are similar to AlphaZero but have lower computational demands and are thus more easily reproducible. In this paper, we pick an important element of AlphaZero - the Monte Carlo Tree Search (MCTS) planning stage - and combine it with reinforcement learning (RL) agents. We wrap MCTS for the first time around RL n-tuple networks to create versatile agents that at the same time keep the computational demands low. We apply this new architecture to several complex games (Othello, ConnectFour, Rubik's Cube) and show the advantages achieved with this AlphaZero-inspired MCTS wrapper. In particular, we show that this AlphaZero-inspired agent is the first one trained on standard hardware (no GPU or TPU) to beat the very strong Othello program Edax up to and including level 7 (where most other algorithms could only defeat Edax up to level 2).
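To make the idea of wrapping MCTS around an existing agent more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy game (single-pile Nim), a hypothetical `ToyAgent` that stands in for a trained RL agent (it is not an n-tuple network and knows nothing about the game), and an AlphaZero-style PUCT selection rule with an assumed constant `c_puct`. All class and function names are illustrative.

```python
import math

# Toy game (an assumption for illustration): Nim with a single pile.
# A state is the number of stones left, a move removes 1 to 3 stones,
# and the player who takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

class ToyAgent:
    """Hypothetical stand-in for a trained RL agent."""
    def value(self, stones):
        # Value estimate in [-1, 1] for the player to move.
        # Deliberately uninformed so the effect of the search is visible.
        return 0.0

    def priors(self, stones):
        # Uniform move probabilities over the legal moves.
        moves = legal_moves(stones)
        return {m: 1.0 / len(moves) for m in moves}

class Node:
    def __init__(self, prior):
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0   # summed values for the player to move here
        self.children = {}     # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

class MCTSWrapper:
    """Wraps any agent that offers value(state) and priors(state)."""
    def __init__(self, agent, c_puct=1.0, iterations=200):
        self.agent = agent
        self.c_puct = c_puct
        self.iterations = iterations

    def _search(self, node, stones):
        # Returns the value of this position for the player to move.
        if stones == 0:
            # The previous player took the last stone: the mover has lost.
            value = -1.0
        elif not node.children:
            # Leaf: expand with the agent's priors, evaluate with its value.
            for move, p in self.agent.priors(stones).items():
                node.children[move] = Node(p)
            value = self.agent.value(stones)
        else:
            # PUCT selection; a child's q is from the opponent's view,
            # so it is negated here.
            sqrt_n = math.sqrt(node.visits)
            def score(item):
                _, child = item
                explore = self.c_puct * child.prior * sqrt_n / (1 + child.visits)
                return -child.q() + explore
            move, child = max(node.children.items(), key=score)
            value = -self._search(child, stones - move)
        node.visits += 1
        node.value_sum += value
        return value

    def best_move(self, stones):
        root = Node(1.0)
        for _ in range(self.iterations):
            self._search(root, stones)
        # Play the most visited move, as AlphaZero-style agents commonly do.
        return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

With three stones left, `MCTSWrapper(ToyAgent()).best_move(3)` selects the move that takes all three stones, even though the wrapped agent rates every position as 0. In this sketch the search alone supplies the lookahead; a stronger wrapped agent would only need to expose the same two methods.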




