Biasing MCTS with Features for General Games

03/21/2019
by Dennis J. N. J. Soemers, et al.

This paper proposes using a linear function approximator, rather than a deep neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for general games. This is unlikely to match the potential raw playing strength of DNNs, but has advantages in terms of generality, interpretability, and the resources (time and hardware) required for training. Features describing local patterns are used as inputs. The features are formulated so that they are easily interpretable, applicable to a wide range of general games, and able to encode simple local strategies. We gradually create new features during the same self-play training process used to learn feature weights. We evaluate the playing strength of an MCTS player biased by learnt features against a standard upper confidence bounds for trees (UCT) player in multiple board games, and demonstrate significantly improved playing strength in the majority of them after a small number of self-play training games.
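As a rough illustration of what biasing MCTS with a linear function approximator could look like, the sketch below turns per-move feature vectors into a softmax policy and uses it as a prior in a PUCT-style selection rule. The abstract does not state how the bias enters the search, so the selection formula, the function names (`linear_softmax_policy`, `puct_scores`, `select_move`), and the `c_puct` parameter are all assumptions made for illustration rather than the authors' method.

```python
import math


def linear_softmax_policy(feature_vectors, weights):
    """Map each move's feature vector to a probability via softmax(w . phi)."""
    logits = [sum(w * x for w, x in zip(weights, phi)) for phi in feature_vectors]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def puct_scores(priors, visit_counts, mean_values, c_puct=1.0):
    """Assumed PUCT-style score: mean value plus a prior-weighted exploration bonus."""
    sqrt_parent = math.sqrt(max(sum(visit_counts), 1))
    return [
        q + c_puct * p * sqrt_parent / (1 + n)
        for p, n, q in zip(priors, visit_counts, mean_values)
    ]


def select_move(feature_vectors, weights, visit_counts, mean_values, c_puct=1.0):
    """Pick the index of the child with the highest biased score."""
    priors = linear_softmax_policy(feature_vectors, weights)
    scores = puct_scores(priors, visit_counts, mean_values, c_puct)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical example: three legal moves, two binary pattern features.
features = [[1, 0], [0, 1], [0, 0]]
weights = [2.0, 0.0]  # the first pattern is considered promising
first_choice = select_move(features, weights, [0, 0, 0], [0.0, 0.0, 0.0])
```

With no visits yet, the move whose active feature carries the largest weight is tried first. As its visit count grows without a better mean value, the exploration bonus shrinks and the search shifts to the other moves, so the learnt weights guide the search without overriding it.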


research
01/17/2022

Spatial State-Action Features for General Games

In many board games and other abstract games, patterns have been used as...
research
11/11/2021

AlphaDDA: game artificial intelligence with dynamic difficulty adjustment using AlphaZero

An artificial intelligence (AI) player has obtained superhuman skill for...
research
03/13/2020

Accelerating and Improving AlphaZero Using Population Based Training

AlphaZero has been very successful in many games. Unfortunately, it stil...
research
08/19/2019

Learning to play the Chess Variant Crazyhouse above World Champion Level with Deep Neural Networks and Human Data

Deep neural networks have been successfully applied in learning the boar...
research
12/14/2021

Split Moves for Monte-Carlo Tree Search

In many games, moves consist of several decisions made by the player. Th...
research
01/17/2021

Solving QSAT problems with neural MCTS

Recent achievements from AlphaZero using self-play has shown remarkable ...
research
05/30/2017

Multi-Labelled Value Networks for Computer Go

This paper proposes a new approach to a novel value network architecture...
