Monte-Carlo Tree Search as Regularized Policy Optimization

07/24/2020
by Jean-Bastien Grill, et al.

The combination of Monte-Carlo tree search (MCTS) with deep reinforcement learning has led to significant advances in artificial intelligence. However, AlphaZero, the current state-of-the-art MCTS algorithm, still relies on handcrafted heuristics that are only partially understood. In this paper, we show that AlphaZero's search heuristics, along with other common ones such as UCT, are an approximation to the solution of a specific regularized policy optimization problem. With this insight, we propose a variant of AlphaZero which uses the exact solution to this policy optimization problem, and show experimentally that it reliably outperforms the original algorithm in multiple domains.
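The abstract does not spell out the regularized policy optimization problem, so the following is only a minimal sketch under a stated assumption: the objective is taken to be q·y − λ·KL(prior ‖ y) over probability vectors y, where q holds the action-value estimates at a node, prior is the network policy, and λ > 0 is a regularization weight. Under that assumption the maximizer has the form y[a] = λ·prior[a] / (α − q[a]), with α chosen so that y sums to 1, and α can be found by bisection. The function name `regularized_policy` and all parameter names are hypothetical.

```python
def regularized_policy(q, prior, lam, iters=100):
    """Maximize sum(q[a]*y[a]) - lam * KL(prior || y) over the simplex.

    ASSUMPTION: this objective is one illustrative choice of regularized
    policy optimization problem; it is not quoted from the abstract.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if len(q) != len(prior) or any(p <= 0 for p in prior):
        raise ValueError("q and prior must match, and prior must be positive")

    def total(alpha):
        # Sum of the candidate policy for a given normalizer alpha.
        return sum(lam * p / (alpha - qa) for qa, p in zip(q, prior))

    # Bracket alpha: total(lo) >= 1 and total(hi) <= 1.
    lo = max(qa + lam * p for qa, p in zip(q, prior))
    hi = max(q) + lam
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid  # mass too large, so alpha must increase
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)

    y = [lam * p / (alpha - qa) for qa, p in zip(q, prior)]
    s = sum(y)
    return [v / s for v in y]  # renormalize away residual rounding error


if __name__ == "__main__":
    print(regularized_policy([1.0, 0.0, 0.5], [0.2, 0.5, 0.3], lam=0.5))
```

With a large λ the result stays close to the prior, and with a small λ it concentrates on the action with the highest q. This mirrors the trade-off between the prior and the value estimates that search heuristics balance.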

Code Repositories

AlphaGPU

AlphaZero on GPU thanks to CUDA.jl

othello-nnue

An NNUE Othello engine

synthesis

A Rust implementation of the AlphaZero algorithm