An Efficient Dynamic Sampling Policy For Monte Carlo Tree Search

04/26/2022
by   Gongbo Zhang, et al.
0

We consider the popular tree-based search strategy within the framework of reinforcement learning, the Monte Carlo Tree Search (MCTS), in the context of finite-horizon Markov decision process. We propose a dynamic sampling tree policy that efficiently allocates limited computational budget to maximize the probability of correct selection of the best action at the root node of the tree. Experimental results on Tic-Tac-Toe and Gomoku show that the proposed tree policy is more efficient than other competing methods.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/07/2020

Bayesian Optimized Monte Carlo Planning

Online solvers for partially observable Markov decision processes have d...
research
05/15/2018

Feedback-Based Tree Search for Reinforcement Learning

Inspired by recent successes of Monte-Carlo tree search (MCTS) in a numb...
research
01/30/2019

Learning Position Evaluation Functions Used in Monte Carlo Softmax Search

This paper makes two proposals for Monte Carlo Softmax Search, which is ...
research
06/08/2020

Monte Carlo Tree Search guided by Symbolic Advice for MDPs

In this paper, we consider the online computation of a strategy that aim...
research
09/13/2020

Monte Carlo Tree Search Based Tactical Maneuvering

In this paper we explore the application of simultaneous move Monte Carl...
research
09/06/2018

How to Combine Tree-Search Methods in Reinforcement Learning

Finite-horizon lookahead policies are abundantly used in Reinforcement L...
research
12/23/2019

Monte-Carlo Tree Search for Policy Optimization

Gradient-based methods are often used for policy optimization in deep re...

Please sign up or login with your details

Forgot password? Click here to reset