Monte Carlo Tree Search with Sampled Information Relaxation Dual Bounds

04/20/2017
by Daniel R. Jiang, et al.

Monte Carlo Tree Search (MCTS), most famously used in game-play artificial intelligence (e.g., the game of Go), is a well-known strategy for constructing approximate solutions to sequential decision problems. Its primary innovation is the use of a heuristic, known as a default policy, to obtain Monte Carlo estimates of downstream values for states in a decision tree. This information is used to iteratively expand the tree towards regions of states and actions that an optimal policy might visit. However, to guarantee convergence to the optimal action, MCTS requires the entire tree to be expanded asymptotically. In this paper, we propose a new technique called Primal-Dual MCTS that utilizes sampled information relaxation upper bounds on potential actions, creating the possibility of "ignoring" parts of the tree that stem from highly suboptimal choices. This allows us to prove that despite converging to a partial decision tree in the limit, the recommended action from Primal-Dual MCTS is optimal. The new approach shows significant promise when used to optimize the behavior of a single driver navigating a graph while operating on a ride-sharing platform. Numerical experiments on a real dataset of 7,000 trips in New Jersey suggest that Primal-Dual MCTS improves upon standard MCTS by producing deeper decision trees and exhibits a reduced sensitivity to the size of the action space.
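
To make the primal-dual idea concrete, the following is a minimal sketch on a toy problem; it is not the paper's algorithm or its ride-sharing model. It assumes a hypothetical "accept one offer over a short horizon" problem, in which a simple threshold rule plays the role of the default policy (a lower-bound estimate) and a zero-penalty perfect-information relaxation gives a sampled upper bound. All names (`sample_rewards`, `default_policy_value`, `relaxed_value`, `estimate_bounds`, `worth_expanding`), the horizon, and the threshold are illustrative assumptions.

```python
import random

HORIZON = 5  # assumed number of remaining decision periods (toy value)


def sample_rewards(rng, horizon=HORIZON):
    """Draw one sampled future: a reward offer in [0, 1) for each period."""
    return [rng.random() for _ in range(horizon)]


def default_policy_value(rewards, threshold=0.6):
    """Primal sample: accept the first offer at or above the threshold;
    if none qualifies before the last period, take the last offer."""
    for r in rewards[:-1]:
        if r >= threshold:
            return r
    return rewards[-1]


def relaxed_value(rewards):
    """Dual sample: with the whole future revealed, take the best offer.
    This uses a zero penalty, so it is a valid but possibly loose bound."""
    return max(rewards)


def estimate_bounds(n_samples=2000, seed=0):
    """Average both quantities over the same sampled futures.
    Returns (lower_estimate, upper_estimate) for the toy problem."""
    rng = random.Random(seed)
    lower = upper = 0.0
    for _ in range(n_samples):
        rewards = sample_rewards(rng)
        lower += default_policy_value(rewards)
        upper += relaxed_value(rewards)
    return lower / n_samples, upper / n_samples


def worth_expanding(upper_bound, incumbent_value):
    """Hypothetical expansion test: an unexpanded action is worth adding to
    the tree only if its sampled upper bound beats the best value so far."""
    return upper_bound > incumbent_value


if __name__ == "__main__":
    lo, hi = estimate_bounds()
    print(f"default-policy estimate: {lo:.3f}, relaxed upper bound: {hi:.3f}")
```

Because the relaxed value is at least the default-policy value on every sampled future, the averaged upper estimate never falls below the lower one. In this sketch, an action whose upper bound is no better than the incumbent value is left unexpanded, which mirrors the idea of "ignoring" parts of the tree that stem from highly suboptimal choices.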

