ExIt-OOS: Towards Learning from Planning in Imperfect Information Games

08/30/2018
by Andy Kitchen, et al.

The current state of the art in playing many important perfect information games, including Chess and Go, combines planning and deep reinforcement learning with self-play. We extend this approach to imperfect information games and present ExIt-OOS, a novel approach to playing imperfect information games within the Expert Iteration framework, inspired by AlphaZero. We use Online Outcome Sampling (OOS), an online search algorithm for imperfect information games, in place of MCTS. During online training, our neural strategy is used to improve the accuracy of playouts in OOS, creating a learning-and-planning feedback loop for imperfect information games.
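The learning-and-planning loop described above alternates between an expert (search) step and an apprentice (imitation) step. The sketch below illustrates this Expert Iteration pattern on a deliberately tiny problem: rock-paper-scissors against a fixed biased opponent. The "expert" here is a regret-matching search with sampled outcomes, standing in for full OOS, and the "apprentice" is a probability table standing in for the neural strategy; all names and simplifications are illustrative, not from the paper.

```python
import random

# Illustrative Expert Iteration loop (a stand-in for ExIt-OOS, not the
# paper's implementation): the expert improves on the apprentice via
# outcome-sampled regret matching; the apprentice imitates the expert.

ACTIONS = ["R", "P", "S"]
OPP = {"R": 0.5, "P": 0.3, "S": 0.2}   # fixed, biased opponent policy

def payoff(a, b):
    """+1 if a beats b, -1 if b beats a, 0 on a tie."""
    if a == b:
        return 0.0
    wins = {("R", "S"), ("P", "R"), ("S", "P")}
    return 1.0 if (a, b) in wins else -1.0

def expert_search(apprentice, n_samples=2000):
    """Regret matching over sampled opponent actions, warm-started
    by the apprentice strategy (playing the role of OOS playouts)."""
    regret = {a: 0.0 for a in ACTIONS}
    # Seed the average strategy with a small prior from the apprentice.
    strat_sum = {a: apprentice[a] * n_samples * 0.1 for a in ACTIONS}
    for _ in range(n_samples):
        pos = [max(regret[a], 0.0) for a in ACTIONS]
        total = sum(pos)
        probs = [p / total for p in pos] if total > 0 else [1 / 3] * 3
        opp_a = random.choices(ACTIONS, weights=[OPP[a] for a in ACTIONS])[0]
        for a, p in zip(ACTIONS, probs):
            strat_sum[a] += p
        util = {a: payoff(a, opp_a) for a in ACTIONS}
        ev = sum(p * util[a] for a, p in zip(ACTIONS, probs))
        for a in ACTIONS:
            regret[a] += util[a] - ev
    z = sum(strat_sum.values())
    return {a: strat_sum[a] / z for a in ACTIONS}

def train(iterations=10, lr=0.5):
    """The Expert Iteration loop: plan, then imitate the planner."""
    apprentice = {a: 1 / 3 for a in ACTIONS}
    for _ in range(iterations):
        target = expert_search(apprentice)       # expert (planning) step
        for a in ACTIONS:                        # apprentice (imitation) step
            apprentice[a] += lr * (target[a] - apprentice[a])
        z = sum(apprentice.values())
        apprentice = {a: apprentice[a] / z for a in ACTIONS}
    return apprentice

if __name__ == "__main__":
    # Against this opponent the apprentice should concentrate on "P",
    # which beats the opponent's most common action "R".
    print(train())
```

In the full method the apprentice is a neural network and the expert is OOS run from the current information state, but the division of labor is the same: search produces improved targets, and the learned strategy in turn sharpens the search's playouts.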

