Supervised and Reinforcement Learning from Observations in Reconnaissance Blind Chess

08/03/2022
by   Timo Bertram, et al.

In this work, we adapt a training approach inspired by the original AlphaGo system to play the imperfect-information game of Reconnaissance Blind Chess (RBC). Using only the agent's observations rather than a full description of the game state, we first train a supervised agent on publicly available game records. We then improve the agent through self-play with the on-policy reinforcement learning algorithm Proximal Policy Optimization (PPO). To avoid the problems that partial observability of the game state causes for search, we use no search at all and generate moves from the policy network alone. With this approach, we achieve an Elo rating of 1330 on the RBC leaderboard, which places our agent 27th at the time of writing. Self-play significantly improves performance, and the agent plays acceptably well without search and without making assumptions about the true game state.
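To make the self-play step more concrete, the following is a minimal sketch of the clipped surrogate loss that PPO optimizes, written in plain Python. It is not the authors' implementation: the function name `ppo_clip_loss`, its argument names, and the clipping value `clip_eps=0.2` are assumptions chosen for illustration, and the abstract does not state which hyperparameters or advantage estimator were used.

```python
import math


def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped-surrogate PPO loss (to be minimized) over sampled moves.

    new_logps:  log-probabilities of the played moves under the current policy
    old_logps:  log-probabilities of the same moves under the policy that
                generated the self-play games
    advantages: advantage estimate for each move (how it is estimated is
                left open here; this is an assumption of the sketch)
    clip_eps:   clipping range; 0.2 is a common default, assumed here
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # PPO maximizes the surrogate, so the loss is its negative mean.
    return -total / len(advantages)
```

Because the ratio is clipped, a single batch of self-play games cannot push the policy far from the one that generated them, which is one reason PPO is a common choice for fine-tuning a policy that was pre-trained by supervised learning.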


