CH-Go: Online Go System Based on Chunk Data Storage

03/22/2023
by H. Lu, et al.

Training and running an online Go system requires an effective data management system to handle large volumes of data, such as the initial Go game records, the feature data set obtained by representation learning, the experience data set from self-play, and the randomly sampled Monte Carlo tree. Previous work has rarely addressed this problem, yet the capability and efficiency of the data management system determine the accuracy and speed of the Go system. To tackle this issue, we propose an online Go game system based on the chunk data storage method (CH-Go), which processes the 160k Go game records released by Kiseido Go Server (KGS) and includes a Go encoder with 11 planes, a parallel processor, and a generator for better memory performance. Specifically, we store the data in chunks, take a chunk size of 1024 as a batch, and save the features and labels of each chunk as binary files. A small set of data is then randomly sampled each time for neural network training and accessed batch by batch through a generator that uses yield. The training part of the prototype includes three modules: a supervised learning module, a reinforcement learning module, and an online module. First, we apply Zobrist-guided hash coding to speed up Go board construction. Then we train a supervised learning policy network on the 160k Go game records released by KGS to initialize self-play for the generation of experience data. Finally, we conduct reinforcement learning based on the REINFORCE algorithm. Experiments show that the training accuracy of CH-Go on the sampled 150 games is 99.14%, and the accuracy on the test set is as high as 98.82%. Under limited local computing power and time, we have achieved a better level of intelligence. Given that classical systems such as GOLAXY are not free and open, CH-Go has realized and maintained complete Internet openness.
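
The chunked storage and generator-based access described above can be illustrated with a minimal sketch. This is not the authors' implementation: the use of NumPy `.npy` files, the file naming scheme, the 19x19 board size, and the function names `save_chunks` and `batch_generator` are all assumptions made for illustration. Only the chunk size of 1024, the 11 feature planes, the random sampling of chunks, and the use of yield come from the abstract.

```python
import random
import tempfile
from pathlib import Path

import numpy as np

CHUNK_SIZE = 1024            # samples per chunk, as stated in the abstract
NUM_PLANES, BOARD = 11, 19   # 11 planes from the abstract; 19x19 board is assumed


def save_chunks(features, labels, out_dir, chunk_size=CHUNK_SIZE):
    """Split the data into chunks and save each chunk as two binary files.

    The .npy format and the file names are hypothetical choices.
    Returns the number of chunks written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    n_chunks = 0
    for start in range(0, len(features), chunk_size):
        stop = start + chunk_size
        np.save(out_dir / f"features_{n_chunks:05d}.npy", features[start:stop])
        np.save(out_dir / f"labels_{n_chunks:05d}.npy", labels[start:stop])
        n_chunks += 1
    return n_chunks


def batch_generator(chunk_dir, num_chunks, batch_size=CHUNK_SIZE, seed=None):
    """Yield (features, labels) batches from a random sample of chunks.

    Only one chunk is held in memory at a time, so memory use stays
    bounded no matter how many chunks exist on disk.
    """
    feature_files = sorted(Path(chunk_dir).glob("features_*.npy"))
    rng = random.Random(seed)
    sampled = rng.sample(feature_files, min(num_chunks, len(feature_files)))
    for feature_file in sampled:
        # Each feature file has a label file with the same chunk index.
        label_name = feature_file.name.replace("features", "labels")
        x = np.load(feature_file)
        y = np.load(feature_file.with_name(label_name))
        for start in range(0, len(x), batch_size):
            yield x[start:start + batch_size], y[start:start + batch_size]
```

A training loop would iterate over `batch_generator(chunk_dir, num_chunks)` and pass each batch to the network. Because the generator loads chunks lazily, the full 160k-game feature set never has to fit in memory at once, which is the memory benefit the abstract attributes to the generator.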


