Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

12/05/2017
by David Silver, et al.

The game of chess is the most widely studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. In contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go by tabula rasa reinforcement learning from games of self-play. In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved a superhuman level of play within 24 hours in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.


07/02/2022

An AlphaZero-Inspired Approach to Solving Search Problems

AlphaZero and its extension MuZero are computer programs that use machin...
05/18/2018

Solving the Rubik's Cube Without Human Knowledge

A generally intelligent agent must be able to teach itself how to solve ...
09/11/2018

SAI, a Sensible Artificial Intelligence that plays Go

We propose a multiple-komi modification of the AlphaGo Zero/Leela Zero p...
07/12/2016

Automatic Bridge Bidding Using Deep Reinforcement Learning

Bridge is among the zero-sum games for which artificial intelligence has...
10/31/2022

DanZero: Mastering GuanDan Game with Reinforcement Learning

Card game AI has always been a hot topic in the research of artificial i...
09/22/2015

Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games

Poker is a family of card games that includes many variations. We hypoth...
01/10/1999

KnightCap: A chess program that learns by combining TD(lambda) with game-tree search

In this paper we present TDLeaf(lambda), a variation on the TD(lambda) a...

Code Repositories

chess-alpha-zero

Chess reinforcement learning by AlphaGo Zero methods.



every-single-day-i-tldr

Every day, I'm adding all the web links I've read and found useful or interesting.


