Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

12/05/2017
by David Silver, et al.

The game of chess is the most widely studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. In contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go by tabula rasa reinforcement learning from games of self-play. In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.
