TDLeaf(lambda): Combining Temporal Difference Learning with Game-Tree Search

01/05/1999
by Jonathan Baxter, et al.

In this paper we present TDLeaf(lambda), a variation on the TD(lambda) algorithm that enables it to be used in conjunction with minimax search. We present some experiments in both chess and backgammon which demonstrate its utility and provide comparisons with TD(lambda) and another less radical variant, TD-directed(lambda). In particular, our chess program, "KnightCap," used TDLeaf(lambda) to learn its evaluation function while playing on the Free Internet Chess Server (FICS, fics.onenet.net). It improved from a 1650 rating to a 2100 rating in just 308 games. We discuss some of the reasons for this success and the relationship between our results and Tesauro's results in backgammon.
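
As a rough illustration of the idea in the abstract, the following is a minimal sketch of a TDLeaf(lambda)-style weight update. It is not KnightCap's code. It assumes a linear evaluation function (a dot product of weights and features), and it assumes the caller has already run minimax search from each position in a game and extracted the feature vector of the leaf at the end of each principal variation. The function name `tdleaf_update` and the parameters `leaf_features`, `final_outcome`, `alpha`, and `lam` are hypothetical names chosen for this example.

```python
from typing import List, Sequence


def tdleaf_update(
    weights: Sequence[float],
    leaf_features: Sequence[Sequence[float]],
    final_outcome: float,
    alpha: float = 0.1,
    lam: float = 0.7,
) -> List[float]:
    """Return updated weights after one game (illustrative sketch only).

    leaf_features[t] is the feature vector of the principal-variation leaf
    found by search from the t-th position of the game (assumed precomputed).
    final_outcome is the game result, used as the last evaluation target.
    """
    # Linear evaluation of every principal-variation leaf (an assumption).
    values = [sum(w * f for w, f in zip(weights, feats)) for feats in leaf_features]
    # The value after the last position is the actual game outcome.
    values.append(final_outcome)

    n = len(leaf_features)
    # Temporal differences between successive leaf evaluations.
    diffs = [values[t + 1] - values[t] for t in range(n)]

    new_weights = list(weights)
    discounted_sum = 0.0
    # Walk backwards so discounted_sum equals sum_j lam**(j - t) * diffs[j].
    for t in reversed(range(n)):
        discounted_sum = diffs[t] + lam * discounted_sum
        # For a linear evaluation, the gradient is the feature vector itself.
        for i, feature in enumerate(leaf_features[t]):
            new_weights[i] += alpha * feature * discounted_sum
    return new_weights
```

The only difference from a plain TD(lambda) update in this sketch is which positions are evaluated: the leaves reached by search rather than the positions that occurred in the game. With `lam = 0`, each leaf is moved only toward the evaluation of the next leaf; with `lam = 1`, every leaf is moved toward the final outcome.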

research
01/10/1999

KnightCap: A chess program that learns by combining TD(lambda) with game-tree search

In this paper we present TDLeaf(lambda), a variation on the TD(lambda) a...
research
08/09/2012

Experiments with Game Tree Search in Real-Time Strategy Games

Game tree search algorithms such as minimax have been used with enormous...
research
02/13/2020

Zero-Rating and Net Neutrality: Who Wins, Who Loses?

An objective of network neutrality is that the design of regulations for...
research
03/27/2023

Paired comparisons for games of chance

We present a Bayesian rating system based on the method of paired compar...
research
04/05/2014

MTD(f), A Minimax Algorithm Faster Than NegaScout

MTD(f) is a new minimax search algorithm, simpler and more efficient tha...
research
08/21/2020

DApp for Rating

Lots of existing web applications include a component for rating interne...
research
10/20/2020

Elo-MOV rating algorithm: Generalization of the Elo algorithm by modelling the discretized Margin of Victory

In this work we develop a new algorithm for rating of teams (or players)...
