In the field of computer gameplay, Monte Carlo Tree Search (MCTS) agents dominate many games of perfect information such as Go, Checkers, Reversi and Connect Four [Browne et al.2012], as witnessed by the impressive achievements of the DeepMind team against human players [Silver et al.2016] [Silver et al.2018]. Nonetheless, MCTS-based agents still trail behind their counterparts when playing game positions requiring accurate play. Recent evidence of this fact was provided in game 6 of the 2018 Chess Championship Match between World Champion Magnus Carlsen (White) and challenger Fabiano Caruana (Black), where DeepMind’s AlphaZero missed a mate for Black following a sequence of 30 moves, found by the Sesse supercomputer running this line on the non-MCTS-based Stockfish [Sadler and Regan2019].
A key issue for MCTS’s performance in games like chess is the presence of trap states, where an initial move may look strong, to then be followed by a forcing sequence by the opponent leading to a loss or significant disadvantage [Ramanujan et al.2010]. Despite the breakthroughs of MCTS-based engines, the challenge still remains to equip MCTS with the capability of handling such forcing lines.
The approach we take in this paper is to formulate a generalised notion of similarity between game states to improve the performance of game-playing agents through smarter search: if a trap was found after one position and we now analyse a second position that is very similar to the first, then chances are we still have a trap after the second. As similar strategies are likely to contain similar move sequences but not necessarily similar board positions, our measures are based on the possible moves from each position, rather than the appearance of the board itself.
We study similarity of game positions in two-player, deterministic games of perfect information, by looking at the structure of their local game trees, working with the set of possible moves from each position. We introduce novel similarity measures based on the intersection of move sets, and a structural similarity measure that only considers the arrangement of the local game tree and not the specific moves entailed. We analyse the formal relation of these measures and test them against benchmark problems in chess, with a number of surprising and promising findings. Notably, our structural similarity measure was able to match trap states to their child trap states with 85% accuracy, without using domain-specific knowledge. On top of this, we introduce a move matching algorithm, which accurately pairs moves with similar strategic value from different positions. Our results are of immediate relevance to MCTS adaptations to detect and avoid trap states in game play.
Graph comparison is one of the most fundamental problems of theoretical computer science, with graph isomorphism computation having been an open problem for quite some time [Babai2015]. With tree structures, possibly the most commonly used metric is the edit distance [Zhang and Shasha1989]: based on the number of edits (node insertions, deletions and substitutions) necessary to transform one tree into another, this metric works well for trees of a similar size with many shared nodes and edges. However, it tends to be less suitable when comparing multiple trees of different sizes, as large trees sharing some proportion of their nodes appear further from each other than two completely distinct smaller trees. An alternative measure is the alignment distance [Jiang et al.1995], an adaptation of the edit distance based on the notion of sliding one tree into another and counting the number of edits needed to transform both trees into the combined one. The alignment distance requires lower complexity to compute than the edit distance, but it is technically not a metric and suffers from similar problems comparing trees of different sizes.
In game playing, the presence of forcing continuations is identified as a key problem faced by AI engines, with more acute implications for chess-like games [Ramanujan et al.2010]. Surprisingly though, the theory of similarity metrics to aid strategic decisions in game playing is not well developed.
Similarity measures have instead been used in other areas of AI, as in the case of siamese neural networks for one-shot learning [Koch et al.2015]. In this case, two symmetric convolutional neural networks were trained on same-different pairs and then shown a test instance as well as one example from each possible classification. The output of the twin networks was then compared using a similarity measure. Here a cross-entropy objective function was used to determine similarity, but this required the networks to be symmetric and weight-tied. New similarity measures based on the structural similarity of networks could remove these requirements, but have not yet been investigated.
Section 2 introduces our formal setup to compare game trees through a number of similarity measures. Section 3 uses these as the basis of a dynamic algorithm to detect structural similarities among subtrees. In Section 4 we compare these against known chess positions. We conclude discussing potential applications and research directions.
2 Positional Similarities
Let $G$ be a two-player finite extensive form game of perfect information, where players, e.g., Black and White, alternate moves, with White starting the game. Formally, $G$ consists of a set of histories $H$ such that $h_0$ is the starting board position and each $h_{k+1}$ (with $k \geq 0$) can be reached from $h_k$ with a single legal move, by White whenever $k$ is even and by Black otherwise (as in, e.g., [Maschler et al.2016]).
We are interested in comparing trees that result from players exploring game continuations from a certain board position on. In MCTS, for example, these are the game trees generated by the expansion step (see e.g., [Sutton and Barto2018]). Let $T_1, T_2$ denote tree roots and $t_1, t_2$ child nodes (board positions) of $T_1$ and $T_2$. Then let $M(T)$ denote the set of all possible moves from position $T$ and $M_d(T)$ the set of all possible moves contained in all possible move sequences of length $d$ from position $T$.
We now present three natural measures, of increasing complexity, to establish how similar such trees are: the similarity of continuations, the similarity of sequences and the tree edit similarity. All these measures are model-free, in the sense that they can be used in all situations that can be described as two-player finite extensive games of perfect information. We analyse their formal interrelation in this section and use them as the basis of our dynamic algorithm in the subsequent one.
2.1 Similarity of Continuations
Our first measure, which we call similarity of continuations, is calculated from the sets $M_d(T_1)$, $M_d(T_2)$ of 1-ply atomic moves from starting positions $T_1$, $T_2$ to their children of depth $d$. The similarity is the size of the intersection of these two sets divided by the size of their union:

$$S_c^d(T_1, T_2) = \frac{|M_d(T_1) \cap M_d(T_2)|}{|M_d(T_1) \cup M_d(T_2)|}.$$
As can be found from a simple expansion of the game trees, computing such a measure takes time , where is the breadth of the game tree.
At depth 1, the similarity of continuations simply calculates the proportion of children that two nodes share. When extended to a deeper search, the measure becomes less fine-grained since a move that occurs at different depths in the trees will still count as shared, and multiple occurrences of the same move are only counted once.
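The behaviour described above can be illustrated with a minimal sketch. It assumes a hypothetical tree representation (nested dictionaries mapping a move label to the subtree it leads to) and made-up move labels; neither is taken from the experiments in this paper.

```python
def moves_to_depth(tree, depth):
    """Collect every move label that occurs within `depth` plies of the root.

    `tree` is a nested dict {move_label: subtree}; a leaf is an empty dict.
    This representation is an assumption made for illustration only.
    """
    if depth == 0:
        return set()
    moves = set()
    for move, child in tree.items():
        moves.add(move)
        moves |= moves_to_depth(child, depth - 1)
    return moves


def similarity_of_continuations(tree_a, tree_b, depth=1):
    """Size of the intersection of the two move sets over the size of their union."""
    a = moves_to_depth(tree_a, depth)
    b = moves_to_depth(tree_b, depth)
    union = a | b
    if not union:
        # Both roots are terminal: the ratio is undefined.
        raise ValueError("both positions are terminal; handle them separately")
    return len(a & b) / len(union)


# Hypothetical toy trees: "d4" occurs at depth 1 in tree_a and depth 2 in tree_b.
tree_a = {"e4": {"e5": {}, "c5": {}}, "d4": {"d5": {}}}
tree_b = {"e4": {"e5": {}}, "c4": {"e5": {}, "d4": {}}}

depth_1 = similarity_of_continuations(tree_a, tree_b, depth=1)  # 1 shared of 3
depth_2 = similarity_of_continuations(tree_a, tree_b, depth=2)  # 3 shared of 6
```

In the depth-2 call, "d4" counts as shared even though it sits at different depths in the two trees, and "e5" is counted once although it occurs twice in `tree_b`, which is the loss of granularity noted above.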
As an example, consider the trees in Figure 1, which have depth-2 continuation sets ,
2.2 Similarity of Sequences
Our second similarity measure, which we call similarity of sequences, uses longer sequences of moves rather than single plies. To ease computation, we require each possible move sequence of length $d$ from tree root $T$ to be first rewritten according to a predetermined move ordering as a simplified sequence $s$. Formally, two sequences are simplified into one if and only if they are the same modulo move permutation. These simplified sequences are then stored in a structure $\hat{T}$, which we call the simplified tree of $T$. As different move permutations can create the same simplified sequence, we also store in $\hat{T}$ the multiplicity $\mu(s)$ of each $s$, where $\mu(s)$ corresponds to the number of ways $s$ can be reached from the root node. Then the similarity of sequences calculates the ratio of the intersection to the union of the simplified trees.
Let be the multiplicity of simplified sequence in , the multiplicity of in , and , the number of nodes in the larger of , . Then the similarity of sequences of and is given by
Calculating the similarity of sequences at depth 2 on the example trees in Figure 1 is as follows. For an alphabetical ordering, the simplified trees can be written
where the superscript corresponds to the multiplicity of each sequence. Then
The tree simplification can be done in one depth-first pass of each tree, taking time . Calculating the proportion of shared sequences takes , which is equal to in the worst case, the same logarithmic complexity as the similarity of continuations. It should be noted that the tree reduction step means that the complexity coefficient is larger for the similarity of sequences calculation. This is a trade-off for accuracy at depth , as less information is lost when calculating from sequences rather than continuations.
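The simplification step and the intersection-over-union ratio can be sketched as follows. This is a minimal illustration under stated assumptions: trees are hypothetical nested dictionaries of move labels, the predetermined ordering is alphabetical, and the ratio is taken as a multiset intersection over a multiset union, which may differ from the exact normalisation of the formula above.

```python
from collections import Counter


def simplified_tree(tree, depth, prefix=()):
    """Map each simplified (sorted) move sequence to its multiplicity.

    `tree` is a nested dict {move_label: subtree}. Sequences that end early
    at a leaf are kept as shorter sequences (an assumption for this sketch).
    """
    counts = Counter()
    if depth == 0 or not tree:
        if prefix:
            # Sorting makes sequences equal modulo move permutation.
            counts[tuple(sorted(prefix))] += 1
        return counts
    for move, child in tree.items():
        counts += simplified_tree(child, depth - 1, prefix + (move,))
    return counts


def similarity_of_sequences(tree_a, tree_b, depth=2):
    """Multiset intersection over multiset union of the simplified trees."""
    a = simplified_tree(tree_a, depth)
    b = simplified_tree(tree_b, depth)
    shared = sum((a & b).values())  # per-sequence minimum multiplicity
    total = sum((a | b).values())   # per-sequence maximum multiplicity
    if total == 0:
        raise ValueError("both positions are terminal; handle them separately")
    return shared / total


# Hypothetical toy trees with abstract move labels.
tree_a = {"a": {"b": {}, "c": {}}, "b": {"a": {}}}
tree_b = {"a": {"b": {}}, "c": {"a": {}, "d": {}}}

simplified_a = simplified_tree(tree_a, 2)  # {('a','b'): 2, ('a','c'): 1}
similarity = similarity_of_sequences(tree_a, tree_b, depth=2)  # 2 / 4
```

Here the sequences a-b and b-a in `tree_a` collapse into one simplified sequence of multiplicity 2, so a single depth-first pass per tree suffices before the two multisets are compared.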
Relation to Kernels
The similarity of sequences is closely related to the Tanimoto similarity measure or kernel [Swamidass et al.2005] [Bajusz et al.2015], based on the intersection over the union of the inner products of two sets. The Tanimoto kernel was successfully used to calculate the similarity of molecule fingerprints in Bioinformatics from the feature map of a molecule, by counting the number of paths through the map shared by different molecules [Swamidass et al.2005]. The methods used in this area can be carried over to extensive form games of perfect information, as a board position can be viewed as a fingerprint representing the game that has gone before it. The game tree and feature map can both be traversed and have their matching paths counted. Using a suffix tree data structure [Ukkonen1995] [Weiner1973], we can compute the Tanimoto kernel in time , for depth , nodes and edges in trees , . The similarity of sequences is also comparable to the random walks kernel [Vishwanathan et al.2010], a measure of similarity between two graphs found by counting the number of random paths they share. The main difference here is that the similarity of sequences has limited depth and is a normalised metric.
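For comparison, the Tanimoto kernel on count vectors can be sketched as below, using the standard form in which the inner product of the two vectors is divided by the sum of their self inner products minus their mutual inner product. The function name and the example counts are hypothetical; the suffix-tree construction mentioned above is not implemented here.

```python
from collections import Counter


def tanimoto(counts_a, counts_b):
    """Tanimoto similarity <a,b> / (<a,a> + <b,b> - <a,b>) of two count vectors."""
    shared_keys = counts_a.keys() & counts_b.keys()
    dot_ab = sum(counts_a[k] * counts_b[k] for k in shared_keys)
    dot_aa = sum(v * v for v in counts_a.values())
    dot_bb = sum(v * v for v in counts_b.values())
    denominator = dot_aa + dot_bb - dot_ab
    if denominator == 0:
        raise ValueError("both count vectors are empty")
    return dot_ab / denominator


# Hypothetical path counts for two positions.
paths_a = Counter({"ab": 2, "ac": 1})
paths_b = Counter({"ab": 1, "ac": 1, "cd": 1})

score = tanimoto(paths_a, paths_b)  # 3 / (5 + 3 - 3)
```

On binary vectors this reduces to the size of the intersection over the size of the union, which is why it is closely related to the set-based measures above.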
2.3 Tree Edit Similarity
It may be the case that $T_1$ is very similar to $T_2$ but differs by some very shallow moves. If this is the case, the similarity of sequences measure would not detect this similarity. We therefore propose a modified version of the tree edit distance [Li and Zhang2011], the tree edit similarity, to compare subtrees, which is normalised, and acts as a metric on the tree edit space. The normalised tree edit distance [Li and Zhang2011] gives values in the range [0,1], and as such would be suitable as a similarity measure when subtracted from 1. The normalised distance is given as

$$d_N(T_1, T_2) = \frac{2\, d(T_1, T_2)}{\alpha\,(|T_1| + |T_2|) + d(T_1, T_2)},$$

where $d(T_1, T_2)$ is the tree edit distance between $T_1$ and $T_2$, $|T|$ is the number of nodes of $T$, and $\alpha$ is the weight of edit operations. Since there is no need to weight edit operations differently, we may take $\alpha$ to be 1 for all operations. Then, as shown by Li and Zhang, the formula is valid as a metric. Since calculating the distance between two trees is equivalent to calculating their similarity and subtracting it from 1, we define the tree edit similarity as

$$S_e(T_1, T_2) = 1 - \frac{2\, d(T_1, T_2)}{|T_1| + |T_2| + d(T_1, T_2)}.$$
Calculating the tree edit similarity on the example trees in Figure 1 is as follows:
This measure is the most fine-grained of the three detailed so far. Since calculating tree edit distance on unordered trees is known to be NP-hard [Touzet2003], we must again order the nodes in a preprocessing step, with complexity as above. Once we have ordered trees, the time complexity reduces to when , and when [Zhang and Shasha1989]. As such, the improvements made by the tree edit similarity over the two previous measures must be weighed against the added complexity.
2.4 Comparing Terminal States
It may sometimes be necessary to find the similarity of two terminal states. In terms of the game tree for a zero-sum game, two terminal nodes should have a value of 1 if they give the same reward for the agent (win-win, draw-draw, lose-lose), and 0 if the reward is different. Since two terminal nodes have no children, their fractional similarity measure is undefined, so we must handle this case separately.
The normalised difference between the rewards of the two terminal nodes can be found by taking the absolute difference between the reward of one node and the reward of the other, then dividing the result by the size of the range of possible reward values, $r_{\max} - r_{\min}$. This gives a value between 0 and 1, where 1 represents rewards at opposite ends of the range, and 0 represents equal rewards. Subtracting from 1 then gives a similarity measure, formalised as

$$S_t(t_1, t_2) = 1 - \frac{|r(t_1) - r(t_2)|}{r_{\max} - r_{\min}},$$

where $r(t)$ is the reward at terminal node $t$.
This can be used in endgame cases to prevent division-by-zero errors when calculating other similarity measures.
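A minimal sketch of the terminal-state case follows. The reward range of -1 (loss) to 1 (win) with 0 for a draw is an assumption made for illustration, as is the function name.

```python
def terminal_similarity(reward_a, reward_b, r_min=-1.0, r_max=1.0):
    """One minus the absolute reward difference, normalised by the reward range."""
    if r_max <= r_min:
        raise ValueError("r_max must be strictly greater than r_min")
    return 1.0 - abs(reward_a - reward_b) / (r_max - r_min)


win_win = terminal_similarity(1.0, 1.0)    # equal rewards
win_loss = terminal_similarity(1.0, -1.0)  # opposite ends of the range
win_draw = terminal_similarity(1.0, 0.0)   # intermediate case
```

With three possible outcomes the measure is graded rather than binary: a win compared with a draw scores 0.5 under the assumed range, between the identical and the opposite cases.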
2.5 Relationship Between Measures
At depth 1, the similarity of sequences and the similarity of continuations are equivalent, as each child move only appears once per tree. At depth 2, the similarity of sequences has greater variation, as can be seen from the following chess-inspired instance.
Example 1 (Chess trees).
Let , be nodes of a chess game tree where branching factor is constant, and , differ only in the placement of two pieces. Then at depth two
Now consider positions , that also differ only in the placement of two pieces, except that in the opponent has chosen a forcing move leading to checkmate at depth 2, while in the opponent has chosen otherwise. Then extends past depth two, but is truncated and only contains depth 1 moves, all of which are shared with . Then at depth two
So we can see that
and thus the similarity of sequences has greater variation than the similarity of continuations. The tree edit similarity is yet more variable than the similarity of sequences, as can be seen from further calculations on the same examples.
Modulo the tradeoff between simplicity and complexity, the above similarity measures can be used to analyse any game trees with a consistent move labelling. This would be especially useful for games with less dynamic trees, that is, those without capturing or blocking moves that change the game tree structurally between plies. For games like Go, with the potential to use one piece to exert power over a whole area, these measures provide useful tools for analysis, which could be further explored by accounting for symmetries and abstractions of the board.
3 Detecting Structural Similarities
We may find ourselves comparing positions that do not share many continuations, e.g., far away from one another in a game tree. What we can then do is to extend the previous approach to recursively check for subtree similarity.
3.1 Structural Similarity Measure
Our final similarity measure, which we call the structural similarity measure, compares the graphical structure of two game trees without comparing their atomic moves directly. The measure is based on calculating the similarity of each starting position $T_1$ to each of its child nodes $t_1$ using any of the three previously defined measures, before comparing this list of similarities to the list of similarities of another starting position $T_2$ to its children $t_2$. The measure uses an assignment algorithm (see Figure 1) to pair each child node of $T_1$ to a child node of $T_2$ so as to minimise the sum, over all pairs, of the difference between the paired nodes’ similarities to their respective parents. If one subtree has more children than the other, each unpaired child adds 1 to this sum. The sum is then divided by the larger number of children and subtracted from 1 to provide the structural similarity of the two subtrees, where a value of 1 is identical and 0 is completely distinct. Let $n_1, n_2$ be the number of child nodes of $T_1, T_2$ respectively. Then, for a selected similarity measure $S$, the structural similarity measure can be expressed as

$$S_{\mathrm{struct}}(T_1, T_2) = 1 - \frac{1}{\max(n_1, n_2)} \left( \min_{\pi} \sum_{i} \left| S(T_1, t_1^{i}) - S(T_2, t_2^{\pi(i)}) \right| + |n_1 - n_2| \right),$$

where $\pi$ ranges over the ways of pairing each child of the tree with fewer children to a distinct child of the other tree.
The following calculates the structural similarity measure based on the similarity of continuations at depth-1 on the trees in Figure 1. The similarity of each branch to its root is
There are two minimum distance matchings:
and their total distance is 0.683. So
While the structural similarity measure may calculate more accurate similarities between positions, this comes at a cost, as each calculation requires similarity computations of every child node to its parent. When the similarity of sequences or continuations at depth 1 is used as the base measure, on average it takes time to calculate the similarity of all children to their parent. Assigning children in pairs using the Hungarian algorithm takes operations, so the structural similarity algorithm runs in time . To improve the complexity, the measure could be approximated by randomly sampling child nodes and calculating their structural similarity, which warrants further investigation.
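The pairing step can be sketched as follows, assuming the child-to-parent similarities have already been computed with one of the base measures and that the cost of a pair is the absolute difference of those similarities. For brevity this sketch searches all pairings by brute force instead of using the Hungarian algorithm, so it is only practical for a handful of children; the input values are hypothetical.

```python
from itertools import permutations


def structural_similarity(sims_a, sims_b):
    """Structural similarity from two lists of child-to-parent similarities.

    sims_a[i] is the similarity of parent A to its i-th child under a base
    measure; likewise for sims_b. Brute-force matching stands in for the
    Hungarian algorithm in this illustration.
    """
    short, long_ = sorted((sims_a, sims_b), key=len)
    if not long_:
        raise ValueError("both positions are terminal; handle them separately")
    unpaired = len(long_) - len(short)
    # Try every way of pairing the shorter list with distinct entries of the
    # longer one and keep the smallest total distance.
    best = min(
        sum(abs(s - l) for s, l in zip(short, chosen))
        for chosen in permutations(long_, len(short))
    )
    # Each unpaired child contributes a distance of 1.
    return 1.0 - (best + unpaired) / len(long_)


# Hypothetical child-to-parent similarities for two positions.
sims_a = [0.2, 0.5, 0.9]
sims_b = [0.25, 0.8]

score = structural_similarity(sims_a, sims_b)
```

In this example the best pairing matches 0.25 with 0.2 and 0.8 with 0.9, leaving one child of the first position unpaired. Because only the similarity values enter the computation, the move labels of the two trees never need to agree.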
As the structural similarity measure pairs moves that are comparably similar to their parent states, this method can be used to pair moves from different board positions that may have similar strategic value. For example, suppose one position is known to have a killer move in two plies leading to a win for the opponent, and this position has a high similarity to a new position. The depth 2 matches can then be inspected to identify the move that is most frequently matched to the killer move in the known position; this move is likely to be a killer move from the new position. We evaluate the effectiveness of this approach in the following section.
The structural similarity measure is generalisable to the analysis of any two local trees with self-consistent move labellings, as the measure can be calculated independently of such labels. This means, e.g., that the structure of a local Go tree can be compared to that of a local chess tree, or, alternatively, we can show how a game tree changes through the game.
Calculating how dynamic a game is, in terms of the variability of the connection density of the graph, can be very useful in indicating which gameplay heuristics to use. For example, to use the All-Moves-As-First (AMAF) heuristic, which initially updates sibling nodes with the same estimated value for each move played, an agent first assumes that a move from one node is likely to affect the game in a similar way to the same move played from a sibling node. This may be likely to work on less dynamic games but could be less reliable for highly dynamic games, where the effect of a move on the state of the game is less consistent. Conversely, pruning may be most helpful for highly dynamic games, as these offer a stark contrast between reward values for different branches, which is not necessarily the case for less dynamic games.
These hypotheses are supported by studies of successful AMAF use in the less dynamic games Go [Gelly and Silver2011], Phantom Go [Cazenave2005], Havannah [Teytaud and Teytaud2009] and Morpion Solitaire [Akiyama et al.2010], successful pruning in the dynamic game of Amazons [Lorentz2008] and less successful pruning in Havannah [Teytaud and Teytaud2009].
We tested how effective the first three similarity measures were at detecting nearby trap states in chess, using the similarity of continuations at depth , similarity of sequences at and tree edit similarity at . We chose a sample of 4 distinct trap states which each lead to checkmate within 2 to 4 plies, as shown in Figure 2. We used a sample of all 1000-1500 board positions that were 2 plies away from each trap state, and recorded whether the trap was maintained or not for each new position. The measures were calculated on each of these board positions, as was a cross-correlation measure that was used as a control, calculated by finding the number of squares where piece placement differed and dividing this number by 64. The similarity of sequences was adapted for chess by including captures in the simplified sequences. This adaptation can be generalised to any game with irreversible moves, by recording the irreversible moves from each sequence as well as its standard moves.
| Position | Similarity Measure | False Negatives | False Positives |
|---|---|---|---|
Clearly, an effective measure should evaluate trap states as highly similar to the original position with high frequency, so we fixed a threshold value $\theta$ and calculated the proportion of trap and non-trap states with similarity higher than $\theta$ for each measure. For each trap state and each of our similarity measures, when $\theta$ was set to the average value of the similarities, around 70% of all children which were also trap states had above average similarity to the original position, and consistently over 50% of non-trap children had below average similarity. This was not the case for the cross-correlation, where up to 87% of trap states had below average similarity, and 72% of non-trap states had above average similarity. These results can be seen in Table 1.
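The thresholding procedure can be sketched as below. The sample values are invented purely to show the computation and are not the experimental data; the function name is likewise hypothetical.

```python
from statistics import mean


def threshold_rates(samples):
    """Return (threshold, false_negative_rate, false_positive_rate).

    `samples` is a list of (similarity, is_trap) pairs for child positions.
    The threshold is the mean similarity; a false negative is a trap below
    the threshold and a false positive is a non-trap above it.
    """
    threshold = mean(similarity for similarity, _ in samples)
    traps = [s for s, is_trap in samples if is_trap]
    non_traps = [s for s, is_trap in samples if not is_trap]
    if not traps or not non_traps:
        raise ValueError("need at least one trap and one non-trap sample")
    false_negatives = sum(s < threshold for s in traps) / len(traps)
    false_positives = sum(s > threshold for s in non_traps) / len(non_traps)
    return threshold, false_negatives, false_positives


# Invented illustrative data: three trap children and three non-trap children.
samples = [
    (0.9, True), (0.8, True), (0.4, True),
    (0.7, False), (0.3, False), (0.2, False),
]

threshold, fn_rate, fp_rate = threshold_rates(samples)
```

A measure that separates traps from non-traps well keeps both rates low; a measure that is uninformative for this task would give rates near or above one half.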
In general, there was no significant difference between the proportion of false positives (non-traps with above average similarity) and false negatives (traps with below average similarity) given by the similarity of sequences, similarity of continuations and tree edit similarity. However, the added time complexity of the similarity of sequences and tree edit similarity at depth 2 was significant. Thus, perhaps surprisingly, the similarity of continuations is effectively a better heuristic similarity measure for evaluating closely related board positions than the similarity of sequences.
Finally, for complexity considerations, we tested the structural similarity measure on 5 smaller samples of 40 randomly selected child positions from the first two trap positions. Using this measure, an average of 85% of child trap states had above average structural similarity to the original position. The high complexity of this measure makes it time-intensive to compute, but the results clearly show that it is rather effective at picking out potential trap states from a select sample of positions.
The move matching algorithm was also tested on various chess positions, to detect moves with similar strategic impact. Frequent matchings were assumed to be a more reliable indicator of moves with a similar effect on gameplay, so only the top 5 most frequently matched pairings were assessed.
We tested the matching algorithm on three different samples, each with 6 pairs of board positions, all shown in Table 2. Firstly, we used the algorithm on all traps from the trap detection sample. For all but one of the pairings (Légal and Budapest traps), all of the 5 most frequent matches for each pair comprised 2 decisive or 2 non-decisive moves. In all but one pairing (Caro-Kann and Kieninger traps), the two most frequently paired moves were both checkmate moves. The second sample we used was based on the Légal and Budapest Gambit traps. We compared each trap with a sample of three child positions. This sample comprised one position containing the original trap but a difference in the placement of two pawns; one position where the bishop that had threatened the queen had been captured; and one position that was selected as the best continuation by the Stockfish chess engine. In all but one pairing, all of the top 5 matches comprised 2 decisive or 2 non-decisive moves. All of the most frequently paired moves were both decisive. The third sample was a selection of positions from the 2016 World Championship match between Magnus Carlsen and Sergey Karjakin, which appeared after 10, 20, 30 and 40 plies. An average of 4 of the top 5 matches for each pairing comprised 2 decisive or 2 non-decisive moves. Three of the most frequently paired moves were both check moves, and one of them comprised two equivalently unimpactful moves of the king. This sample provided less reliable pairings than the previous two samples, possibly because its positions had a more varied strategic impact than those of the other samples.
These results show that the move matching algorithm is fairly well suited to finding similarly decisive moves from different board positions, and thus useful in detecting possible trap states and sacrificial moves from the game tree structure without evaluating board positions.
| Board Position Pairings | Equally Decisive Top 5 Pairs | Top Match |
|---|---|---|
| Légal, Budapest | 4 | Bf7#, Bf2# |
| Légal, Kieninger | 5 | Bf7#, Nf3# |
| Légal, Caro-Kann | 5 | Bf7#, ef7# |
| Budapest, Kieninger | 5 | Bf2#, Nd3# |
| Budapest, Caro-Kann | 5 | Bf2#, ef7# |
| Caro-Kann, Kieninger | 5 | ef7#, Nf3+ |
| Budapest, Budapest+ 5. b3 a6 | 5 | Bf2#, Bf2# |
| Budapest, Budapest+ 5. f3 Nxe5 | 3 | Bf2#, Nf3+ |
| Budapest, Budapest+ 5. Bd2 Qh4 | 5 | Bf2#, Bd2+ |
| Légal, Légal+ … h6 6. a3 | 5 | Bf7#, Bf7# |
| Légal, Légal+ … h6 6. Nxg4 | 5 | Bf7#, Bf7+ |
| Légal, Légal+ … Nxe5 6. Be2 | 5 | Bf7#, Bb5+ |
| Carlsen-Karjakin Move 10, Move 20 | 5 | Be6, Qe8 |
| Carlsen-Karjakin Move 10, Move 30 | 5 | Qc3+, Rg5+ |
| Carlsen-Karjakin Move 10, Move 40 | 5 | Qc3+, Rg5+ |
| Carlsen-Karjakin Move 20, Move 30 | 4 | Nd4, Rg5+ |
| Carlsen-Karjakin Move 20, Move 40 | 3 | Kf8, Kh8 |
| Carlsen-Karjakin Move 30, Move 40 | 3 | Rg5+, Qg6+ |
5.1 AMAF/RAVE Adaptation
Past papers [Helmbold and Parker-Wood2009] have shown that MCTS displays a marked improvement when using adaptations such as All-Moves-As-First (AMAF), Rapid Action Value Estimation (RAVE) and Permutation-AMAF. Such adaptations update multiple areas of the game tree at once, where one move is available from many positions (as in AMAF) or where one board position is a permutation of another, on the assumption that the equivalent move from each of these positions will have the same strategic impact on gameplay. We envisage the effective use of a similarity measure when choosing which equivalent positions to update, as this may lead to more effective trap detection than that of MCTS or its AMAF adaptations. We suggest adding a similarity measure to two MCTS adaptations: the killer heuristic, where decisive moves are evaluated first, and killer RAVE, which only applies RAVE to decisive moves [Lorentz2010]. MCTS may more quickly detect a trap ahead when combined with these similarity-based adaptations.
5.2 Wider Game Strategy and Graph Applications
Many modern AI programs use deep learning to recognise tactical patterns from shapes of features in the field of play. It seems natural to use this learning strategy to group atomic moves by their tactical value, to then create an abstracted game tree with a lower branching factor than the original tree. The structural similarity measure can then be used to detect tactical moves representing equivalent strategies, giving the agent options once it has chosen its desired strategy.
In cases where an agent is trained to predict the moves a human player would make, as was the case for AlphaGo [Silver et al.2016], the modified AMAF/RAVE adaptation above can be used to prime the neural network and update predictions for multiple positions at once. This may lead to opportunities for faster reinforcement learning or more efficient learning from smaller data sets.
We presented four similarity measures for game positions in two-player, deterministic games of perfect information, based on their game trees with no domain-specific knowledge. We tested the measures on chess and suggested their use in heuristics for MCTS-based agents, noting their application to a range of graphical problems. We showed that, using our first two similarity measures, an average of around 70% of chess positions occurring 2 plies after a trap state that were also traps had above average similarity to the original position. This figure rose to 85% using the structural similarity measure. We also showed that our move matching algorithm consistently paired moves with similar strategic value from different starting positions. We believe this can aid MCTS agents in finding equally decisive moves within different areas of the game tree, as well as detecting new trap states.
- [Akiyama et al.2010] Haruhiko Akiyama, Kanako Komiya, and Yoshiyuki Kotani. Nested monte-carlo search with amaf heuristic. In 2010 International Conference on Technologies and Applications of Artificial Intelligence, 2010.
- [Babai2015] László Babai. Graph isomorphism in quasipolynomial time. CoRR, abs/1512.03547, 2015.
- [Bajusz et al.2015] Dávid Bajusz, Anita Rácz, and Károly Héberger. Why is tanimoto index an appropriate choice for fingerprint-based similarity calculations? Journal of cheminformatics, 7(1):20, 2015.
- [Browne et al.2012] Cameron B Browne, Edward Powley, Daniel Whitehouse, Simon M Lucas, Peter I Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis, and Simon Colton. A survey of monte carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in games, 4(1):1–43, 2012.
- [Cazenave2005] Tristan Cazenave. A phantom-go program. In Advances in Computer Games, pages 120–125. Springer, 2005.
- [Gelly and Silver2011] Sylvain Gelly and David Silver. Monte-carlo tree search and rapid action value estimation in computer go. Artificial Intelligence, 175(11):1856–1875, 2011.
- [Helmbold and Parker-Wood2009] David P Helmbold and Aleatha Parker-Wood. All-moves-as-first heuristics in monte-carlo go. In IC-AI, pages 605–610, 2009.
- [Jiang et al.1995] Tao Jiang, Lusheng Wang, and Kaizhong Zhang. Alignment of trees - an alternative to tree edit. Theoretical Computer Science, 143(1):137–148, 1995.
- [Koch et al.2015] Gregory Koch, Richard Zemel, and Ruslan Salakhutdinov. Siamese neural networks for one-shot image recognition. In ICML Deep Learning Workshop, volume 2, 2015.
- [Li and Zhang2011] Yujian Li and Chenguang Zhang. A metric normalization of tree edit distance. Frontiers of Computer Science in China, 5(1):119–125, 2011.
- [Lorentz2008] Richard J Lorentz. Amazons discover monte-carlo. Springer, 2008.
- [Lorentz2010] Richard J Lorentz. Improving monte–carlo tree search in havannah. In International Conference on Computers and Games, pages 105–115. Springer, 2010.
- [Maschler et al.2016] Michael Maschler, Eilon Solan, and Shmuel Zamir. Game Theory. Cambridge University Press, 2016.
- [Ramanujan et al.2010] Raghuram Ramanujan, Ashish Sabharwal, and Bart Selman. On adversarial search spaces and sampling-based planning. In ICAPS, volume 10, pages 242–245, 2010.
- [Sadler and Regan2019] Matthew Sadler and Natasha Regan. Game Changer: AlphaZero’s Groundbreaking Chess Strategies and the Promise of AI. New in Chess, 2019.
- [Silver et al.2016] David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. Mastering the game of go with deep neural networks and tree search. Nature, 529(7587):484, 2016.
- [Silver et al.2018] David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, and Demis Hassabis. A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science, 362(6419):1140–1144, 2018.
- [Sutton and Barto2018] Richard S. Sutton and Andrew G. Barto. Reinforcement learning - an introduction. 2nd Edition. MIT Press, 2018.
- [Swamidass et al.2005] S Joshua Swamidass, Jonathan Chen, Jocelyne Bruand, Peter Phung, Liva Ralaivola, and Pierre Baldi. Kernels for small molecules and the prediction of mutagenicity, toxicity and anti-cancer activity. Bioinformatics, 21(suppl_1):i359–i368, 2005.
- [Teytaud and Teytaud2009] Fabien Teytaud and Olivier Teytaud. Creating an upper-confidence-tree program for havannah. In Advances in Computer Games, pages 65–74. Springer, 2009.
- [Touzet2003] Hélène Touzet. Tree edit distance with gaps. Information Processing Letters, 85(3):123–129, 2003.
- [Ukkonen1995] Esko Ukkonen. On-line construction of suffix trees. Algorithmica, 14(3):249–260, 1995.
- [Vishwanathan et al.2010] S Vichy N Vishwanathan, Nicol N Schraudolph, Risi Kondor, and Karsten M Borgwardt. Graph kernels. Journal of Machine Learning Research, 11(Apr):1201–1242, 2010.
- [Weiner1973] Peter Weiner. Linear pattern matching algorithms. In Switching and Automata Theory, 1973. SWAT’08. IEEE Conference Record of 14th Annual Symposium on, pages 1–11. IEEE, 1973.
- [Zhang and Shasha1989] Kaizhong Zhang and Dennis Shasha. Simple fast algorithms for the editing distance between trees and related problems. SIAM journal on computing, 18(6):1245–1262, 1989.