The Many AI Challenges of Hearthstone

07/15/2019 ∙ by Amy K. Hoover, et al. ∙ New Jersey Institute of Technology

Games have benchmarked AI methods since the inception of the field, with classic board games such as Chess and Go recently leaving room for video games with related yet different sets of challenges. The set of AI problems associated with video games has in recent decades expanded from simply playing games to win, to playing games in particular styles, generating game content, modeling players etc. Different games pose very different challenges for AI systems, and several different AI challenges can typically be posed by the same game. In this article we analyze the popular collectible card game Hearthstone (Blizzard 2014) and describe a varied set of interesting AI challenges posed by this game. Collectible card games are relatively understudied in the AI community, despite their popularity and the interesting challenges they pose. Analyzing a single game in-depth in the manner we do here allows us to see the entire field of AI and Games through the lens of a single game, discovering a few new variations on existing research topics.




1 Introduction

For decades classic board games such as Chess, Checkers, and Go have dominated the landscape of AI and games research. Often called the “drosophila of AI” in reference to the drosophila fly’s significance in biological research, Chess in particular has been the subject of hundreds of academic papers and decades of research (Ensmenger, 2012). At the core of many of these approaches is designing algorithms to beat top human players. However, despite IBM’s Deep Blue defeating Garry Kasparov in the 1997 World Chess Championships and DeepMind’s AlphaGo defeating Lee Sedol in the 2016 Google DeepMind Challenge Match (Silver et al., 2016), such programs have yet to exhibit the general intelligence hoped for when such benchmarks were originally proposed.

Since these victories, the AI and games community has gradually shifted focus to challenges in digital games. AI agents can outperform human players in simulated arcade games, sometimes beating them when a win condition is possible (Mnih et al., 2015; Bellemare et al., 2013). Agents can also outperform humans in real-time strategy games (RTSs) like Starcraft II (Vinyals et al., 2019) and multiplayer online battle arena games (MOBAs) like Defense of the Ancients 2 (OpenAI, 2018). While there is value in designing algorithms to win, as shown by the popularity of minimax and alpha-beta pruning (Shannon, 1950; Turing, 1953) and Monte Carlo Tree Search (MCTS) (Coulom, 2006; Silver et al., 2016), it is possible that such a focus limits the types of problems that can be solved in AI, just as the choice of the drosophila fly necessarily shapes biological research. Generally such agents only play the particular game they were built or trained to play.

However, both digital and analog games pose a variety of challenges, and a variety of AI-based methods have been developed in response (Yannakakis and Togelius, 2018; Lucas, 2009; Browne, 2011; Cai and Wunsch, 2007; Kasparov, 2017). The research field of AI and games focuses not only on playing to win, but also on modeling player behavior, modeling player experience, generating content, and many other challenges. There are many reasons people play games beyond winning, and consequently many AI challenges present in any suitably rich game environment (Yee, 2006).

Rather than the typical approach of isolating particular challenges in artificial intelligence and solving them through exploration of a digital game, this paper instead discusses the multiple AI challenges posed by the popular collectible card game Hearthstone. While there are other digital collectible card games like Gwent, The Elder Scrolls: Legends, and Clash of Clans, Hearthstone is selected for its popularity, with over 100 million players. While it is perhaps most common to consider such a game from a particular angle (e.g. playing to win it or modeling its players), this paper presents a kaleidoscopic view of the AI challenges the game presents and some of the current approaches to addressing them. We do not intend this to be an exhaustive survey of AI challenges in Hearthstone, but a representative one; the AI challenges covered are likely to be closely related to some challenge already described for other games.

Hearthstone is a game rich in AI challenges and relatively unexplored, perhaps more than most games, given the many facets of the game. We also think that most of the research challenges identified here carry over to other collectible card games, and to some extent games that include elements of these games, such as deck building. Still, we believe that many other games have a rich diversity of AI challenges, far more than are usually considered, if you just look. This paper can therefore also serve as a paradigm for papers elucidating and cataloguing AI challenges in other types of games.

2 The Challenges of Hearthstone

Like many traditional board games, Hearthstone (Blizzard, 2014) is a two-player, turn-taking, adversarial game. However, unlike these games, it contains a large amount of stochasticity in play and partial information. The goal of this digital collectible card game is to reduce the opponent's health from thirty to zero by playing different types of cards. Players initially choose one of nine different heroes, which determines the types of cards that the player can access. They then build a deck of thirty cards, chosen from over 1900, which will be available in any given game. Each card costs the player a certain amount of a resource called mana, and has other attributes like attack, health, or spells it can cast.

Although players often combine them when discussing the game, at its core Hearthstone can be divided into two related challenges: 1) playing the game and 2) selecting decks to be played in matches. When describing their overall approach to others, players often refer to their deck archetype and hero, which is inherently packaged with heuristics to effectively play against other players. For example, the Odd Paladin is at the time of writing a powerful type of Paladin deck built around a particular card called Baku the Mooneater, which gets more powerful if all of the other cards in the deck cost an odd amount of mana to play. The deck favors an aggressive gameplay strategy where the player focuses on destroying the enemy hero as quickly as possible rather than controlling the board or relying on clever card combinations. Each new expansion of Hearthstone changes the set of possible cards from which a player can build decks, which can in turn result in new, sometimes more nuanced, ways of playing. So even when exploring playing to win, as the game is updated, so are the decks and strategies.

The following sections enumerate and outline AI challenges in Hearthstone, including playing in Section 3, building decks in Section 4, helping human players learn to play and build decks in Section 5, and helping designers build the game in Section 6. These ideas are then combined to explore how the field of AI can be taught by approaches addressing these challenges in Hearthstone. But first we will outline the characteristics of Hearthstone that delineate the challenges that the game poses.

2.1 Characteristics of Hearthstone

As proposed by Elias et al. (2012), analyzing a game based on its characteristics is an important way to understand the challenges it poses. Some of the most salient characteristics of Hearthstone from the perspective of gameplaying agents are the following: the game has discrete inputs and outputs, meaning that the state observation is discrete and so is the action (i.e. what card to play and potentially its target). The observable game state has a natural and simple structured form (cards on the table and in hand), meaning that playing from pixels is unnecessary as it gives no new information and only adds an arbitrary computer vision problem. The branching factor is variable depending on game state, but generally high if one considers all the actions a player can take in a turn (five cards that can all be played in one turn, where each card can have one of five targets, lead to a branching factor of 375000; but it is not uncommon to see game states where the player can choose only whether to play a single card or do nothing at all).
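One reading that reproduces the figure of 375000 is to count every order in which the five cards can be played (5! = 120) together with an independent choice among five targets for each card (5^5 = 3125). The sketch below simply carries out that arithmetic; the function name and this counting model are assumptions made for illustration, not a definition taken from the game rules.

```python
from math import factorial


def turn_sequences(num_cards: int, targets_per_card: int) -> int:
    """Count ordered ways to play all cards, each with its own target choice."""
    orderings = factorial(num_cards)            # order in which the cards are played
    targetings = targets_per_card ** num_cards  # one target picked per card
    return orderings * targetings


print(turn_sequences(5, 5))  # 120 * 3125 = 375000
```

Under the same model, a hand with a single playable card and a single target gives just one sequence, which matches the low-branching states mentioned above.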

The game is considerably impacted by hidden information (partial observability), in that the player knows neither the opponent's hand nor which cards will be drawn next from the deck. While some information can be observed or predicted (e.g. how long a particular card has been in an opponent's hand), only guesses can be made based on a priori knowledge of the deck or current known successful strategies. The game is also strongly impacted by stochasticity, both in the form of the initial shuffling of the decks and in the form of randomized effects of certain cards (for example, cards may deal variable damage or hit randomized enemies). The discrete and well-defined nature of the game makes it possible to build simulators of the game that can be executed much faster than real time, meaning that we can have fast forward models, which opens up a large range of search- and planning-based methods for gameplaying. Finally, the game is naturally separated into deck building, which maps nicely onto what in most games is called strategy, and playing the decks, which we can call tactics. While these characteristics are the most salient for gameplaying agents, many of the challenges described in this paper are not primarily about playing the game (i.e. deciding which cards to play and when), so other characteristics will be discussed below.

The collectible card game Magic: The Gathering (Wizards of the Coast, 1993) shares many characteristics and challenges with Hearthstone: deck building, complex strategies, hidden information, large search space, etc. However, this paper focuses on Hearthstone in particular because of its simplified mechanics and larger online player base. Such a player base currently means that there are more robust tools, simulators, and aggregated data such as replays. While the higher complexity of mechanics in Magic: The Gathering poses interesting challenges, Hearthstone is a more accessible, state-of-the-art competitor.

2.2 Research Tools

Hearthstone is supported by an active Hearthsim community dedicated to building and maintaining simulators and other tools to help players strategize and study the game. Created and maintained primarily by darkfriend77, Sabberstone is a fully functioning game simulator with at least fourteen contributors. The developers support research initiatives through their software and subreddit (a subreddit is a sub-forum on the website Reddit; each subreddit is dedicated to a specific topic). Other simulators like Metastone and Spellsource have a fully functioning GUI for human players to play games.

3 Playing the Game

Different games pose different challenges for playing. Some games are about long-term planning, while others are about quick reactions or estimating hidden information. The type of challenges posed to the gameplaying algorithm also depends on what kind of information and affordances are offered, e.g. whether information is presented as pure pixels or as information about objects, whether there is training time for the agent, and whether there is a fast forward model available. But the type of game-playing challenge offered also depends on how the game should be played. Apart from playing a game to win, there are other challenges, such as playing a game in a particular style, or creating heuristics that allow human players to learn to play the game by condensing knowledge about how to play it to a small set of rules.

3.1 Playing to Win

From a gameplaying perspective, Hearthstone offers a rare combination of challenges. It is a two-player, turn-based, and adversarial game, much like Chess, Go, and similar classic board games. However, like Poker it contains a substantial amount of hidden information; knowing which cards the opponent has in hand offers a considerable advantage, and good players spend significant effort trying to predict hands (Bursztein, 2016). Tools like Predictor, a plugin for the third-party data aggregator Hearthstone Deck Tracker, exist to help players computationally determine these probabilities. Like many games, Hearthstone features stochasticity. An important source of stochasticity is the initial shuffling of the deck; some types of deck revolve around a particular card (e.g. C'Thun, a very powerful high-mana card) which may be drawn early or late depending on the shuffling. Additionally, many individual cards have stochastic effects (e.g. the Knife Juggler, which randomly attacks one of the opponent's cards).
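The effect of shuffling on a deck built around one card can be quantified with a simple counting argument. The sketch below is an illustration under simplifying assumptions that are not part of the game rules as described here: a uniformly shuffled thirty-card deck, no mulligan, and no card-draw effects. The function name `prob_drawn` is hypothetical.

```python
from math import comb


def prob_drawn(deck_size: int, copies: int, cards_seen: int) -> float:
    """Probability that at least one of `copies` identical cards appears
    among the first `cards_seen` cards of a uniformly shuffled deck."""
    # Complement: every one of the cards seen comes from the other cards.
    none_seen = comb(deck_size - copies, cards_seen) / comb(deck_size, cards_seen)
    return 1.0 - none_seen


# A single key card in a 30-card deck, after seeing 10 cards: 10/30.
print(round(prob_drawn(30, 1, 10), 4))  # 0.3333
```

Under these assumptions a single-copy key card is missing from the first ten cards in two games out of three, which is one way to see why such decks play out so differently from game to game.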

Most of the published academic work on Hearthstone to date focuses on methods for playing the game (Stiegler et al., 2017; Świechowski et al., 2018; Santos et al., 2017; Janusz et al., 2017; Zhang and Buro, 2017); in addition, there are a few papers about the closely related challenge of playing Magic (Ward and Cowling, 2009). Also, the several open-source simulators of Hearthstone mentioned previously are packaged with their own gameplaying agents. Most of the published work builds on Monte Carlo Tree Search (MCTS), a stochastic forward planning algorithm initially developed for Go but which has since seen much wider usage, and seeks to find ways the algorithm can be made to work with the game (Santos et al., 2017; Świechowski et al., 2018; Zopf, 2015). A key problem for tree search approaches is how to deal with the fact that the agent does not know the opponent's hand. This missing information makes it impossible to expand the search tree based on the opponent's move to do a minimax search, unless a good guess of what their hand might be is available. Some of the work has therefore focused on learning predictive models of the opponent's hand (Janusz et al., 2017; Dockhorn et al., 2018). Other agents, such as the one that is part of MetaStone, simply search up until the end of the current move and use a heuristic evaluation function, not even attempting to predict the opponent's move.

It is worth noting that all the published work on Hearthstone assumes that a fast simulator of some kind is available, which is an easy assumption to make because there are several. However, it is also possible to remove this assumption, and try to learn agents that play the game well without relying on search. This turns the problem into a reinforcement learning problem, where RL methods can potentially train neural networks that choose actions based on a representation of the current state. Such methods could potentially help developers with even quicker testing methods.

3.2 Playing in Different Styles

While the successful tactics of a Hearthstone player are at least partly determined by the deck, for many decks there are several different playstyles possible, and individual players will often prefer one playstyle over another. Can we create AI agents that can learn and recreate these playing styles, not only playing to win but doing so in the style of a particular player? This is a challenge that would seem to go above and beyond that of creating agents that “simply” play the game to win.

From a game design and development perspective there are several use cases for having agents capable of playing in specific styles. This includes providing examples to players of how to play as part of tutorials, offering interesting adversaries, and testing how some game design change will affect different types of players.

In the case of Hearthstone, perhaps the most important dimension of playing style variation is aggro-control. Playing “aggro” means attacking the adversary early with all available resources, trying to decide the outcome of the game early. Playing “control” is a strategy for a longer game, where the player tries to stop the adversary from dominating the game while building up mana and card combinations for a late-game win. However, combo is a successful strategy that can take the form of aggro or control, but focuses primarily on combining the special effects of cards (Góes et al., 2017).

One way of implementing agents with specific play styles is suggested by the procedural personas concept. This entails modeling differences in play style as differences in objectives and search depth. In a previous application of this idea, personas in a roguelike dungeon crawler game were expressed as combinations of preferences for getting to the exit of a level fast, killing monsters, gathering treasure, and drinking health potions. By varying these preferences and evolving selection functions based on them, the same MCTS-based algorithm can be made to play in very different ways (Holmgård et al., 2014; Holmgård et al., 2018). The same approach could easily be implemented in Hearthstone by including varying preferences for winning early as opposed to late, using spells rather than minions, etc.
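The core idea of expressing a persona as a set of preferences can be sketched in a few lines. The example below is a deliberately reduced illustration, not the procedural personas method itself: it replaces the MCTS search and evolved selection functions with a greedy one-step choice, and the outcome features, persona names, and weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Hypothetical summary of the state reached after one candidate action."""
    face_damage: int     # damage dealt to the enemy hero
    minions_killed: int  # enemy minions removed from the board
    cards_kept: int      # cards still in hand afterwards


# Each persona is a set of preference weights over the same outcome features.
PERSONAS = {
    "aggro":   {"face_damage": 3.0, "minions_killed": 0.5, "cards_kept": 0.0},
    "control": {"face_damage": 0.5, "minions_killed": 3.0, "cards_kept": 1.0},
}


def utility(persona: str, outcome: Outcome) -> float:
    """Weighted sum of outcome features under the persona's preferences."""
    w = PERSONAS[persona]
    return (w["face_damage"] * outcome.face_damage
            + w["minions_killed"] * outcome.minions_killed
            + w["cards_kept"] * outcome.cards_kept)


def choose(persona: str, candidates: dict) -> str:
    """Pick the candidate action whose outcome the persona values most."""
    return max(candidates, key=lambda name: utility(persona, candidates[name]))


candidates = {
    "attack_face": Outcome(face_damage=4, minions_killed=0, cards_kept=3),
    "trade_minion": Outcome(face_damage=0, minions_killed=1, cards_kept=3),
}
print(choose("aggro", candidates))    # attack_face
print(choose("control", candidates))  # trade_minion
```

The same board position yields different actions purely because the weights differ, which is the property a search-based persona agent would exploit at greater depth.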

3.3 Finding Beginner Heuristics

Another challenge which is related to, but not the same as, building or learning agents that can play Hearthstone is automatically finding human-teachable heuristics for playing it. In other words, using algorithms to find simple rules and strategies that can be communicated to humans in order to teach them how to play the game. Think of this as “if you only had one rule for playing Hearthstone, what would it be?”. Maybe it would be something like “if you have a minion and can attack the opponent’s face directly do that; otherwise, attack another minion”. If you had two rules, what would they be?

As an example of an approach to this task, de Mesentier Silva et al. (2016) and de Mesentier Silva et al. (2018a) developed a method for finding heuristics applied to the card games Blackjack and Texas Hold'em poker. In the most successful approach, genetic programming was used for finding lists of if-then rules (so called "fast and frugal heuristics"). For Blackjack, it was found that as few as five rules could lead to almost-optimal play; these rules are much simpler to learn than the full strategy table (de Mesentier Silva et al., 2016). In Texas Hold'em, a set of simple rules was discovered that led to at least novice-level play (de Mesentier Silva et al., 2018a, b).

What would beginner heuristics look like for Hearthstone? This is currently not known. We would need a description language that could capture relevant aspects of the game state as preconditions and relevant categories of in-game actions as consequences. This is an interesting research challenge that would teach us much about game design.
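To make the shape of such a description language concrete, the sketch below encodes an ordered list of if-then rules where the first matching precondition decides the action. It is only an illustration of the representation: the state fields, action names, and the two rules (a direct encoding of the "attack face, otherwise attack a minion" example from earlier in this section) are assumptions, and no search for good rules is performed.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    description: str
    condition: Callable[[dict], bool]  # precondition on a (hypothetical) state view
    action: str                        # category of in-game action to take


# Ordered rule list: earlier rules take priority over later ones.
RULES = [
    Rule("minion can hit face",
         lambda s: s["my_minions"] > 0 and s["face_attackable"],
         "attack_face"),
    Rule("minion cannot hit face",
         lambda s: s["my_minions"] > 0 and s["enemy_minions"] > 0,
         "attack_minion"),
]
DEFAULT_ACTION = "end_turn"


def decide(state: dict, rules=RULES) -> str:
    """Return the action of the first rule whose precondition holds."""
    for rule in rules:
        if rule.condition(state):
            return rule.action
    return DEFAULT_ACTION


print(decide({"my_minions": 2, "enemy_minions": 1, "face_attackable": True}))
# attack_face
```

A rule-finding method would then search over the conditions and their order, scoring each candidate list by how well an agent following it plays.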

3.4 Identifying Emergent Patterns

Over time, the playerbase of some games develops a metagame, which is often a collection of ideas about how to play the game well. It often includes a taxonomy around moves and action patterns that occur frequently in play, and ascribes varying degrees of strategic importance to them. An example in Chess is the Queen's Gambit, a centuries-old opening that is popular with players. By developing a shared taxonomy to describe patterns, both players and designers can better analyze, discuss, and evolve a metagame.

Identifying emergent patterns of play and naming the most common or powerful requires human players to possess significant domain knowledge and experience like those developed for Chess (Gobet and Simon, 1996), but AI and computational agents hold a significant advantage over these human analysts. While some human designers isolated common card combinations and plays in the puzzles released with the Boomsday expansion to Hearthstone, it should be possible to computationally analyze, identify, and categorize emergent patterns of gameplay.

Hearthstone has two properties that suggest successful taxonomies can be created. First, it has a large and devoted playerbase, suggesting that any system developed to categorize play can be easily fact-checked with historical data and community knowledge. The key drawback to this is that games with large playerbases are likely to have quite thoroughly cartographed the action space, leaving only niche and novelty plays to be identified. Essentially, it presents the possibility that for many very popular games, this problem has been satisfactorily solved. However, Hearthstone's other key feature is that it is a highly stochastic and evolving game. Blizzard regularly adds and removes cards to and from standard play, resulting in a dramatically and unpredictably shifting metagame every few months. The period immediately after the release of a new card set introduces a unique opportunity for AI to expedite the process of cartographing the game's newly reshaped action space.

Such work has several immediate short term applications and benefits. Identification of patterns of play is a key step in creating agents that mimic humans or take on specific play styles. Making human-recognizable moves with preexisting strategic connotations better enables both AI designers and the AI agents themselves to understand human play. For designers, this form of work will help identify possibly degenerate and undesirable behavior, like automated bots. As Hearthstone grows in popularity, and winning competitions becomes more valuable (Hodge et al., 2019), combating unfair play is becoming a more serious issue. However, sometimes strange player behavior could be due to an underlying issue in a game where players are somehow able to stall indefinitely, and AI can help identify what moves are leading to these unwanted board states. For players, identifying and categorizing plays can help to create better tutorials and automated trainers. With moves such as the Queen’s Gambit, there are analyses dissecting when they are and are not appropriate. Using data to identify and analyze a single player’s patterns can help to identify why they may be playing particularly well or poorly.

To the best of our knowledge, there is a limited body of work dedicated specifically to identifying known or recurring moves or patterns in games. However, this does not mean that no work has gone into the analysis and classification of moves. The identification of patterns of play is a problem that is closely related to the identification of player behavior. Where player behavior identification attempts to aggregate and analyze longitudinally along a single player's actions and moves, move pattern labelling aims to latitudinally observe plays across many players to identify profiles that are more endemic to the design of the game or the playerbase at large. There has been substantial interest in player modelling, classification, and clustering in games. For example, work has been done to leverage player telemetry data to attempt to categorize and describe clusters of player behavior (Drachen et al., 2012).

3.5 Difficulty Scaling

Building an agent that plays the game well is an interesting problem in and of itself, but a high-performing agent does not necessarily lead to the most enjoyable user experience. Games like Hearthstone attract players of a wide variety of skill levels, and a game's far-reaching appeal often depends on its ability to accommodate this variety of players. Games commonly offer AI opponents in some discrete set of difficulty levels (easy-hard, 1-10, etc.), but the proposed values are ordinal, and may or may not be anchored against human player ability. Consistently providing players significant but not insurmountable difficulty and maintaining that challenge as they improve at the game is a non-trivial but very important problem for developers (Aponte et al., 2009). This is a distinct problem from having agents that simulate different playstyles. The question is not whether an agent can play like a human, but rather whether it can play as well as or just barely better than a specific individual on command.

In a regular game of Hearthstone against an AI, both players have the same starting life and access to the same set of cards. There is no systemic or mechanical advantage one player has over the other. In this scenario, the challenge of tuning for difficulty lies solely in the implementation of the agent. Important questions to consider when approaching tunable difficulty in this case are how one would define a player-specific difficulty level and how one would implement it. In addition to symmetric play, Hearthstone offers an asymmetric adventure mode similar to those in single-player role-playing games. In many single-player games, it is possible to present challenge through asymmetry and some unfair advantage. In Hearthstone, this is manifested as unequal starting life totals or very powerful cards only usable by the opponent. The challenge of implementing these is shifted away from the agent's implementation and often toward tuning the severity of the asymmetry. Similarly, a developer might want to have access to a "cheating" bot for testing purposes, which is also a different problem than tuning for difficulty.

Work into difficulty scaling has explored a number of avenues toward modifying existing AI algorithms to scale difficulty as well as proposing novel agents that are scalable by design. For example, a technique known as Dynamic Scripting presents an algorithm with several parameters along which one could tune difficulty through manipulations of weights (Spronck et al., 2004). Other work proposes the adaptation of concepts from psychological literature to measure and control the complexity of content created for single player games (Van Lankveld et al., 2008). There is little demonstrated work, however, toward difficulty scaling in symmetrical competitive games with well-established metrics for player skill.

4 Building Decks

While playing the game given an existing deck has so far received the most attention, there are interesting challenges inherent in the domain of building these decks (Fontaine et al., 2019; Bhatt et al., 2018; García-Sánchez et al., 2016, 2018). Human players often build them through experimentation and the evolving meta strategies of expert players, but automatically creating such decks could potentially lead to a richer diversity of meta strategies.

Deciding on which cards to include in a deck often depends on the gameplay mode that a player selects. In the single player modes Tutorial, Adventure, and Missions, the cards are often chosen by the game designer. However, to maintain balance in the multi-player Play mode, players must choose a card format that dictates the types of cards in the decks. Wild cards include any of the over 1900 cards, while Standard cards include the first two sets of cards (i.e. Basic and Classic) and any set released in the last two years. Often Standard decks can be composed from between six and eight sets of cards. Such rotation helps keep the metagame from stagnating.

Another multi-player mode is Arena, where players build their decks one card at a time from a selection of three candidates (i.e. build their draft). Candidates are shuffled after every selection, meaning that players can but are not guaranteed to see their discarded choices again. While previous approaches to computational deckbuilding rely on a priori knowledge of the card pool and good card combinations, Arena mode forces players to make choices in real time without complete knowledge of the available cards. While some approaches could potentially help players draft cards like Bursztein (2016) who predicts cards an opponent will play based on replay data, and Stiegler et al. (2016) who develop a symbolic structure of cards, as of the time of writing there are currently no approaches specifically addressing Arena drafting.

4.1 Transitivity and Dominance of Strategies

Designers spend effort creating and balancing cards; when they introduce these new cards to the community it is important to maintain a degree of consistency with the old cards and decks while simultaneously facilitating the discovery of new deck archetypes. Complete transitivity of a deck space (i.e. a globally optimal deck) would quickly destroy properties of the game, and Hearthstone designers at Blizzard actively adjust properties of older cards or introduce new cards to ensure a variety of winning decks. However, some degree of transitivity is necessary for developing the art of deck building (i.e. randomly built decks should not be on average as good as those crafted with good strategies).

In Hearthstone cards are divided into subsets, currently including Basic, Classic, Expansion, and Adventure. Cards are mostly added through new sets in expansion and adventure. Bhatt et al. (2018) perform one of the first studies of the transitivity of the deck space of Hearthstone cards by holding playing strategies constant and looking only at the 133 cards available in the Basic set available to all players at the start of the game. Preliminary results suggest some degree of transitivity in this space, but from the scope of the experiments the question remains to what degree these decks are transitive and if this transitivity is more or less present in different card sets.
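One simple way to probe how transitive a set of decks is, given pairwise results, is to look for rock-paper-scissors cycles among triples of decks. The sketch below is an illustrative check of that kind, not the analysis performed by Bhatt et al. (2018); the win-rate matrix format, the 0.5 threshold for "beats", and the function names are assumptions.

```python
from itertools import combinations


def beats(win_rate, i, j) -> bool:
    """Deck i is said to beat deck j if it wins more than half their games."""
    return win_rate[i][j] > 0.5


def cyclic_triples(win_rate):
    """Return triples (a, b, c) where a beats b, b beats c, and c beats a."""
    cycles = []
    for i, j, k in combinations(range(len(win_rate)), 3):
        if beats(win_rate, i, j) and beats(win_rate, j, k) and beats(win_rate, k, i):
            cycles.append((i, j, k))
        elif beats(win_rate, i, k) and beats(win_rate, k, j) and beats(win_rate, j, i):
            cycles.append((i, k, j))
    return cycles


# win_rate[i][j] is the fraction of games deck i wins against deck j.
ordered = [[0.5, 0.6, 0.7],
           [0.4, 0.5, 0.6],
           [0.3, 0.4, 0.5]]
rock_paper_scissors = [[0.5, 0.6, 0.4],
                       [0.4, 0.5, 0.6],
                       [0.6, 0.4, 0.5]]
print(cyclic_triples(ordered))              # []
print(cyclic_triples(rock_paper_scissors))  # [(0, 1, 2)]
```

A deck space with no cyclic triples admits a strict ranking, while a growing share of cyclic triples indicates that which deck is best depends increasingly on the opponent.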

4.2 Deck Analysis: Mapping the Deck Strategy Space

Figure 1: Example Search Spaces. (a) Space of Decks in an ES. (b) Space of Decks in MAP-Elites. Distributions of ten thousand decks generated with a standard (1 + λ)-ES evolutionary strategy (shown in a) and the MAP-Elites with Sliding Boundaries algorithm (shown in b) are plotted along two sample behavior metrics. Decks with the highest fitness are colored darker shades of blue. The x-axis is a measure called strategy alignment, which is a measure of how well the deck performance aligns with the player strategy. The y-axis is a measure of the number of turns taken in 200 games. See Fontaine et al. (2019) for more details.

At the heart of evolutionary computation for deckbuilding is the idea that cards appearing in decks that perform well are more likely to be represented in future decks. As in many optimization algorithms, including evolutionary strategies, search is biased toward one or several performance metrics. Examples of such performance metrics are win rate (García-Sánchez et al., 2016) or the difference in health between players (Bhatt et al., 2018; Fontaine et al., 2019; de Mesentier Silva et al., 2019). In single objective optimization, the idea is that the search space will converge to a relatively small yet powerful space of near-optimal decks (shown in figure 1a).
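The single-objective search described above can be illustrated with a minimal elitist evolution strategy in which one parent deck produces several mutated offspring per generation and the best candidate survives. This is a generic sketch under loud assumptions: decks are lists of card indices with no copy limits or class restrictions, and `toy_fitness` is a placeholder standing in for a metric such as win rate or health difference measured over simulated games.

```python
import random

DECK_SIZE = 30


def mutate(deck, pool_size, rng):
    """Return a copy of the deck with one randomly chosen slot replaced."""
    child = list(deck)
    child[rng.randrange(len(child))] = rng.randrange(pool_size)
    return child


def one_plus_lambda_es(fitness, pool_size, lam=8, generations=200, seed=0):
    """Evolve a single deck: each generation, keep the best of parent and offspring."""
    rng = random.Random(seed)
    parent = [rng.randrange(pool_size) for _ in range(DECK_SIZE)]
    parent_fit = fitness(parent)
    for _ in range(generations):
        offspring = [mutate(parent, pool_size, rng) for _ in range(lam)]
        best = max(offspring, key=fitness)
        best_fit = fitness(best)
        if best_fit >= parent_fit:  # elitist: fitness never decreases
            parent, parent_fit = best, best_fit
    return parent, parent_fit


def toy_fitness(deck):
    """Placeholder objective (prefers low card indices); not a game metric."""
    return -sum(deck)


deck, fit = one_plus_lambda_es(toy_fitness, pool_size=50)
print(len(deck), fit)
```

Because only one lineage is kept, repeated runs of a search like this tend to end up in a small region of deck space, which is the convergence behavior the single-objective case is expected to show.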

While the convergence in figure 1a may at first seem to suggest the decks with higher x-values and lower y-values tend to perform best, figure 1b shows a variety of concentrations of highest-performing decks found with an algorithm that promotes deck diversity. Called MAP-Elites (Cully et al., 2015; Mouret and Clune, 2015), the algorithm builds a map of the best individuals and stores them if they also have unique behavior vectors. In figure 1 the x-axis is a feature called strategy alignment and the y-axis represents the total number of turns over 200 games, but the number and type of these behaviors is theoretically limitless. Instead of investigating optimal behaviors of the high performing decks, we can look at how well decks tend to perform with different behaviors (Fontaine et al., 2019).
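The map-building loop of MAP-Elites can be sketched as follows. This is the basic fixed-grid form of the algorithm, not the Sliding Boundaries variant used for figure 1b, and the toy domain is entirely hypothetical: decks are lists of mana costs, and the fitness and behavior metrics are placeholders for values that would come from simulated games.

```python
import random


def map_elites(evaluate, random_solution, mutate, cell_of,
               iterations=2000, n_init=50, seed=0):
    """Keep the fittest solution found so far in each behavior cell."""
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, solution)

    def try_insert(solution):
        fitness, behavior = evaluate(solution)
        cell = cell_of(behavior)
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, solution)

    for _ in range(n_init):                       # seed the map randomly
        try_insert(random_solution(rng))
    for _ in range(iterations):                   # mutate a random elite
        _, parent = rng.choice(list(archive.values()))
        try_insert(mutate(parent, rng))
    return archive


# --- Toy stand-ins (assumptions, not Hearthstone metrics) ---
def random_deck(rng):
    return [rng.randint(0, 10) for _ in range(30)]


def mutate_deck(deck, rng):
    child = list(deck)
    child[rng.randrange(len(child))] = rng.randint(0, 10)
    return child


def evaluate_deck(deck):
    fitness = sum(1 for cost in deck if cost % 2 == 0)   # placeholder objective
    behavior = (sum(deck) / len(deck), len(set(deck)))   # placeholder behaviors
    return fitness, behavior


def cell_of(behavior):
    avg_cost, distinct_costs = behavior
    return (int(avg_cost), distinct_costs // 3)          # fixed grid bins


archive = map_elites(evaluate_deck, random_deck, mutate_deck, cell_of)
print(len(archive), "cells filled")
```

Unlike a single-lineage search, the result is a collection of decks spread across the behavior grid, so performance can be compared between regions rather than read off a single optimum.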

4.3 Deck Analysis: Identifying Cores, Weaknesses, and Strategy

In collectible card games, it is common for players to build decks that contain particular core cards. Often, the rest of the cards in these decks support, strengthen, defend, or maximize the effectiveness of the core. In Hearthstone, examples of core cards include C'Thun, which features a set of cards specifically designed to make C'Thun stronger. A system capable of analyzing decks and identifying their cores is useful in validating a player's design around their preferred cards and generating more focused suggestions toward better executing on the deck's theme or core. Some tools like Archetypes, developed by the Hearthsim community, can help players identify such core cards and sets. However, in combination with powerful gameplaying strategies, such methods could be extended beyond what is currently popular in the metagame.

Another related topic for automated deck analysis involves identifying potential weaknesses of, or counters to, a deck. In a robust metagame, no single card or deck will be entirely impervious to some sort of check or counter. For example, C’Thun is a powerful core card. However, it can be countered by effects that force a player to discard C’Thun without playing it. Such effects result in a great deal of wasted effort for the player. A system that can identify potential weaknesses of a deck or its core can be useful for systems that generate opponent decks against players, or for tutorial generation systems that help players learn about the intricacies of the game’s competitive meta.

While there are some approaches to building decks compatible with a specific agent or playstyle (Bhatt et al., 2018; García-Sánchez et al., 2016, 2018; Fontaine et al., 2019), fewer approaches identify the most effective agent, playstyle, or strategy given a specific deck. This may be useful when players or agents must play with decks they have limited control over, such as in Tavern Brawl. It is particularly useful for less experienced players, who may be inclined to build decks with the starter cards they have, but lack an understanding of how to best play their cards. A rudimentary version of this feature can be found in several games. For example, Yu-Gi-Oh! 5D’s 2011 offered a basic implementation that indicated what the playstyle of the deck appeared to be and identified several possible counters that players may need to account for.

5 Assisting Players

Another class of problems relates to building systems that assist players. While these problems are related to AI for game playing and deck building, and may be partly addressed with similar methods, they are not identical.

5.1 Deck Building Assistance

Building complete decks is an interesting AI challenge and can facilitate game testing, but most players need help completing decks. Assume a player has some favorite cards (e.g. C’Thun, Knife Juggler, Hex), and wants AI-based help finding a good balance of additional cards to make a complete and effective deck. One simple approach could be looking at the distribution of the mana costs of cards already included in the deck, and suggesting new cards with the appropriate mana costs to balance the distribution; something like this recommendation system already exists in the Hearthstone client. Other ad-hoc solutions help players check card tier rankings for Arena Drafts. But those cards are not necessarily good complements to the cards already chosen. How could we do better?
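The mana-cost heuristic mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the target curve is a hypothetical set of desired shares per mana cost, the function name `suggest_costs` is invented here, and nothing is known about how the Hearthstone client actually implements its recommendations.

```python
from collections import Counter

def suggest_costs(deck_costs, target_curve, deck_size=30):
    """Return mana costs to add, filling the most under-represented cost first.

    deck_costs   -- mana costs of the cards already chosen
    target_curve -- hypothetical desired share of the deck for each mana cost
    """
    have = Counter(deck_costs)
    suggestions = []
    for _ in range(deck_size - len(deck_costs)):
        total = len(deck_costs) + len(suggestions) + 1
        # Pick the cost whose current count is furthest below its target count.
        cost = max(target_curve, key=lambda c: target_curve[c] * total - have[c])
        have[cost] += 1
        suggestions.append(cost)
    return suggestions

# A player has picked three 1-mana cards for a (shortened) 6-card deck.
flat_curve = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
to_add = suggest_costs([1, 1, 1], flat_curve, deck_size=6)
```

Such a heuristic only balances the curve; it says nothing about whether the suggested cards complement the ones already chosen, which is the limitation the paragraph above points out.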

Luckily, there are many approaches to recommending items, perhaps because of the importance of recommendation for e-commerce. Though the problem of recommending new books or clothes to an online shopper differs from that of recommending cards for deck building, some approaches may be transferable; it is also likely that some algorithmic inventions from work on deck building assistance would carry over to other kinds of recommendation. As a potential starting point for a deck building assistance system, one might consider using the Apriori algorithm (Agrawal and Srikant, 1994) on a data set of high-quality decks, as collected from human players or deck-building algorithms (Fontaine et al., 2019). By mining the co-occurrences of cards in high-quality decks, the Apriori algorithm would find association rules of the form “if you have card A and card B in your deck, you might want to look at card Q (because 34% of decks that have cards A and B also have Q)”. That is, if you like cards A and B, you may want to consider powerful decks with both. Now, it might be more useful to recommend a general class of cards rather than a particular card, and to give a better explanation, such as “you need a few minions with Taunt because you have many vulnerable minions”. What algorithm can give us this kind of advice? This is a fertile research problem.
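As an illustration of the kind of rule described above, the sketch below mines rules with a two-card antecedent and a one-card consequent from a tiny, made-up set of decks. It is a simplified stand-in rather than the full Apriori algorithm, which prunes candidate itemsets level by level and handles antecedents of any size. The function name `pair_rules`, the thresholds, and the decks are all assumptions for the example.

```python
from collections import Counter
from itertools import combinations

def pair_rules(decks, min_support=0.4, min_confidence=0.6):
    """Return rules ((a, b), q, confidence): decks with a and b often contain q."""
    n = len(decks)
    pair_counts = Counter()
    triple_counts = Counter()
    for deck in decks:
        cards = sorted(set(deck))
        pair_counts.update(combinations(cards, 2))
        triple_counts.update(combinations(cards, 3))
    rules = []
    for triple, t_count in triple_counts.items():
        for q in triple:
            pair = tuple(c for c in triple if c != q)  # stays sorted
            p_count = pair_counts[pair]
            support = p_count / n            # how common the pair is
            confidence = t_count / p_count   # share of those decks that also hold q
            if support >= min_support and confidence >= min_confidence:
                rules.append((pair, q, confidence))
    return sorted(rules, key=lambda r: (-r[2], r[0], r[1]))

# Made-up decks over abstract card names.
decks = [
    {"A", "B", "Q"},
    {"A", "B", "Q", "X"},
    {"A", "B", "X"},
    {"A", "Y", "Q"},
    {"B", "Y"},
]
rules = pair_rules(decks)
```

Each returned rule carries its confidence, which is the percentage a recommender could show to the player as the justification for a suggestion.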

5.2 Gameplay Assistance

The relationship of gameplay assistance to game-playing algorithms is similar to that of deck building assistance’s relationship to deck building algorithms. What we are looking for here are systems that can help players play the game, for example by giving them feedback on how they are doing at the moment, suggesting what move to make next, or proposing a general strategy in response to observed play (e.g. “the opponent seems to be going for rush, focus on taking out their minions”).

One simple form of gameplay assistance could be a system that displays the winning probability at any given state, as calculated e.g. by Monte Carlo simulations, or by a win chance estimator trained on either simulations or logs of human games. Such a win chance estimator could also double as a state value function for a search-based approach to playing Hearthstone. The same approach, training on human or machine gameplay logs, would also work for constructing an action recommender. Win-rate predictors for Hearthstone have been the focus of previous research (Jakubik, 2018), driven especially by the AAIA Data Mining Competition (Janusz et al., 2018). In general, a good starting point for this type of gameplay assistance system would be a game-playing algorithm; a significant research challenge, though, is which information to present to the player and how.
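The Monte Carlo variant can be sketched as follows. The game here is a deliberately tiny stand-in, not Hearthstone: two players alternately deal a random amount of damage drawn from a fixed tuple, and the state is just the two health totals plus those tuples. A real system would replace `simulate` with rollouts in a Hearthstone simulator and would have to sample the hidden information as well.

```python
import random

def simulate(my_health, opp_health, my_damage, opp_damage, rng):
    """Play one random continuation of the toy game; we act first.

    Damage values must be positive so that every rollout terminates.
    """
    while True:
        opp_health -= rng.choice(my_damage)
        if opp_health <= 0:
            return True
        my_health -= rng.choice(opp_damage)
        if my_health <= 0:
            return False

def win_probability(state, n_rollouts=2000, seed=0):
    """Estimate the chance of winning from `state` as the share of won rollouts."""
    rng = random.Random(seed)
    wins = sum(simulate(*state, rng) for _ in range(n_rollouts))
    return wins / n_rollouts

# State: (my health, opponent health, my damage options, opponent damage options).
even_state = (10, 10, (1, 2, 3), (1, 2, 3))
estimate = win_probability(even_state)
```

The same function could serve as a crude state value function inside a search-based player, at the cost of many rollouts per evaluated state.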

Another kind of gameplay assistance was suggested in the paper I am a legend: hacking Hearthstone using statistical learning methods (Bursztein, 2016). By training on game logs, the system was able to achieve high accuracy on predicting the next card played by the opponent, as high as 95% in early game rounds and around 50% on average. The system was labeled “game breaking” by Blizzard, and the creator agreed not to make his system publicly available. This raises questions about exactly how much, and which type, of gameplay assistance we want to have available.

6 Assisting Designers

Hearthstone is designed and developed by some of the world’s foremost experts on online games, and is in constant production by a team that intimately knows the game; new card sets, adventures and balance updates are regularly published. While it may be preposterous to suggest creating new tools for assisting designers who already know Hearthstone so well, the challenges involved in designing, developing and producing such a game are similar to the design and development tasks in many other games. Therefore Hearthstone could be a versatile testbed for research on AI-assisted game design tools.

While the related research questions are less popular than game playing, there is active research on how to best use AI methods to assist game designers and developers. The paradigm of mixed-initiative co-creativity (Yannakakis et al., 2014) envisions human designers creating games or other interactive systems in dialogue with an AI system. In it, both the human and the software can contribute and give feedback on what is being created. Examples of systems attempting to enable mixed-initiative co-creativity are platform game editors that can regenerate parts of the level while ensuring playability (Smith et al., 2011), physics-based puzzle game editors that solve the puzzle for you (Shaker et al., 2013), strategy game map generators that give feedback on balance and suggestions for how to improve maps (Liapis et al., 2013), and recommender systems that recommend new game elements based on machine-learned models of other games (Machado et al., 2016) or the same game (Guzdial et al., 2018). The following sections describe how these approaches relate to generating cards in Hearthstone, balancing gameplay, and generating tutorials.

6.1 Generating Cards

Hearthstone has over 1900 different cards, whereas Magic: The Gathering has more than ten thousand unique cards. Creating a genuinely new card (not essentially identical to some card that is already there) that adds value to the game and does not destroy the game’s balance is a challenge. Could we use AI methods to help us here, for example by generating suggestions for new cards? Summerville and Mateas (2016) describe such a system for Magic: The Gathering. The authors trained a sequence-to-sequence network on a large dataset of Magic cards, and used the trained model to generate new cards, including both statistics and descriptive text. The system produces cards that generally have recognizable statistics and grammatical descriptive text, but are often unbalanced, inconsistent or in other ways game-breaking. In many cases, however, these cards can easily be turned into playable cards with a little human intervention.

Ling et al. (2016) focus on the problem of generating valid code from natural language descriptions of Hearthstone and Magic cards. This is in a sense the other side of the coin, and would be needed as part of a functioning card generation system, so that artificial agents can use the generated code to playtest new cards.

6.2 Balancing the Game

Creating a card is one thing, but it is difficult to create a new card while maintaining balance in gameplay. Many cards that seem to have reasonable mana cost and attack and defense values may, when combined with certain other cards, enable game-breakingly strong combos. Balancing a collectible card game is a major undertaking, and is typically done manually through extensive human playtesting and in response to observed player behavior. It could be seen as an optimization process, where the desired outcome is to have a reasonably low range of usefulness of individual cards, or alternatively to have a large range of useful card combinations and low variance between the value of these. However, formulated in this way, the optimization problem is almost certainly intractable for games with large deck spaces such as Hearthstone (García-Sánchez et al., 2018). For somewhat simpler card-based games such as Dominion, evolutionary approaches to this problem can work well, as demonstrated by Mahlmann et al. (2012). In Hearthstone, a feasible approach to computational balancing might be to search for decks that involve the proposed new card and nerf (lower the stats of) the card if a deck is found that is too strong.
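The search-and-nerf loop suggested above might look roughly like the sketch below. It rests on heavy simplifications, all of them assumptions: cards are dictionaries with a single `power` stat and a `combo` flag, `deck_score` is a made-up stand-in for a win rate measured by simulated games, and the deck search is exhaustive over a tiny pool where a real system would need a heuristic search such as an evolutionary algorithm.

```python
from itertools import combinations

def deck_score(deck):
    """Stand-in for a simulated win rate: raw power plus a combo bonus."""
    score = sum(card["power"] for card in deck)
    if any(card["name"] == "new" for card in deck):
        # The new card is assumed to gain extra value from each combo piece.
        score += 3 * sum(card["combo"] for card in deck)
    return score

def strongest_deck(new_card, pool, deck_size):
    """Exhaustively search all decks that contain the new card."""
    candidates = (list(rest) + [new_card]
                  for rest in combinations(pool, deck_size - 1))
    return max(candidates, key=deck_score)

def nerf_until_balanced(new_card, pool, deck_size, max_score):
    """Lower the new card's power until no deck with it exceeds max_score."""
    card = dict(new_card)  # work on a copy, leave the proposal untouched
    while (card["power"] > 0 and
           deck_score(strongest_deck(card, pool, deck_size)) > max_score):
        card["power"] -= 1
    return card

stats = [(3, 0), (4, 0), (2, 1), (1, 1), (5, 0), (2, 0)]
pool = [{"name": f"c{i}", "power": p, "combo": k} for i, (p, k) in enumerate(stats)]
proposal = {"name": "new", "power": 6, "combo": 0}
balanced = nerf_until_balanced(proposal, pool, deck_size=3, max_score=13)
```

In this toy setting the strongest deck pairs the new card with a combo piece rather than with the two highest-power cards, which is exactly the kind of interaction that makes balancing by inspection of individual stats unreliable.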

6.3 Generating Tutorials

Tutorials are an important part of modern video games, which are primarily learned through play. However, constructing effective and accessible tutorials is complex and labor-intensive, and AI methods could help reduce this burden for game developers. Some recent approaches address automatically constructing game tutorials. For example, the AtDelfi system analyzes the mechanics of arcade-style games to generate videos and written instructions that teach players how to play the game (Green et al., 2018). Hearthstone could serve as a suitable testbed for algorithms that generate challenge problems that teach the mechanics of the game and useful heuristics, or demonstrations of such mechanics and heuristics. Examples of desirable results would be the introductory puzzles present in the card games Eternal (LLC, 2016) and Faeria (SA, 2017).

7 Conclusion

We have described a large number of research and application challenges for AI arising from a single game, Blizzard’s Hearthstone. These applications span almost the entire field of artificial intelligence and games, and the reader may wish to compare the particular challenges to those outlined by Yannakakis and Togelius (2018). At the same time, the nature of this particular game (such as its hidden information, adversarial nature, discrete state and action space, high and variable branching factor, stochasticity, relative ease of forward modeling, and separation of deck building from playing) shapes the particular form of the challenges it poses. As such, the challenges are rather different from those posed by some of the dominant game-based AI benchmarks, such as Atari/ALE, Doom, Chess, and Go. Notably, the dominant approaches for playing these games (e.g. training deep neural networks with reinforcement learning to play based on pixel inputs, or searching ahead in the game tree with MCTS) are missing from the suggestions above. While it might be possible to use Deep Q-learning to learn to play based on pixel inputs, it would complicate matters so much as to be almost nonsensical when much better representations of the game state are available; and it is very hard to use MCTS effectively beyond a single turn given the partial observability of the game. But the diverse challenges posed by Hearthstone are no less interesting from an AI perspective. This underscores the need to choose the game to use for your AI benchmark carefully, and to play and think about that game to understand the challenges it poses.

Thanks to Matthew Fontaine, Aditya Bhatt, Connor Watson, and Param Trivedi for their contributions to research that has informed this paper. Additionally, we thank Andy Nealen and Alex Zook for useful discussions. Finally, we are happy that all the hours we spent playing the game could contribute to something useful, or at least publishable.


  • Agrawal and Srikant (1994) Agrawal, R. and Srikant, R. (1994). Fast algorithms for mining association rules. In Proc. 20th int. conf. very large data bases, VLDB, volume 1215, pages 487–499.
  • Aponte et al. (2009) Aponte, M.-V., Levieux, G., and Natkin, S. (2009). Scaling the level of difficulty in single player video games.
  • Bellemare et al. (2013) Bellemare, M. G., Naddaf, Y., Veness, J., and Bowling, M. (2013). The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47:253–279.
  • Bhatt et al. (2018) Bhatt, A., Lee, S., de Mesentier Silva, F., Watson, C. W., Togelius, J., and Hoover, A. K. (2018). Exploring the hearthstone deck space. In Proceedings of the 13th International Conference on the Foundations of Digital Games, page 18. ACM.
  • Blizzard (2014) Blizzard (2014). Hearthstone.
  • Browne (2011) Browne, C. (2011). Yavalath. In Evolutionary Game Design, pages 75–85. Springer.
  • Bursztein (2016) Bursztein, E. (2016). I am a legend: Hacking hearthstone using statistical learning methods. In CIG, pages 1–8.
  • Cai and Wunsch (2007) Cai, X. and Wunsch, D. C. (2007). Computer go: A grand challenge to ai. In Challenges for Computational Intelligence, pages 443–465. Springer.
  • Coulom (2006) Coulom, R. (2006). Efficient selectivity and backup operators in monte-carlo tree search. In International conference on computers and games, pages 72–83. Springer.
  • Cully et al. (2015) Cully, A., Clune, J., Tarapore, D., and Mouret, J.-B. (2015). Robots that can adapt like animals. Nature, 521(7553):503.
  • de Mesentier Silva et al. (2019) de Mesentier Silva, F., Canaan, R., Lee, S., Togelius, J., and Hoover, A. K. (2019). Evolving the hearthstone meta. In IEEE Conference on Games (CoG).
  • de Mesentier Silva et al. (2016) de Mesentier Silva, F., Isaksen, A., Togelius, J., and Nealen, A. (2016). Generating heuristics for novice players. In 2016 IEEE Conference on Computational Intelligence and Games (CIG), pages 1–8. IEEE.
  • de Mesentier Silva et al. (2018a) de Mesentier Silva, F., Togelius, J., Lantz, F., and Nealen, A. (2018a). Generating beginner heuristics for simple texas hold’em. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 181–188. ACM.
  • de Mesentier Silva et al. (2018b) de Mesentier Silva, F., Togelius, J., Lantz, F., and Nealen, A. (2018b). Generating novice heuristics for post-flop poker. In 2018 IEEE Conference on Computational Intelligence and Games (CIG), pages 1–8. IEEE.
  • Dockhorn et al. (2018) Dockhorn, A., Frick, M., Akkaya, Ü., and Kruse, R. (2018). Predicting opponent moves for improving hearthstone ai. In International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, pages 621–632. Springer.
  • Drachen et al. (2012) Drachen, A., Sifa, R., Bauckhage, C., and Thurau, C. (2012). Guns, swords and data: Clustering of player behavior in computer games in the wild. In Computational Intelligence and Games (CIG), 2012 IEEE Conference on, pages 163–170. IEEE.
  • Elias et al. (2012) Elias, G. S., Garfield, R., and Gutschera, K. R. (2012). Characteristics of games. MIT Press.
  • Ensmenger (2012) Ensmenger, N. (2012). Is chess the drosophila of artificial intelligence? a social history of an algorithm. Social Studies of Science, 42(1):5–30.
  • Fontaine et al. (2019) Fontaine, M. C., Lee, S., Soros, L. B., Silva, F. D. M., Togelius, J., and Hoover, A. K. (2019). Mapping hearthstone deck spaces through map-elites with sliding boundaries.
  • García-Sánchez et al. (2018) García-Sánchez, P., Tonda, A., Mora, A. M., Squillero, G., and Merelo, J. J. (2018). Automated playtesting in collectible card games using evolutionary algorithms: A case study in hearthstone. Knowledge-Based Systems, 153:133–146.
  • García-Sánchez et al. (2016) García-Sánchez, P., Tonda, A., Squillero, G., Mora, A., and Merelo, J. J. (2016). Evolutionary deckbuilding in hearthstone. In Computational Intelligence and Games (CIG), 2016 IEEE Conference on, pages 1–8. IEEE.
  • Gobet and Simon (1996) Gobet, F. and Simon, H. A. (1996). The roles of recognition processes and look-ahead search in time-constrained expert problem solving: Evidence from grand-master-level chess. Psychological science, 7(1):52–55.
  • Góes et al. (2017) Góes, L. F. W., Da Silva, A. R., Saffran, J., Amorim, A., França, C., Zaidan, T., Olímpio, B. M., Alves, L. R., Morais, H., Luana, S., et al. (2017). Honingstone: Building creative combos with honing theory for a digital card game. IEEE Transactions on Computational Intelligence and AI in Games, 9(2):204–209.
  • Green et al. (2018) Green, M. C., Khalifa, A., Barros, G. A., Machado, T., Nealen, A., and Togelius, J. (2018). Atdelfi: automatically designing legible, full instructions for games. In Proceedings of the 13th International Conference on the Foundations of Digital Games, page 17. ACM.
  • Guzdial et al. (2018) Guzdial, M., Liao, N., and Riedl, M. (2018). Co-creative level design via machine learning. arXiv preprint arXiv:1809.09420.
  • Hodge et al. (2019) Hodge, V., Sephton, N., Devlin, S., Cowling, P., Goumagias, N., Shao, J., Purvis, K., Cabras, I., Fernandes, K., and Li, F. (2019). How the business model of customisable card games influences player engagement. IEEE Transactions on Games.
  • Holmgard et al. (2018) Holmgard, C., Green, M. C., Liapis, A., and Togelius, J. (2018). Automated playtesting with procedural personas with evolved heuristics. IEEE Transactions on Games.
  • Holmgård et al. (2014) Holmgård, C., Liapis, A., Togelius, J., and Yannakakis, G. N. (2014). Evolving personas for player decision modeling. In Computational Intelligence and Games (CIG), 2014 IEEE Conference on, pages 1–8. IEEE.
  • Jakubik (2018) Jakubik, J. (2018). A neural network approach to hearthstone win rate prediction. In 2018 Federated Conference on Computer Science and Information Systems (FedCSIS), pages 185–188. IEEE.
  • Janusz et al. (2017) Janusz, A., Świechowski, M., and Tajmajer, T. (2017). Helping ai to play hearthstone: Aaia’17 data mining challenge. arXiv preprint arXiv:1708.00730.
  • Janusz et al. (2018) Janusz, A., Tajmajer, T., Świechowski, M., Grad, Ł., Puczniewski, J., and Ślȩzak, D. (2018). Toward an intelligent hs deck advisor: Lessons learned from AAIA’18 data mining competition. In 2018 Federated Conference on Computer Science and Information Systems (FedCSIS), pages 189–192. IEEE.
  • Kasparov (2017) Kasparov, G. (2017). Deep Thinking: Where Machine Intelligence Ends and Human Creativity Begins. PublicAffairs.
  • Liapis et al. (2013) Liapis, A., Yannakakis, G. N., and Togelius, J. (2013). Sentient sketchbook: Computer-aided game level authoring. In FDG, pages 213–220.
  • Ling et al. (2016) Ling, W., Grefenstette, E., Hermann, K. M., Kočiskỳ, T., Senior, A., Wang, F., and Blunsom, P. (2016). Latent predictor networks for code generation. arXiv preprint arXiv:1603.06744.
  • LLC (2016) LLC, D. W. D. (2016). Eternal.
  • Lucas (2009) Lucas, S. M. (2009). Computational intelligence and ai in games: a new ieee transactions. IEEE Transactions on Computational Intelligence and AI in Games, 1(1):1–3.
  • Machado et al. (2016) Machado, T., Bravi, I., Wang, Z., Nealen, A., and Togelius, J. (2016). Shopping for game mechanics.
  • Mahlmann et al. (2012) Mahlmann, T., Togelius, J., and Yannakakis, G. N. (2012). Evolving card sets towards balancing dominion. In Evolutionary computation (CEC), 2012 IEEE congress on, pages 1–8. IEEE.
  • Mnih et al. (2015) Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540):529.
  • Mouret and Clune (2015) Mouret, J.-B. and Clune, J. (2015). Illuminating search spaces by mapping elites. arXiv preprint arXiv:1504.04909.
  • of the Coast (1993) of the Coast, W. (1993). Magic: The gathering.
  • OpenAI (2018) OpenAI (2018). Openai five.
  • SA (2017) SA, A. (2017). Faeria.
  • Santos et al. (2017) Santos, A., Santos, P. A., and Melo, F. S. (2017). Monte carlo tree search experiments in hearthstone. In Computational Intelligence and Games (CIG), 2017 IEEE Conference on, pages 272–279. IEEE.
  • Shaker et al. (2013) Shaker, N., Shaker, M., and Togelius, J. (2013). Ropossum: An authoring tool for designing, optimizing and solving cut the rope levels. In AIIDE.
  • Shannon (1950) Shannon, C. E. (1950). Xxii. programming a computer for playing chess. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 41(314):256–275.
  • Silver et al. (2016) Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al. (2016). Mastering the game of go with deep neural networks and tree search. Nature, 529(7587):484.
  • Smith et al. (2011) Smith, G., Whitehead, J., and Mateas, M. (2011). Tanagra: Reactive planning and constraint solving for mixed-initiative level design. IEEE Transactions on Computational Intelligence and AI in Games, 3(3):201–215.
  • Spronck et al. (2004) Spronck, P., Sprinkhuizen-Kuyper, I., and Postma, E. (2004). Difficulty scaling of game ai. In Proceedings of the 5th International Conference on Intelligent Games and Simulation (GAME-ON 2004), pages 33–37.
  • Stiegler et al. (2017) Stiegler, A., Dahal, K., Maucher, J., and Livingstone, D. (2017). Symbolic reasoning for hearthstone. IEEE Transactions on Computational Intelligence and AI in Games.
  • Stiegler et al. (2016) Stiegler, A., Messerschmidt, C., Maucher, J., and Dahal, K. (2016). Hearthstone deck-construction with a utility system. In Software, Knowledge, Information Management & Applications (SKIMA), 2016 10th International Conference on, pages 21–28. IEEE.
  • Summerville and Mateas (2016) Summerville, A. J. and Mateas, M. (2016). Mystical tutor: A magic: The gathering design assistant via denoising sequence-to-sequence learning. In Twelfth Artificial Intelligence and Interactive Digital Entertainment Conference.
  • Świechowski et al. (2018) Świechowski, M., Tajmajer, T., and Janusz, A. (2018). Improving hearthstone ai by combining mcts and supervised learning algorithms. arXiv preprint arXiv:1808.04794.
  • Turing (1953) Turing, A. M. (1953). Digital computers applied to games. Faster than Thought.
  • Van Lankveld et al. (2008) Van Lankveld, G., Spronck, P., and Rauterberg, M. (2008). Difficulty scaling through incongruity. In AIIDE.
  • Vinyals et al. (2019) Vinyals, O., Babuschkin, I., Chung, J., Mathieu, M., Jaderberg, M., Czarnecki, W. M., Dudzik, A., Huang, A., Georgiev, P., Powell, R., Ewalds, T., Horgan, D., Kroiss, M., Danihelka, I., Agapiou, J., Oh, J., Dalibard, V., Choi, D., Sifre, L., Sulsky, Y., Vezhnevets, S., Molloy, J., Cai, T., Budden, D., Paine, T., Gulcehre, C., Wang, Z., Pfaff, T., Pohlen, T., Wu, Y., Yogatama, D., Cohen, J., McKinney, K., Smith, O., Schaul, T., Lillicrap, T., Apps, C., Kavukcuoglu, K., Hassabis, D., and Silver, D. (2019). AlphaStar: Mastering the Real-Time Strategy Game StarCraft II.
  • Ward and Cowling (2009) Ward, C. D. and Cowling, P. I. (2009). Monte carlo search applied to card selection in magic: The gathering. In 2009 IEEE Symposium on Computational Intelligence and Games, pages 9–16. IEEE.
  • Yannakakis et al. (2014) Yannakakis, G. N., Liapis, A., and Alexopoulos, C. (2014). Mixed-initiative co-creativity. In FDG.
  • Yannakakis and Togelius (2018) Yannakakis, G. N. and Togelius, J. (2018). Artificial Intelligence and Games. Springer.
  • Yee (2006) Yee, N. (2006). Motivations for play in online games. CyberPsychology & behavior, 9(6):772–775.
  • Zhang and Buro (2017) Zhang, S. and Buro, M. (2017). Improving hearthstone ai by learning high-level rollout policies and bucketing chance node events. In 2017 IEEE Conference on Computational Intelligence and Games (CIG), pages 309–316. IEEE.
  • Zopf (2015) Zopf, M. (2015). A comparison between the usage of flat and structured game trees for move evaluation in hearthstone. Master’s thesis, Technische Universität Darmstadt, Darmstadt, Germany.