# On Relevant Equilibria in Reachability Games

We study multiplayer reachability games played on a finite directed graph equipped with target sets, one for each player. In those reachability games, it is known that there always exists a Nash equilibrium (NE) and a subgame perfect equilibrium (SPE). But sometimes several equilibria may coexist such that in one equilibrium no player reaches his target set whereas in another one several players reach it. It is thus very natural to identify "relevant" equilibria. In this paper, we consider different notions of relevant equilibria including Pareto optimal equilibria and equilibria with high social welfare. We provide complexity results for various related decision problems.

## 1 Introduction

Two-player zero-sum games played on graphs are commonly used to model reactive systems where a system interacts with its environment [16]. In such a setting the system wants to achieve a goal (to respect a certain property) and the environment acts in an antagonistic way. The underlying game is defined as follows: the two players are the system and the environment, the vertices of the graph are all the possible configurations in which the system can be, and an infinite path in this graph depicts a possible sequence of interactions between the system and its environment. In such a game, each player chooses a strategy: it is the way he plays given some information about the game and past actions of the other player. Following a strategy for each player results in a play in the game. Finding how the system can ensure that a given property is satisfied amounts to finding, if it exists, a winning strategy for the system in this game. For some situations, this kind of model is too restrictive and a setting with more than two agents such that each of them has his own not necessarily antagonistic objective is more realistic. These games are called multiplayer non zero-sum games. In this setting, the solution concept of winning strategy is not suitable anymore and different notions of equilibria can be studied.

In this paper, we focus on Nash equilibrium (NE) [14]: given a strategy for each player, no player has an incentive to deviate unilaterally from his strategy. We also consider the notion of subgame perfect equilibrium (SPE) well suited for games played on graphs [15]. We study these two notions of equilibria on reachability games. In reachability games, we equip each player with a subset of vertices of the graph game that he wants to reach. We are interested in both the qualitative and quantitative settings. In the qualitative setting, each player only aims at reaching his target set, unlike the quantitative setting where each player wants to reach his target set as soon as possible.

It is well known that both NEs and SPEs exist in both qualitative and quantitative reachability games. But, equilibria such that no player reaches his target set and equilibria such that some players reach it may coexist. This observation has already been made in [19, 18]. In such a situation, one could prefer the second situation to the first one. In this paper, we study different versions of relevant equilibria.

#### 1.0.1 Contributions

For quantitative reachability games, we focus on the following three kinds of relevant equilibria: constrained equilibria, equilibria optimizing social welfare and Pareto optimal equilibria. For constrained equilibria, we aim at minimizing the cost of each player i.e., the number of steps it takes to reach his target set (Problem 1). For equilibria optimizing social welfare, a player does not only want to minimize his own cost, he is also committed to maximizing the social welfare (Problem 2). For Pareto optimal equilibria, we want to decide if there exists an equilibrium such that the tuple of the costs obtained by players following this equilibrium is Pareto optimal in the set of all the possible costs that players can obtain in the game (Problem 3). We consider the decision variant of Problems 1 and 2; and the qualitative adaptations of the three problems.

Our main contributions are the following. (i) We study the complexity of the three decision problems. Our results gathered with previous works are summarized in Table 1. (ii) We characterize the finite memory that is sufficient to solve the three decision problems. Our results and others from previous works are given in Table 1. (iii) We identify a subclass of reachability games in which there always exists an SPE where each player reaches his target set. (iv) Given a play, we provide a characterization which guarantees that this play is the outcome of an NE. This characterization is based on the values in the associated two-player zero-sum games called coalitional games.

#### 1.0.2 Related work

There are many results on NEs and SPEs played on graphs, we refer the reader to [9] for a survey and an extended bibliography. We here focus on the results directly related to our contributions.

Regarding Problem 1, for NEs, it is shown NP-complete only in the qualitative setting [10]; for SPEs it is shown PSPACE-complete in both the qualitative and quantitative settings in [4, 6, 5]. Notice that in [19], variants of Problem 1 for games with Streett, parity or co-Büchi winning conditions are shown NP-complete and decidable in polynomial time for Büchi.

Regarding Problem 2, in the setting of games played on matrices, deciding the existence of an NE such that the expected social welfare is at most a given threshold is NP-hard [11]. Moreover, in [1] it is shown that deciding the existence of an NE which maximizes the social welfare is undecidable in concurrent games in which a cost profile is associated only with terminal nodes.

Regarding Problem 3, in the setting of zero-sum two-player multidimensional mean-payoff games, the Pareto-curve (the set of maximal thresholds that a player can force) is studied in [2] by giving some properties on the geometry of this set. The authors provide an algorithm to decide if this set intersects a convex set defined by linear inequalities.

Regarding the memory, in [7] it is shown that there always exists an NE with memory at most in quantitative reachability games, without any constraint on the cost of the NE. It is shown in [18] that, in multiplayer games with ω-regular objectives, there exists an SPE with a given payoff if and only if there exists an SPE with the same payoff but with finite memory. Moreover, in [4] it is claimed that it is sufficient to consider strategies with an exponential memory to solve Problem 1 for SPE in qualitative reachability games.

Finally, we can find several kinds of outcome characterizations for Nash equilibria and variants, e.g., in multiplayer games equipped with prefix-linear cost functions and such that the vertices in coalitional games have a value (summarized in [9]), in multiplayer games with prefix-independent Borel objectives [19], in multiplayer games with classical ω-regular objectives (such as reachability) by checking if there exists a play which satisfies an LTL formula [10], in concurrent games [12], etc. Such characterizations are less widespread for subgame perfect equilibria, but one can recover one for quantitative reachability games thanks to a value-iteration procedure [6].

#### 1.0.3 Structure of the paper

Due to the lack of space, we decide to only detail results for quantitative reachability games while results for qualitative reachability games are only summarized in Table 1. In Section 2, we introduce the needed background and define the different studied problems. In Section 3, we identify families of reachability games for which there always exists a relevant equilibrium, for different notions of relevant equilibrium. In Section 4, we provide the main ideas necessary to obtain our complexity results (see Table 1). The detailed proofs for the quantitative reachability setting, together with additional results on qualitative reachability games are provided in the appendices.

## 2 Preliminaries and studied problems

#### 2.0.1 Arena, game and strategies

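An arena is a tuple $\mathcal{A} = (\Pi, V, E, (V_i)_{i \in \Pi})$ such that: (i) $\Pi$ is a finite set of players; (ii) $V$ is a finite set of vertices; (iii) $E \subseteq V \times V$ is a set of edges such that for all $v \in V$ there exists $v' \in V$ such that $(v, v') \in E$ and (iv) $(V_i)_{i \in \Pi}$ is a partition of $V$ between the players.

A play in $\mathcal{A}$ is an infinite sequence of vertices $\rho = \rho_0 \rho_1 \ldots$ such that for all $k \in \mathbb{N}$, $(\rho_k, \rho_{k+1}) \in E$. A history is a finite sequence $h = h_0 h_1 \ldots h_k$ with $k \in \mathbb{N}$ defined similarly. The length $|h|$ of $h$ is the number $k$ of its edges. We denote the set of plays by $\mathrm{Plays}$ and the set of histories by $\mathrm{Hist}$. Moreover, the set $\mathrm{Hist}_i$ is the set of histories such that their last vertex $v$ is a vertex of player $i$, i.e. $v \in V_i$.

Given a play $\rho \in \mathrm{Plays}$ and $k \in \mathbb{N}$, the prefix $\rho_0 \rho_1 \ldots \rho_k$ of $\rho$ is denoted by $\rho_{\leq k}$ and its suffix $\rho_k \rho_{k+1} \ldots$ by $\rho_{\geq k}$. A play $\rho$ is called a lasso if it is of the form $\rho = h\ell^{\omega}$ with $h\ell \in \mathrm{Hist}$. Notice that $\ell$ is not necessarily a simple cycle. The length of a lasso $h\ell^{\omega}$ is the length of $h\ell$.

A game $\mathcal{G} = (\mathcal{A}, (\mathrm{Cost}_i)_{i \in \Pi})$ is an arena equipped with a cost function profile $(\mathrm{Cost}_i)_{i \in \Pi}$ such that for all $i \in \Pi$, $\mathrm{Cost}_i : \mathrm{Plays} \to \mathbb{N} \cup \{+\infty\}$ is a cost function which assigns a cost to each play $\rho$ for player $i$. We also say that the play $\rho$ has cost profile $(\mathrm{Cost}_i(\rho))_{i \in \Pi}$. Given two cost profiles $c, c' \in (\mathbb{N} \cup \{+\infty\})^{|\Pi|}$, we say that $c \leq c'$ if and only if for all $i \in \Pi$, $c_i \leq c'_i$.

An initial vertex $v_0 \in V$ is often fixed, and we call $(\mathcal{G}, v_0)$ an initialized game. A play (resp. a history) of $(\mathcal{G}, v_0)$ is then a play (resp. a history) of $\mathcal{G}$ starting in $v_0$. The set of such plays (resp. histories) is denoted by $\mathrm{Plays}(v_0)$ (resp. $\mathrm{Hist}(v_0)$). The notation $\mathrm{Hist}_i(v_0)$ is used when these histories end in a vertex $v \in V_i$.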

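Given a game $\mathcal{G}$, a strategy for player $i$ is a function $\sigma_i : \mathrm{Hist}_i \to V$. It assigns to each history $hv$, with $v \in V_i$, a vertex $v'$ such that $(v, v') \in E$. In an initialized game $(\mathcal{G}, v_0)$, $\sigma_i$ needs only to be defined for histories starting in $v_0$. We denote by $\Sigma_i$ the set of strategies for Player $i$. A play $\rho$ is consistent with $\sigma_i$ if for all $\rho_k \in V_i$, $\sigma_i(\rho_0 \ldots \rho_k) = \rho_{k+1}$. A strategy $\sigma_i$ is positional if it only depends on the last vertex of the history, i.e., $\sigma_i(hv) = \sigma_i(v)$ for all $hv \in \mathrm{Hist}_i$. It is finite-memory if it can be encoded by a finite-state machine.

A strategy profile is a tuple $\sigma = (\sigma_i)_{i \in \Pi}$ of strategies, one for each player. Given an initialized game $(\mathcal{G}, v_0)$ and a strategy profile $\sigma$, there exists a unique play from $v_0$ consistent with each strategy $\sigma_i$. We call this play the outcome of $\sigma$ and denote it by $\langle \sigma \rangle_{v_0}$. We say that $\sigma$ has cost profile $(\mathrm{Cost}_i(\langle \sigma \rangle_{v_0}))_{i \in \Pi}$.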
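To make the notion of outcome concrete, here is a minimal Python sketch that computes the outcome of a strategy profile in the special case where every strategy is positional. The restriction to positional strategies, the dictionary encoding and the names `outcome_lasso`, `owner` and `strategy` are assumptions made for illustration only; the definitions above allow arbitrary history-dependent strategies. With positional strategies the outcome is always a lasso, returned here as a pair (prefix, cycle).

```python
def outcome_lasso(v0, owner, strategy):
    """Outcome of a positional strategy profile from v0, as (prefix, cycle).

    owner[v]       -- the player who owns vertex v (hypothetical encoding)
    strategy[i][v] -- the successor chosen by player i at his vertex v
    """
    position = {}   # vertex -> index of its first occurrence in the play
    play = []
    v = v0
    # Follow the unique consistent play until a vertex repeats.
    while v not in position:
        position[v] = len(play)
        play.append(v)
        v = strategy[owner[v]][v]
    start = position[v]          # the cycle starts at the repeated vertex
    return play[:start], play[start:]
```

For instance, with three vertices `a`, `b`, `c` where Player 1 owns `a` and `c` and Player 2 owns `b`, the profile that moves `a` to `b`, `b` to `c` and `c` back to `b` yields the prefix `["a"]` followed by the cycle `["b", "c"]`.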

#### 2.0.2 Quantitative reachability games

In this article, we are interested in reachability games: each player has a target set of vertices that he wants to reach.

###### Definition 1

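A quantitative reachability game $\mathcal{G} = (\mathcal{A}, (\mathrm{Cost}_i)_{i \in \Pi}, (F_i)_{i \in \Pi})$ is a game enhanced with a target set $F_i \subseteq V$ for each player $i \in \Pi$ and for all $i \in \Pi$ the cost function $\mathrm{Cost}_i$ is defined as follows: for all $\rho = \rho_0 \rho_1 \ldots \in \mathrm{Plays}$: $\mathrm{Cost}_i(\rho) = k$ if $k \in \mathbb{N}$ is the least index such that $\rho_k \in F_i$ and $\mathrm{Cost}_i(\rho) = +\infty$ if such an index does not exist.

In quantitative reachability games, players have to pay a cost equal to the number of edges until visiting their own target set or $+\infty$ if it is not visited. Thus each player aims at minimizing his cost.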
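As an illustration of this cost function, the following sketch computes the cost of a player along a lasso given as a prefix and a cycle. The function name `reach_cost` and the list/set encoding are assumptions, not notation from the paper.

```python
import math


def reach_cost(prefix, cycle, target):
    """Cost of the lasso prefix . cycle^omega for a player with target set `target`.

    The cost is the least index k such that the k-th vertex of the play lies
    in `target`, or +infinity if the target set is never visited. Scanning a
    single copy of the cycle is enough, since later copies repeat the same
    vertices.
    """
    for k, v in enumerate(list(prefix) + list(cycle)):
        if v in target:
            return k
    return math.inf
```

A player whose target set contains the initial vertex pays 0, and a player whose target set is disjoint from the lasso pays `math.inf`.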

#### 2.0.3 Solution concepts

In the multiplayer game setting, the solution concepts usually studied are equilibria. We recall the concepts of Nash equilibrium and subgame perfect equilibrium.

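Let $\sigma = (\sigma_i)_{i \in \Pi}$ be a strategy profile in an initialized game $(\mathcal{G}, v_0)$. When we highlight the role of player $i$, we denote $\sigma$ by $(\sigma_i, \sigma_{-i})$ where $\sigma_{-i}$ is the profile $(\sigma_j)_{j \in \Pi \setminus \{i\}}$. A strategy $\sigma'_i \neq \sigma_i$ is a deviating strategy of Player $i$, and it is a profitable deviation for him if $\mathrm{Cost}_i(\langle \sigma \rangle_{v_0}) > \mathrm{Cost}_i(\langle \sigma'_i, \sigma_{-i} \rangle_{v_0})$.

The notion of Nash equilibrium is classical: a strategy profile $\sigma$ in an initialized game $(\mathcal{G}, v_0)$ is a Nash equilibrium (NE) if no player has an incentive to deviate unilaterally from his strategy, i.e. no player has a profitable deviation.

###### Definition 2 (Nash equilibrium)

Let $(\mathcal{G}, v_0)$ be an initialized quantitative reachability game. The strategy profile $\sigma$ is an NE if for each $i \in \Pi$ and each deviating strategy $\sigma'_i$ of Player $i$, we have $\mathrm{Cost}_i(\langle \sigma \rangle_{v_0}) \leq \mathrm{Cost}_i(\langle \sigma'_i, \sigma_{-i} \rangle_{v_0})$.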

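When considering games played on graphs, a useful refinement of NE is the concept of subgame perfect equilibrium (SPE) which is a strategy profile that is an NE in each subgame. Formally, given a game $\mathcal{G} = (\mathcal{A}, (\mathrm{Cost}_i)_{i \in \Pi})$, an initial vertex $v_0$, and a history $hv \in \mathrm{Hist}(v_0)$, the initialized game $(\mathcal{G}_{\restriction h}, v)$ such that $\mathcal{G}_{\restriction h} = (\mathcal{A}, (\mathrm{Cost}_{i \restriction h})_{i \in \Pi})$ where $\mathrm{Cost}_{i \restriction h}(\rho) = \mathrm{Cost}_i(h\rho)$ for all $i \in \Pi$ and $\rho \in \mathrm{Plays}$ is called a subgame of $(\mathcal{G}, v_0)$. Notice that $(\mathcal{G}, v_0)$ is a subgame of itself. Moreover if $\sigma_i$ is a strategy for player $i$ in $(\mathcal{G}, v_0)$, then $\sigma_{i \restriction h}$ denotes the strategy in $(\mathcal{G}_{\restriction h}, v)$ such that for all histories $h' \in \mathrm{Hist}_i(v)$, $\sigma_{i \restriction h}(h') = \sigma_i(hh')$. Similarly, from a strategy profile $\sigma$ in $(\mathcal{G}, v_0)$, we derive the strategy profile $\sigma_{\restriction h}$ in $(\mathcal{G}_{\restriction h}, v)$.

###### Definition 3 (Subgame perfect equilibrium)

Let $(\mathcal{G}, v_0)$ be an initialized game. A strategy profile $\sigma$ is an SPE in $(\mathcal{G}, v_0)$ if for all $hv \in \mathrm{Hist}(v_0)$, $\sigma_{\restriction h}$ is an NE in $(\mathcal{G}_{\restriction h}, v)$.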

Clearly, any SPE is an NE and it is stated in Theorem 2.1 in [3] that there always exists an SPE (and thus an NE) in quantitative reachability games.

#### 2.0.4 Studied problems

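We conclude this section with the problems studied in this article. Let us first recall the concepts of social welfare and Pareto optimality. Let $(\mathcal{G}, v_0)$ be an initialized quantitative reachability game with $\mathcal{G} = (\mathcal{A}, (\mathrm{Cost}_i)_{i \in \Pi}, (F_i)_{i \in \Pi})$. Given $\rho = \rho_0 \rho_1 \ldots \in \mathrm{Plays}(v_0)$, we denote by $\mathrm{Visit}(\rho)$ the set of players who visit their target set along $\rho$, i.e., $\mathrm{Visit}(\rho) = \{ i \in \Pi \mid \exists n \in \mathbb{N},\ \rho_n \in F_i \}$. (This definition is easily adapted to histories.) The social welfare of $\rho$, denoted by $\mathrm{SW}(\rho)$, is the pair $(|\mathrm{Visit}(\rho)|, \sum_{i \in \mathrm{Visit}(\rho)} \mathrm{Cost}_i(\rho))$. Note that it takes into account both the number of players who visit their target set and their accumulated cost to reach those sets. Finally, let $P = \{ (\mathrm{Cost}_i(\rho))_{i \in \Pi} \mid \rho \in \mathrm{Plays}(v_0) \} \subseteq (\mathbb{N} \cup \{+\infty\})^{|\Pi|}$. A cost profile $p \in P$ is Pareto optimal in $\mathrm{Plays}(v_0)$ if it is minimal in $P$ with respect to the componentwise ordering $\leq$ on $P$. (For convenience, we prefer to say that $p$ is Pareto optimal in $\mathrm{Plays}(v_0)$ rather than in $P$.)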
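The two notions can be illustrated on a finite list of cost profiles. The sketch below reads the social welfare of a play as the pair (number of players with a finite cost, sum of these finite costs) and Pareto optimality as minimality for the componentwise ordering. The function names are hypothetical, and the set of profiles is assumed to be given explicitly as a finite list, whereas in the paper it ranges over the cost profiles of all plays.

```python
import math


def social_welfare(cost_profile):
    """Pair (number of players who reach their target, their accumulated cost)."""
    finite = [c for c in cost_profile if c != math.inf]
    return (len(finite), sum(finite))


def pareto_optimal(profiles):
    """Profiles that are minimal for the componentwise ordering."""
    def strictly_below(p, q):
        # p <= q componentwise and p differs from q
        return p != q and all(a <= b for a, b in zip(p, q))

    return [p for p in profiles
            if not any(strictly_below(q, p) for q in profiles)]
```

Note that two incomparable profiles, such as one where only the first player reaches his target and one where only the second does, can both be Pareto optimal.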

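Let us now state the studied decision problems. The first two problems are classical: they ask whether there exists a solution (NE or SPE) $\sigma$ satisfying certain requirements that impose bounds on either $(\mathrm{Cost}_i(\langle \sigma \rangle_{v_0}))_{i \in \Pi}$ or on $\mathrm{SW}(\langle \sigma \rangle_{v_0})$.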

###### Problem 1 (Threshold decision problem)

Given an initialized quantitative reachability game $(\mathcal{G}, v_0)$, given a threshold $y \in (\mathbb{N} \cup \{+\infty\})^{|\Pi|}$, decide whether there exists a solution $\sigma$ such that $(\mathrm{Cost}_i(\langle \sigma \rangle_{v_0}))_{i \in \Pi} \leq y$.

The most natural requirements are to impose upper bounds on the costs that the players have to pay and no lower bounds. One might also be interested in imposing an interval $[x_i, y_i]$ in which the cost paid by Player $i$ must lie.

In the second problem, constraints are imposed on the social welfare, with the aim to maximize it. We use the lexicographic ordering $\succeq$ on $\mathbb{N}^2$ such that $(k, c) \succeq (k', c')$ if and only if (i) $k > k'$ or (ii) $k = k'$ and $c \leq c'$.

###### Problem 2 (Social welfare decision problem)

Given an initialized quantitative reachability game $(\mathcal{G}, v_0)$, given two thresholds $k$ and $c$, decide whether there exists a solution $\sigma$ such that $\mathrm{SW}(\langle \sigma \rangle_{v_0}) \succeq (k, c)$.
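The comparison used in Problem 2 can be sketched as follows, reading the ordering as: more players reaching their target is better and, for the same number of players, a smaller accumulated cost is better. The function name `sw_at_least` is an assumption made for illustration.

```python
def sw_at_least(sw, threshold):
    """True if the social welfare `sw` meets `threshold` lexicographically.

    Both arguments are pairs (number of visiting players, accumulated cost).
    """
    k, c = sw
    k_min, c_max = threshold
    return k > k_min or (k == k_min and c <= c_max)
```

With this reading, a social welfare of `(3, 100)` meets the threshold `(2, 8)` because one more player reaches his target, even though the accumulated cost is much larger.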

Notice that with the lexicographic ordering, we want to first maximize the number of players who visit their target set, and then to minimize the accumulated cost to reach those sets. Let us now state the last studied problem.

###### Problem 3 (Pareto optimal decision problem)

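Given an initialized quantitative reachability game $(\mathcal{G}, v_0)$, decide whether there exists a solution $\sigma$ in $(\mathcal{G}, v_0)$ such that $(\mathrm{Cost}_i(\langle \sigma \rangle_{v_0}))_{i \in \Pi}$ is Pareto optimal in $\mathrm{Plays}(v_0)$.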

###### Remark 1

Problems 1 and 2 impose constraints with non-strict inequalities. We could also impose strict inequalities or even a mix of strict and non-strict inequalities. The results of this article can be easily adapted to those variants.

We conclude this section with an illustrative example.

###### Example 1

Consider the quantitative reachability game of Figure 1. We have two players such that the vertices of Player 1 (resp. Player 2) are rounded (resp. rectangular) vertices. For the moment, the reader should not consider the value indicated on the right of the vertices’ labeling. Moreover and . In this figure, an edge labeled by should be understood as a path from to with length . Observe that and are both reachable from the initial vertex . Moreover the two Pareto optimal cost profiles are and : take a play with prefix in the first case, and a play with prefix in the second case.

For this example, we claim that there is no NE (and thus no SPE) such that its cost profile is Pareto optimal (see Problem 3). Assume the contrary and suppose that there exists an NE such that its outcome has cost profile , meaning that begins with . Then Player  has a profitable deviation such that after history he goes to instead of in a way to pay a cost of instead of , which is a contradiction. Similarly assume that there exists an NE such that its outcome has cost profile , meaning that begins with . Then Player  has a profitable deviation such that after history he goes to instead of , again a contradiction. So there is no NE in such that is Pareto optimal in .

The previous discussion shows that there is no NE such that (see Problem 1). This is no longer true with . Indeed, one can construct an NE whose outcome has prefix and cost profile . This also shows that there exists an NE (the same as before) that satisfies (with both players visit their target set and their accumulated cost to reach it equals ). ∎

## 3 Existence problems

In this section, we show that for particular families of reachability games and requirements, there is no need to solve the related decision problems because they always have a positive answer in this case.

We begin with the family constituted by all reachability games with a strongly connected arena. The next theorem then states that there always exists a solution that visits all non empty target sets.

###### Theorem 3.1

Let $(\mathcal{G}, v_0)$ be an initialized quantitative reachability game such that its arena is strongly connected. There exists an SPE $\sigma$ (and thus an NE) such that its outcome $\langle \sigma \rangle_{v_0}$ visits all target sets $F_i$, $i \in \Pi$, that are non empty.

Let us comment on this result. For this family of games, the answer to Problem 1 is always positive for particular thresholds. In case of quantitative reachability, take strict constraints $< +\infty$ if $F_i \neq \emptyset$ and non-strict constraints $\leq +\infty$ otherwise. The answer to Problem 2 is also always positive for threshold and .

In the statement of Theorem 3.1, as the arena is strongly connected, $F_i$ is non empty if and only if $F_i$ is reachable from $v_0$. Also notice that the hypothesis that the arena is strongly connected is necessary. Indeed, it is easy to build an example with two players (Player 1 and Player 2) such that from $v_0$ it is not possible to reach both $F_1$ and $F_2$.

We now turn to the second result of this section. The next theorem states that even with only two players there exists an initialized quantitative reachability game that has no NE with a cost profile which is Pareto optimal. To prove this result, we only have to come back to the quantitative reachability game of Figure 1. We explained in Example 1 that there is no NE in this game such that its cost profile is Pareto optimal.

###### Theorem 3.2

There exists an initialized quantitative reachability game with $|\Pi| = 2$ that has no NE with a cost profile which is Pareto optimal in $\mathrm{Plays}(v_0)$.

Notice that in the qualitative setting, in two-player games, there always exists an NE (resp. SPE) such that the gain profile is Pareto optimal in $\mathrm{Plays}(v_0)$; however, this existence result cannot be extended to three players. (In the qualitative setting, each player obtains a gain that he wants to maximize: either 1 if he visits his target set or 0 otherwise; all definitions are adapted accordingly.)

## 4 Solving decision problems

In this section, we provide the complexity results for the different problems without any assumption on the arena of the game. Even if we provide complexity lower bounds, the main part of our contribution is to give the upper bounds. Roughly speaking the decision algorithms work as follows: they guess a path and check that it is the outcome of an equilibrium satisfying the relevant property (such as Pareto optimality). In order to verify that a path is an equilibrium outcome, we rely on the outcome characterization of equilibria, presented in Section 4.2. These characterizations rely themselves on the notion of λ-consistent play, introduced in Section 4.1. As the guessed path should be finitely representable, we show that we can only consider λ-consistent lassoes, in Section 4.3. Finally, we expose the philosophy of the algorithms providing the upper bounds on the complexity of the three problems in Section 4.4.

### 4.1 λ-consistent play

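We here define the labeling function, used to obtain the outcome characterization of equilibria. Given a vertex $v$ along a play $\rho$, intuitively, the value $\lambda(v)$ represents the maximal number of steps within which the player who owns this vertex should reach his target set along $\rho$ starting from $v$. A play which satisfies the constraints given by $\lambda$ is called a $\lambda$-consistent play.

###### Definition 4 (λ-consistent play)

Let $\mathcal{G}$ be a quantitative reachability game and $\lambda : V \to \mathbb{N} \cup \{+\infty\}$ be a labeling function. Let $\rho \in \mathrm{Plays}$ be a play, we say that $\rho$ is $\lambda$-consistent if for all $i \in \Pi$ and all $k \in \mathbb{N}$ such that $i \notin \mathrm{Visit}(\rho_0 \ldots \rho_k)$ and $\rho_k \in V_i$:

$$\mathrm{Cost}_i(\rho_{\geq k}) \leq \lambda(\rho_k).$$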
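The following sketch checks this condition on a lasso, reading it as: whenever the owner of the current vertex has not yet visited his target set, the number of steps he still needs along the play must not exceed the label of that vertex. The function name, the dictionary encoding and the assumption that every player has a (possibly empty) entry in `targets` are illustrative choices, not part of the paper.

```python
import math


def is_lambda_consistent(prefix, cycle, owner, targets, lam):
    """Check lambda-consistency of the lasso prefix . cycle^omega.

    owner[v]   -- player owning vertex v
    targets[i] -- target set of player i (may be empty)
    lam[v]     -- label of vertex v (an integer or math.inf)
    """
    play = list(prefix) + list(cycle)
    visited = set()   # players who have visited their target set so far
    # Positions of the first unrolling suffice: later copies of the cycle
    # repeat the same vertices with a visited set that is at least as large.
    for k, v in enumerate(play):
        visited |= {i for i, f in targets.items() if v in f}
        i = owner[v]
        if i in visited:
            continue
        # Player i has not visited his target yet, so no earlier vertex of
        # the lasso is in his target: the remaining cost is found in play[k:]
        # or is infinite.
        cost = next((d for d, u in enumerate(play[k:]) if u in targets[i]),
                    math.inf)
        if cost > lam[v]:
            return False
    return True
```

The check runs in time quadratic in the length of the lasso, which is enough for the polynomial verification step discussed later in this section.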

###### Example 2

Let us come back to Example 1 and assume that the values indicated on the right of the vertices’ labeling represent the valuation of a labeling function . Let us first consider the play with cost profile . We have that but . This means that is not -consistent. Secondly, one can easily see that the play is -consistent.

### 4.2 Characterizations

#### 4.2.1 Outcome characterization of Nash equilibria

To define the labeling function which allows us to obtain this characterization, we need to study the rational behavior of one player playing against the coalition of the other players. In order to do so, with a quantitative reachability game $\mathcal{G}$, we can associate $|\Pi|$ two-player zero-sum quantitative games [7]. For each $i \in \Pi$, we denote by $\mathcal{G}_i$ the (quantitative) coalitional game associated with Player $i$. In such a game Player $i$ (which becomes Player $\mathrm{Min}$) wants to reach the target set $F_i$ within a minimum number of steps, and the coalition of all players except Player $i$ (which forms one player called Player $\mathrm{Max}$, aka $-i$) aims to avoid it or, if it is not possible, maximize the number of steps until reaching $F_i$.

Given a coalitional game $\mathcal{G}_i$ and a vertex $v \in V$, the value of $\mathcal{G}_i$ from $v$, denoted by $\mathrm{Val}_i(v)$, allows us to know what is the lowest (resp. greatest) cost (resp. gain) that Player $\mathrm{Min}$ (resp. Player $\mathrm{Max}$) can ensure to obtain from $v$. Moreover, as quantitative coalitional games are determined, these values always exist and can be computed in polynomial time [7, 8, 13].

An optimal strategy for Player $\mathrm{Min}$ (resp. Player $\mathrm{Max}$) in a coalitional game $\mathcal{G}_i$ is a strategy which ensures that, from every vertex $v \in V$, Player $\mathrm{Min}$ (resp. Player $\mathrm{Max}$) will pay at most (resp. obtain at least) $\mathrm{Val}_i(v)$ by following this strategy whatever the strategy of the other player. For each $i \in \Pi$, we know that there always exist optimal strategies for both players in $\mathcal{G}_i$. Moreover, we can always find optimal strategies which are positional [7].

In our characterization, we show that the outcomes of NEs are exactly the plays which are $\lambda^*$-consistent, with the labeling function $\lambda^*$ defined in this way: for all $v \in V$,

$$\lambda^*(v) = \mathrm{Val}_i(v) \quad \text{if } v \in V_i.$$
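A minimal sketch of how such values could be computed is given below. It is a plain value iteration, not the algorithms of [7, 8, 13]: each edge costs one step, the chosen player minimises the number of steps to his target set and every other vertex is controlled by the maximising coalition. The labeling is then read as assigning to each vertex the value of the coalitional game of its owner. All function and parameter names are assumptions made for illustration.

```python
import math


def coalitional_values(succ, owner, player, target):
    """Values of the coalitional game of `player` on the arena `succ`.

    succ[v]  -- list of successors of v (non-empty for every vertex)
    owner[v] -- player owning v; vertices of other players belong to the coalition
    """
    val = {v: (0 if v in target else math.inf) for v in succ}
    # With unit edge costs, |V| rounds are enough to reach the fixpoint.
    for _ in range(len(succ)):
        new = {}
        for v in succ:
            if v in target:
                new[v] = 0
            else:
                nxt = [val[u] for u in succ[v]]
                # The player minimises, the coalition maximises.
                new[v] = 1 + (min(nxt) if owner[v] == player else max(nxt))
        if new == val:
            break
        val = new
    return val


def lambda_star(succ, owner, targets):
    """Label each vertex with the value of its owner's coalitional game."""
    vals = {i: coalitional_values(succ, owner, i, f) for i, f in targets.items()}
    return {v: vals[owner[v]][v] for v in succ}
```

A value of `math.inf` means that the coalition can prevent the player from ever reaching his target set from that vertex.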

###### Theorem 4.1 (Characterization of NEs)

Let $(\mathcal{G}, v_0)$ be a quantitative reachability game and let $\rho \in \mathrm{Plays}(v_0)$ be a play, the next assertions are equivalent:

1. there exists an NE $\sigma$ such that $\langle \sigma \rangle_{v_0} = \rho$;

2. the play $\rho$ is $\lambda^*$-consistent.

Additionally, if $\rho = h\ell^{\omega}$ is a lasso, we can replace the first item by: there exists an NE $\sigma$ with memory in $\mathcal{O}(|h\ell| + |\Pi|)$ and such that $\langle \sigma \rangle_{v_0} = \rho$.

The main idea is that if the second assertion is false, then there exists a player who has an incentive to deviate along $\rho$. Indeed, if there exists $k \in \mathbb{N}$ such that $\mathrm{Cost}_i(\rho_{\geq k}) > \lambda^*(\rho_k)$ (with $\rho_k \in V_i$), it means that Player $i$ can ensure a better cost for himself even if the other players play in coalition and in an antagonistic way. Thus, Player $i$ has a profitable deviation. For the second implication, the Nash equilibrium is defined as follows: all players follow the outcome $\rho$, but if one player, say Player $i$, deviates from $\rho$, the other players form a coalition and punish the deviator by playing the optimal strategy of Player $\mathrm{Max}$ in the coalitional game $\mathcal{G}_i$. Thus, if $\rho = h\ell^{\omega}$, a player has to remember: (i) $h\ell$ to know both what he has to play and whether someone has deviated and (ii) who the deviator is.

###### Example 3

Let us go back to Example 2, in this example the used labeling function is in fact the labeling function . We proved in Example 2 that the play is not -consistent and so not the outcome of an NE by Theorem 4.1. On the contrary, we have seen that the play is -consistent and it means that it is the outcome of an NE (again by Theorem 4.1). Notice that we have already proved these two facts in Example 1.

#### 4.2.2 Outcome characterization of subgame perfect equilibria

In the previous section, we proved that the set of plays which are -consistent is equal to the set of outcomes of NEs. We now want to have the same kind of characterization for SPEs. We may not use the notion of -consistent plays because there exist plays which are -consistent but which are not the outcome of an SPE. But, we can recover the characterization of SPEs thanks to a different labeling function defined in [6] that we depict by . Notice that, is not defined on the vertices of the game but on the vertices of the extended game associated with . Vertices in such a game are the vertices in equipped with a subset of players who have already visited their target set. This game is also a reachability game thus all concepts and definitions introduced in Section 2 hold. Moreover, there is a one-to-one correspondence between SPEs in and its extended game. This is the reason why we solve the different decision problems on the extended games , where , instead of . More details are given in [6]. However, it is very important to notice that some of our results depend on (resp. ) that are the number of vertices (resp. players) in and not in .

###### Theorem 4.2 ([6] Characterization of SPEs)

Let be a quantitative reachability game and be its extended game and let be a play in the extended game, the next assertions are equivalent:

1. there exists a subgame perfect equilibrium such that ;

2. the play is -consistent.

### 4.3 Sufficiency of lassoes

In this section, we provide technical results which, given a $\lambda$-consistent play, produce an associated $\lambda$-consistent lasso. In the sequel, we show that working with these lassoes is sufficient for the algorithms.

The associated lassoes are built by eliminating some unnecessary cycles and then identifying a prefix $h\ell$ such that $\ell$ can be repeated infinitely often. An unnecessary cycle is a cycle inside of which no new player visits his target set. More formally, let $\rho = \rho_0 \rho_1 \ldots \rho_k \ldots \rho_{k+m} \ldots$ be a play in $\mathcal{G}$, if $\rho_k = \rho_{k+m}$ and $\mathrm{Visit}(\rho_0 \ldots \rho_k) = \mathrm{Visit}(\rho_0 \ldots \rho_{k+m})$ then the cycle $\rho_k \ldots \rho_{k+m}$ is called an unnecessary cycle.

We call: (P1) the procedure which eliminates an unnecessary cycle, i.e., let such that is an unnecessary cycle, becomes and (P2) the procedure which turns into a lasso by copying long enough for all players to visit their target set and then to form a cycle after the last player has visited his target set. If no player visits his target set along , then (P2) only copies long enough to form a cycle. Notice that, given , applying (P1) or (P2) may involve a decreasing of the costs but for both and for (P2) . Additionally, applying (P1) until it is no longer possible and then (P2), leads to a lasso with length at most and cost less than or equal to for players who have visited their target set.
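Procedure (P1) can be sketched on a finite history as follows. The idea is to remember, for each pair (vertex, set of players who have already visited their target set), where it first occurred, and to cut back to that occurrence when the pair is seen again, since no new player visited his target set in between. The function name and the encoding are assumptions; the sketch works on a finite history rather than on an infinite play.

```python
def remove_unnecessary_cycles(history, targets):
    """Apply (P1) repeatedly to a finite history (list of vertices).

    targets[i] -- target set of player i
    """
    out = []    # list of (vertex, visited players) along the reduced history
    seen = {}   # (vertex, visited players) -> its position in `out`
    visited = frozenset()
    for v in history:
        visited = visited | {i for i, f in targets.items() if v in f}
        key = (v, visited)
        if key in seen:
            # Same vertex, same visited players: the cycle in between is
            # unnecessary, so drop everything after its first occurrence.
            cut = seen[key] + 1
            for dropped in out[cut:]:
                del seen[dropped]
            del out[cut:]
        else:
            seen[key] = len(out)
            out.append(key)
    return [v for v, _ in out]
```

Because the set of visited players only grows along a history, every vertex inside a removed cycle carries the same set, so the players who visit their target set are the same before and after the reduction.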

Additionally, applying (P1) or (P2) on a $\lambda$-consistent play preserves this property. This is stated in Lemma 1, which is in particular true for extended games.

###### Lemma 1

Let $(\mathcal{G}, v_0)$ be a quantitative reachability game and $\rho \in \mathrm{Plays}$ be a $\lambda$-consistent play for a given labeling function $\lambda$. If $\rho'$ is the play obtained by applying (P1) or (P2) on $\rho$, then $\rho'$ is $\lambda$-consistent.

These properties of (P1) and (P2) allow us to claim that it is sufficient to deal with lassoes of polynomial length to solve Problems 1 and 3 for NEs, and they give us some bounds on the needed memory and the costs for each problem.

###### Corollary 1

Let be an NE (resp. SPE) in a quantitative reachability game (resp. its extended game) and . Let (resp. ). If , then there exists an NE (resp. SPE) in (resp. ) such that:

• ;

• is a lasso such that ;

• for each , ;

• has memory in (resp. ).

###### Proposition 1

Let (resp. its extended game) be a quantitative reachability game and let be an NE (resp. SPE). Let (resp. ). If we have that is Pareto optimal in , then:

• for all ,

• there exists an NE (resp. SPE) such that , and .

### 4.4 Algorithms

In this section, we provide the main ideas behind our algorithms.

To solve Problem 1 (resp. Problem 3) for NEs, we use Corollary 1 (resp. Proposition 1) which ensures that if there exists an NE which satisfies the conditions, there exists another one with a lasso outcome of polynomial length. (As Problem 1 is already solved in PSPACE for SPEs [6], we here focus only on NEs. Satisfying the conditions means either satisfying the constraints (Problem 1 and Problem 2) or having a cost profile which is Pareto optimal (Problem 3).) The algorithm works as follows: (i) it guesses a lasso of polynomial length; (ii) it verifies that the cost profile of this lasso satisfies the conditions given by the problem (resp. is Pareto optimal in $\mathrm{Plays}(v_0)$) and (iii) it verifies that the lasso is the outcome of an NE (Theorem 4.1). Notice that this latter step is done in polynomial time as the lasso has a polynomial length and the values of the coalitional games are computed in polynomial time.

To solve Problem 2 (resp. Problem 3 for SPEs), we use the algorithm designed for Problem 1. Each algorithm works as follows: (i) it guesses a cost profile $p$; (ii) it verifies that $p$ satisfies the conditions given by the problem and (iii) it checks, thanks to the algorithm for Problem 1, if there exists an equilibrium with cost profile smaller than $p$ (resp. equal to $p$).

Notice that for Problem 3, we need to have an oracle allowing us to know if $p$ is Pareto optimal. This leads us to study Problem 4 which lies in co-NP.

###### Problem 4

Given a reachability game (resp. its extended game ) and a lasso (resp. ), we want to decide if is Pareto optimal in (resp. ).

### 4.5 Results

Thanks to the previous discussions in Section 4.4, we obtain the following results. Notice that we do not provide the proof for the NP-hardness (resp. PSPACE-hardness) as it is very similar to the one given in [10] (resp. [6]).

###### Theorem 4.3

Let be a quantitative reachability game.

• For NEs: Problem 1 and Problem 2 are NP-complete while Problem 3 is NP-hard and belongs to $\Sigma_2^P$.

• For SPEs: Problems 1, 2 and 3 are PSPACE-complete.

###### Theorem 4.4

Let be a quantitative reachability game.

• For NEs: for each decision problem, if its answer is positive, then there exists a strategy profile with memory in which satisfies the conditions.

• For SPEs: for each decision problem, if the answer is positive, then there exists a strategy profile with memory in which satisfies the conditions.

• For both NEs and SPEs: (i) for Problem 1 and Problem 3, is such that: if , and (ii) for Problem 2, is such that: .

## References

• [1] Bouyer, P., Markey, N., Stan, D.: Mixed Nash equilibria in concurrent terminal-reward games. In: 34th International Conference on Foundation of Software Technology and Theoretical Computer Science, FSTTCS 2014, December 15-17, 2014, New Delhi, India. pp. 351–363 (2014)
• [2] Brenguier, R., Raskin, J.: Pareto curves of multidimensional mean-payoff games. In: CAV (2). Lecture Notes in Computer Science, vol. 9207, pp. 251–267. Springer (2015)
• [3] Brihaye, T., Bruyère, V., De Pril, J., Gimbert, H.: On subgame perfection in quantitative reachability games. Logical Methods in Computer Science 9(1) (2012)
• [4] Brihaye, T., Bruyère, V., Goeminne, A., Raskin, J.: Constrained existence problem for weak subgame perfect equilibria with ω-regular Boolean objectives. In: Proceedings Ninth International Symposium on Games, Automata, Logics, and Formal Verification, GandALF 2018, Saarbrücken, Germany, 26-28th September 2018. pp. 16–29 (2018)
• [5] Brihaye, T., Bruyère, V., Goeminne, A., Raskin, J., van den Bogaard, M.: The complexity of subgame perfect equilibria in quantitative reachability games. To appear in CONCUR 2019
• [6] Brihaye, T., Bruyère, V., Goeminne, A., Raskin, J., van den Bogaard, M.: The complexity of subgame perfect equilibria in quantitative reachability games. CoRR abs/1905.00784 (2019), http://arxiv.org/abs/1905.00784
• [7] Brihaye, T., De Pril, J., Schewe, S.: Multiplayer cost games with simple Nash equilibria. In: Logical Foundations of Computer Science, International Symposium, LFCS 2013, San Diego, CA, USA, January 6-8, 2013. Proceedings. pp. 59–73 (2013), https://doi.org/10.1007/978-3-642-35722-0_5
• [8] Brihaye, T., Geeraerts, G., Haddad, A., Monmege, B.: Pseudopolynomial iterative algorithm to solve total-payoff games and min-cost reachability games. Acta Inf. 54(1), 85–125 (2017)
• [9] Bruyère, V.: Computer aided synthesis: A game-theoretic approach. In: Developments in Language Theory - 21st International Conference, DLT 2017, Liège, Belgium, August 7-11, 2017, Proceedings. pp. 3–35 (2017)
• [10] Condurache, R., Filiot, E., Gentilini, R., Raskin, J.F.: The Complexity of Rational Synthesis. In: Chatzigiannakis, I., Mitzenmacher, M., Rabani, Y., Sangiorgi, D. (eds.) 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), vol. 55, pp. 121:1–121:15. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2016)
• [11] Conitzer, V., Sandholm, T.: Complexity results about Nash equilibria. CoRR cs.GT/0205074 (2002), http://arxiv.org/abs/cs.GT/0205074
• [13] Khachiyan, L., Boros, E., Borys, K., Elbassioni, K., Gurvich, V., Rudolf, G., Zhao, J.: On short paths interdiction problems: Total and node-wise limited interdiction. Theory of Computing Systems 43(2), 204–233 (Aug 2008)
• [14] Nash, J.F.: Equilibrium points in n-person games. In: PNAS. vol. 36, pp. 48–49. National Academy of Sciences (1950)
• [15] Osborne, M.: An introduction to game theory. Oxford Univ. Press (2004)
• [16] Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: POPL. pp. 179–190. ACM Press (1989)
• [17] Thomas, W.: On the synthesis of strategies in infinite games. In: Mayr, E.W., Puech, C. (eds.) STACS 95. pp. 1–13. Springer Berlin Heidelberg, Berlin, Heidelberg (1995)
• [18] Ummels, M.: Rational behaviour and strategy construction in infinite multiplayer games. In: FSTTCS. Lecture Notes in Computer Science, vol. 4337, pp. 212–223. Springer (2006)
• [19] Ummels, M.: The complexity of Nash equilibria in infinite multiplayer games. In: Foundations of Software Science and Computational Structures, 11th International Conference, FOSSACS 2008, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29 - April 6, 2008. Proceedings. pp. 20–34 (2008)

## Appendix A Complements to Section 3

### A.1 Proof of Theorem 3.1

We first establish a preliminary lemma, from which the proof of Theorem 3.1 follows.

###### Lemma 2

Let be a quantitative reachability game. Then for all for which some target set , , is reachable from , there exists an SPE in whose outcome visits at least one target set , , that is, .

###### Proof

By Theorem 2.1 in [3], there exists an SPE in for each initial vertex . Consider the set of vertices for which some is reachable from , and the set of those vertices for which there is an SPE in that visits at least one target set. We have to prove that .

Assume the contrary and let . We claim that there exists an edge such that and . Indeed as , there exists a history with for some . Hence since the outcome of all SPEs in immediately visits . As along we begin with and we end with , there must exist an edge with and .

Let (resp. ) be an SPE in (resp. in ). As , we can suppose that the outcome of visits some target set . From and , we are going to construct another SPE in whose outcome will now visit this set . This will lead to a contradiction with . We define such a strategy profile equal to except that it is replaced by for all histories with prefix . More precisely,

• for the particular history , if , then ,

• for each history , , we define ,

• for each history , , with , we define .

Clearly the outcome of is equal to and thus visits . It remains to show that is an SPE, i.e., that is an NE in the subgame for all , .

• For all histories that begin with with , clearly is an NE in because and is an SPE.

• Take any history that begins with , and let . Let be a deviating strategy for player  in . By definition of we have

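$$\langle \tau_{\restriction h} \rangle_v = u\,\langle \sigma^{u'}_{\restriction h'} \rangle_v \qquad \langle (\tau'_i, \tau_{\restriction h,-i}) \rangle_v = u\,\langle (\tau'_i, \sigma^{u'}_{\restriction h',-i}) \rangle_v$$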

Moreover, as belongs to no target set, we have for all plays . It follows that if is a profitable deviation for player  with respect to , it is also a profitable deviation with respect to . The latter case never holds because is an SPE (and in particular is an NE). Therefore is an NE in .

• It remains to consider the history and to prove that is an NE in . From what has been gathered so far, only player such that might have a profitable deviation by deviating at the initial vertex with a strategy such that . Notice that since , we have and since is an SPE (and in particular an NE), we have . Moreover as and by definition of , we have . It follows that is not a profitable deviation for player  with respect to , and then is an NE in .

###### Proof (of Theorem 3.1)

Let , with , be an initialized quantitative reachability game whose arena is strongly connected. Assume, for the sake of contradiction, that there exists no SPE in whose outcome visits all target sets , , that are nonempty. By Theorem 2.1 in [3], there exists an SPE in , and we take such an SPE whose outcome visits a maximum number of target sets, say . Thus by assumption there exists at least one with that is not visited by . Thanks to Lemma 2, we are going to define from another SPE in whose outcome visits all as well as an additional target set. This will lead to a contradiction.

Consider a prefix of that visits all . We denote it by with . From we define the quantitative reachability game with the same arena and such that if and otherwise ( is defined with respect to as in Definition 1). Notice that is not empty and it is reachable from since is strongly connected. Therefore by Lemma 2, there exists an SPE in that visits at least one target set . From and , we define a strategy profile in as follows: let ,

• if for some , then ,

• otherwise .
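The two-case definition above, in which the new profile follows one SPE until a fixed prefix has been played and a second SPE afterwards, can be sketched as follows. This is only an illustration under stated assumptions: a strategy profile is modelled as a single function from histories (tuples of vertices) to the next vertex, the second profile is assumed to see only the part of the history starting at the last vertex of the prefix, and the names `switch_after` and `outcome_prefix` are hypothetical. The sketch does not check that moves follow edges of the arena.

```python
from typing import Callable, Tuple

History = Tuple[str, ...]
Profile = Callable[[History], str]


def switch_after(sigma: Profile, sigma_prime: Profile, prefix: History) -> Profile:
    """Follow sigma, except after histories that begin with prefix, where sigma_prime is used."""
    n = len(prefix)

    def tau(history: History) -> str:
        if history[:n] == prefix:
            # Assumption: sigma_prime only sees the history from the last
            # vertex of the prefix onwards.
            return sigma_prime(history[n - 1:])
        return sigma(history)

    return tau


def outcome_prefix(profile: Profile, start: str, steps: int) -> History:
    """Play the profile from start for a fixed number of steps."""
    history: History = (start,)
    for _ in range(steps):
        history += (profile(history),)
    return history


# Toy usage: sigma alternates between a and b, sigma_prime always moves to c.
sigma: Profile = lambda h: "b" if h[-1] == "a" else "a"
sigma_prime: Profile = lambda h: "c"
tau = switch_after(sigma, sigma_prime, ("a", "b"))
play = outcome_prefix(tau, "a", 4)
```

In the toy run, the outcome of the combined profile starts with the chosen prefix and then continues as the second profile prescribes, mirroring the shape of the outcome used in the proof.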

Thus, acts as , except that after a history beginning with , it acts as . Clearly the outcome of is equal to and thus visits in addition to . It remains to show that is an SPE. Consider , , and let us show that is an NE in .

• If neither is a prefix of nor is a prefix of , then by definition of , and is an NE in because is an SPE in .

• If is a prefix of , let such that . Suppose first that visits , then player  has clearly no incentive to deviate in . Suppose now that does not visit , then and by definition of . Hence for all plays in that start in , is a play in that starts in , and we have . Hence by definition of , a profitable deviation for player  with respect to