1. Introduction
In social situations, a queue acts in a self-stabilising manner: when anyone tries to jump the queue, the others give him a dirty look, and often this suffices to enforce the rule. Distributed protocols are often designed to be self-stabilising in a similar sense: when a fault occurs, it is detected and isolated, and the system recovers to a state from which the computation can proceed.
In general, the design of such systems assumes that all processes cooperate and coordinate their actions towards achieving the system goals. When a process deviates from the protocol (perhaps due to faults), the system needs to detect it and recover from the situation. Usually, if such a deviation occurs in an isolated one-off manner, it may be hard to detect, but when deviations occur repeatedly, there is the possibility of detecting the culprit(s). This idea has been applied to intruder detection in security theory.
There are several interesting variations on this theme. One is to ask for protocols or solutions that do not demand detection of deviators but merely provide resilience. Typical solutions in distributed computing assume a bound on the number of faulty processes and provide solutions that can tolerate that many failures. Another consideration relates to the observational limitations of processes. In a distributed system, processes have only a partial view of the system state, and often cannot observe all the moves of other processes; these limitations may help the deviator to evade detection successfully.
When distributed computing meets game theory, we have further interesting possibilities to consider. A player may act selfishly if this maximises her payoff, even if this means a deviation. In general, the coalition may have no way to hold one of its members to act according to prior commitments (dismissively labelled “cheap talk”). However, in the case of repeated play, there can be threats and punishment to ensure that members do not deviate. In this context, a variety of mechanisms are studied in game theory, notable among them the grim-trigger strategies: start out cooperating, and when any player deviates, punish for ever after. The importance of such strategies is that they play a central role in what are referred to as Folk theorems in game theory.
Nash equilibria provide a robust way to predict how rational players would act, offering their best responses to their beliefs about how others might act, in situations where players differ in their knowledge of what others might be doing. At a Nash equilibrium, no player has an incentive to deviate unilaterally and shift to a different strategy. Folk theorems then assert that for any Nash equilibrium outcome vector $(v_1, \dots, v_n)$ in an $n$-player infinitely repeated game (with, say, limit average rewards), each player $i$ can force the outcome $v_i$. Conversely, every “feasible” and enforceable outcome is the outcome of some Nash equilibrium.
Interestingly, these (and many related) results depend on the ability of players to perfectly monitor the actions of the deviator. In cases where players’ actions may not be directly observable, special solutions are needed, and game theorists have developed a powerful set of techniques for many subclasses of games [9, 12, 1, 14]. Note that this situation pertains more directly to distributed systems where processes are limited in their ability to observe the global state and to record other processes’ actions.
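The grim-trigger idea mentioned above can be sketched in a few lines for the simplest case of perfect monitoring. This is only an illustration under stated assumptions: the action labels `"C"` and `"P"` and the function name `grim_trigger` are hypothetical, and the history is assumed to list the complete action profile of every past round, which is exactly what imperfect monitoring does not provide.

```python
from typing import List, Tuple

COOPERATE, PUNISH = "C", "P"  # hypothetical action labels


def grim_trigger(history: List[Tuple[str, ...]]) -> str:
    """Cooperate until any player is seen to deviate, then punish for ever after.

    Assumption: `history` contains the full action profile of every past
    round, i.e. monitoring is perfect.
    """
    deviation_seen = any(
        action != COOPERATE for profile in history for action in profile
    )
    return PUNISH if deviation_seen else COOPERATE
```

Once a single non-cooperative action appears anywhere in the history, the function returns the punishing action in every later round, which is the "for ever after" part of the strategy.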
In this paper, we focus on computational questions related to deviator detection under imperfect monitoring. Can we have algorithms that determine, in a game, whether the deviator’s identity can be rendered common knowledge? If yes, we would like to construct strategies (for the other players) to achieve this. This problem is related to solving consensus problems on graphs with different communication topologies.
We propose a general technique based on methods from automated synthesis towards achieving an epistemic objective (involving common knowledge). With imperfect information, the synthesis problem is undecidable already with simple reachability objectives. So one expects only negative results for deviator detection as well. And indeed, deviator detection under imperfect information of game state is in general undecidable.
However, it is interesting to note that the essence of the deviator detection problem lies in monitoring of player actions and not necessarily game state. Indeed, in repeated games we often have no states at all! We show in this paper that in such a case, the problem is tractable. The main idea is that the problem can then be studied in the setting of coordination games of incomplete information with finitely many states of nature. These are systems with perfect monitoring of state, and uncertainty for a player comes only from unobserved actions of other players. But then, these are games with bounded initial uncertainty that can only reduce as play progresses, and this observation leads us to an algorithmic solution.
While the main result of the paper is the assertion that deviator detection is decidable under imperfect monitoring of actions (with only a finite amount of uncertainty about the state), we see the contribution of the paper as twofold: to highlight a setting in game theory that is of interest to distributed computing; and to illustrate the use of epistemic objectives, which are of interest to games as well as distributed systems. We also suggest that methods from automated synthesis may offer new ways of describing sets of equilibrium solutions and of constructing equilibria, which could be of technical interest for bridging game theory and distributed systems [11].
2. Games
We model distributed systems as infinite games with finitely many states. There is a finite set $N = \{1, \dots, n\}$ of players. We refer to a list $x = (x^i)_{i \in N}$ with one element for every player as a profile. For any such profile, we write $x^{-i}$ to denote the list $(x^j)_{j \in N, j \neq i}$ where the component of Player $i$ is omitted; for an element $x^i$ and a list $x^{-i}$, we denote by $(x^i, x^{-i})$ the full profile $x$.
Game structure.
To describe the game dynamics, we fix, for each player $i \in N$, a set $A^i$ of actions, a set $Y^i$ of observations, and a set $Q^i$ of local states; these are finite sets. We denote by $A$, $Y$, and $Q$ the set of all profiles of actions, observations, and local states; a profile of local states is also called global state. Now, the game form is described by its transition function $\delta : Q \times A \to Y \times Q$.
The game is played in stages over infinitely many periods starting from a designated initial state $q_0 \in Q$ known to all players. In each period $t \geq 1$, starting in a state $q_{t-1}$, every player $i$ chooses an action $a^i_t$. Then the transition $\delta(q_{t-1}, a_t) = (y_t, q_t)$ determines the observation $y^i_t$ received privately by each player $i$, and the global successor state $q_t$, from which the play proceeds to period $t+1$.
Thus, a play is an infinite sequence $\pi = q_0, a_1, y_1, q_1, a_2, y_2, q_2, \dots$ following the transitions $\delta(q_{t-1}, a_t) = (y_t, q_t)$ for all $t \geq 1$. A history is a finite prefix $\pi_t = q_0, a_1, y_1, q_1, \dots, a_t, y_t, q_t$ of a play. We refer to the number $t$ of stages played up to period $t$ as the length of the history. The sequence of observations received by player $i$ along a history $\pi_t$ is denoted by $\beta^i(\pi_t) = y^i_1, \dots, y^i_t$. We assume that each player always knows her local state and the action she is playing, that is, these data are included in her private observation received in each round. However, she is not perfectly informed about the local states or the actions of the other players; therefore we speak of imperfect monitoring. The monitoring function $\beta^i$ induces an indistinguishability relation $\sim^i$ between histories and plays: $\pi \sim^i \pi'$ if, and only if, $\beta^i(\pi) = \beta^i(\pi')$. This is an equivalence relation between game histories; its classes are called the information sets of player $i$.
A strategy for player $i$ is a mapping $s^i : (Y^i)^* \to A^i$ that prescribes an action for every observation sequence. Again, we denote by $S$ the set of all strategy profiles. We say that a history or a play $\pi$ follows a strategy $s^i$, if $a^i_{t+1} = s^i(\beta^i(\pi_t))$, for all histories $\pi_t$ of length $t \geq 0$ in $\pi$. Likewise, a history or play follows a profile $s \in S$, if it follows the strategy $s^i$ of each player $i$. The outcome of a strategy profile $s$ is the unique play that follows it.
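The definitions above can be made concrete with a minimal sketch that unrolls the first rounds of the outcome of a strategy profile. The function name `play_prefix`, the encoding of the transition function as a Python callable returning an (observation profile, successor state) pair, and the toy game used in the usage note are all assumptions made for illustration.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

State = Hashable
Profile = Tuple  # one entry per player


def play_prefix(
    delta: Callable[[State, Profile], Tuple[Profile, State]],
    q0: State,
    strategies: Sequence[Callable[[Tuple], Hashable]],
    rounds: int,
) -> Tuple[List, List[Tuple]]:
    """Return the history of length `rounds` that follows the strategy profile,
    together with the private observation sequence of every player."""
    obs_seqs: List[Tuple] = [() for _ in strategies]
    history: List = [q0]
    q = q0
    for _ in range(rounds):
        # Each player chooses an action from her own observation sequence only.
        a = tuple(s(obs) for s, obs in zip(strategies, obs_seqs))
        y, q = delta(q, a)
        obs_seqs = [obs + (y[i],) for i, obs in enumerate(obs_seqs)]
        history.append((a, y, q))
    return history, obs_seqs


# Toy single-state game (an assumption): each of two players observes her own
# action and the exclusive-or of both actions.
def toy_delta(q, a):
    parity = a[0] ^ a[1]
    return ((a[0], parity), (a[1], parity)), q


always_one = lambda obs: 1
echo_parity = lambda obs: obs[-1][1] if obs else 0
toy_history, toy_obs = play_prefix(toy_delta, "q", [always_one, echo_parity], 3)
```

Since strategies are deterministic functions of private observation sequences and the transition function is deterministic, the prefix computed here is unique, in line with the definition of the outcome.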
With the above definition of a strategy, we implicitly assume that players have perfect recall, that is, they may record all the information acquired along a play. Nevertheless, in certain cases, we can restrict our attention to strategy functions computable by automata with finite memory. In this case, we speak of finite-state strategies.
Strategy synthesis.
The task of automated synthesis is to construct finite-state strategies for solving games (presented in a finite way). Depending on the purpose of the model, the notion of solving has different meanings.
One prominent application area in distributed systems is concerned with synthesising coordination strategies for a coalition with common interests against a fixed adversary (the environment, or Nature) [13, 17, 8]. For this purpose, we assume the coalition to be the set $N \setminus \{n\}$ excluding a designated player $n$. We are interested in win/lose games. The winning condition is described as a set $W$ of plays; a basic example is a reachability winning condition, which consists of all plays that reach a designated set $F \subseteq Q$ of global states. Here, a solution is a distributed winning strategy for the coalition: a profile $s^{-n}$ such that the outcome of $(s^{-n}, r^n)$ belongs to $W$ for all strategies $r^n$ of player $n$.
The distributed synthesis problem for coordination strategies asks: for a given game, determine whether there exists a profile of finite-state strategies for the coalition that is winning against the adversary. This problem is well known to be undecidable, already for games with reachability winning conditions.
Theorem 1 ([16]).
The distributed synthesis problem is undecidable for reachability games for two players with imperfect information against an adversary.
If we consider non-zero-sum games, where we have players with different, possibly overlapping objectives, a standard solution concept is that of Nash equilibrium. A theory of synthesis for such games has been developed over the last decade [10, 5, 4]. While there are several positive results for games of perfect information, synthesis questions are difficult in the context of imperfect information.
In this paper, we are interested in a specific case that lies between the approaches of distributed coordination and Nash equilibrium: synthesis of strategies for detecting an unknown adversary —if he should arise— under conditions of imperfect monitoring. We now proceed to define and address this problem.
3. Deviator detection
Unlike the traditional setting concerned with temporal objectives on actions or states assumed in a play, we are interested here in epistemic objectives which refer to attaining knowledge that a certain event has occurred. For an introduction to knowledge in distributed systems, we refer to the book of Fagin, Halpern, Moses, and Vardi [7, Ch. 2].
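Let us fix a game $G$ with the usual notation. An event is a subset $E$ of histories in $G$. The event $E$ occurs at history $\pi$ if $\pi \in E$. We say that (the occurrence of) $E$ is private knowledge of Player $i$ at history $\pi$, if for any $\pi' \sim^i \pi$, it holds that $\pi' \in E$. Further, an event $E$ is common knowledge among the players of a coalition $C \subseteq N$ at history $\pi$ if, for every sequence of histories $\pi_1, \dots, \pi_k$ and players $i_1, \dots, i_k \in C$ such that $\pi \sim^{i_1} \pi_1 \sim^{i_2} \dots \sim^{i_k} \pi_k$, it is the case that $\pi_k \in E$.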
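On a finite set of histories, private and common knowledge as defined above can be checked directly. The following is a minimal sketch under stated assumptions: histories are arbitrary hashable objects, `obs(i, h)` is a hypothetical function returning the observation sequence of player `i` along `h`, and only the histories listed in `histories` are considered possible.

```python
from collections import deque


def privately_knows(event, history, histories, obs, player):
    """The event holds at every history the player cannot tell apart from `history`."""
    return all(
        h in event for h in histories if obs(player, h) == obs(player, history)
    )


def common_knowledge(event, history, histories, obs, coalition):
    """The event holds at every history reachable from `history` through a chain
    of indistinguishability steps of coalition members."""
    seen = {history}
    queue = deque([history])
    while queue:
        h = queue.popleft()
        if h not in event:
            return False
        for g in histories:
            if g not in seen and any(obs(i, g) == obs(i, h) for i in coalition):
                seen.add(g)
                queue.append(g)
    return True


# Toy example (an assumption): three histories; player 0 confuses a with b,
# player 1 confuses b with c.
toy_histories = ["a", "b", "c"]
toy_table = {0: {"a": "x", "b": "x", "c": "y"}, 1: {"a": "u", "b": "v", "c": "v"}}
toy_obs = lambda i, h: toy_table[i][h]
```

In the toy example, the event `{"a", "b"}` is private knowledge of both players at history `a`, yet it is not common knowledge there, because the chain from `a` via `b` reaches `c`.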
Specifically, we are interested in the event that a player has deviated from a given play. To describe this, we define for each play $\pi$ and for every player $i$, the event $D^i_\pi$ consisting of all histories $\pi'$ that disagree with $\pi$ such that, for the first round $t$ where the prefixes $\pi_t$ and $\pi'_t$ disagree, they differ only in the action of player $i$. (Since the transition function is deterministic and the initial state is fixed, the first difference between two histories can only occur at an action profile). Obviously, the sets $D^i_\pi$ are suffix closed, that is, for each $\pi' \in D^i_\pi$, any prolongation history belongs to $D^i_\pi$ as well. Likewise, if a coalition attains common knowledge of $D^i_\pi$ at a history $\pi'$, it attains common knowledge of $D^i_\pi$ at every prolongation history of $\pi'$.
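Membership in the deviation event of a player can be illustrated by a small sketch. Because the transition function is deterministic and the initial state is fixed, a history is represented here only by its sequence of action profiles; this encoding and the name `first_deviator` are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def first_deviator(
    reference: Sequence[Tuple], history: Sequence[Tuple]
) -> Optional[int]:
    """Return the player whose deviation event (relative to `reference`)
    contains `history`, or None if there is no such player.

    Both arguments are sequences of action profiles, one per round.
    """
    for ref_profile, profile in zip(reference, history):
        if ref_profile != profile:
            differing = [
                i for i, (x, y) in enumerate(zip(ref_profile, profile)) if x != y
            ]
            # The history belongs to a deviation event only if exactly one
            # player differs in the first round of disagreement.
            return differing[0] if len(differing) == 1 else None
    return None  # no disagreement on the common prefix
```

The result depends only on the first round of disagreement, so it is unchanged by any prolongation of the history, matching the closure property stated above.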
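Now, let $T$ be a designated set of plays in $G$. A deviator detection strategy with respect to $T$ is a strategy profile $s$ such that:
(i) the outcome $\pi$ of $s$ belongs to $T$, and
(ii) for each player $i$ and every strategy $r^i$ of player $i$, if the outcome $\pi'$ of $(s^{-i}, r^i)$ disagrees with $\pi$, then the coalition $N \setminus \{i\}$ attains common knowledge of the deviation event $D^i_\pi$ of player $i$ at some history of $\pi'$.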
The synthesis problem for deviator detection is the following: given a game $G$ with a target set $T$ specified by a finite-state automaton, decide whether there exists a finite-state strategy profile for deviator detection with respect to $T$ and, if so, construct one.
3.1. Deviator detection as a coordination problem
Alternatively, we can cast the deviator detection problem as a more standard problem of distributed synthesis with temporal objectives. Informally, this is done by adding a new player, Nature, that can either remain silent, or at any point take over the identity of an actual player $i$, deviate from his intended action, and continue playing on his behalf. The deviation of Nature takes the game into a fresh copy associated to the corrupted player $i$ where the only way to win for the remaining coalition is by issuing a simultaneous action in which they all expose the identity of $i$; however, if the exposure action is not taken in consensus by all players, except for $i$, the game is lost.
More precisely, we transform the deviator detection game into a coordination game against Nature —let us call it exposure game— as follows: First, we add for each player , actions that allow to expose a deviation by player , that is, we set ; we use the shorthand to denote any action profile where the coalition chooses in consensus. Further, we involve a new player with actions that allow him to stay silent (by choosing ) or to corrupt the action of any other player, that is . His local states are . Moreover, we include (global) sink states and . The observation sets remain unchanged. The transitions of the new game follow the original transitions as long as Nature stays silent: for . When Nature decides to deviate from the intended action of a player , his local state changes from to : for and ; for the transitions in the game copy where player is corrupted, we set for , and ; all other moves involving exposure actions lead to . The (temporal) winning condition of the new game consists of the plays in where Nature remained silent and of the plays that reach .
We can analyse the exposure game in terms of the Knowledge of Preconditions principle formulated by Moses in [15]. Once a deviation has occurred, the coalition can win only by reaching the winning sink state via a simultaneous consensus action, which requires common knowledge of the identity of the deviator. Conversely, deviator detection strategies in the original game can be readily used to win the exposure game.
Lemma 2.
Every coordination strategy for the coalition in the exposure game against Nature corresponds to a deviator-detection strategy with respect to the target set $T$ in the original game $G$, and vice versa.
3.2. General undecidability
The translation of deviator detection games into coordination games shows the problem from a different angle, but it does not bring us closer to an algorithmic solution. In the general setting of imperfect monitoring, we obtain coordination games between multiple players with imperfect information, for which the synthesis problem is undecidable, as pointed out in Theorem 1.
Indeed, it turns out that under imperfect monitoring, detecting deviators is no easier than coordinating against an opponent to reach a target set.
Theorem 3.
The synthesis problem for deviator detection strategies is undecidable for games with imperfect monitoring.
Proof.
Consider an arbitrary coordination game $G$ with three players $1$, $2$, and $3$ where the coalition $\{1, 2\}$ seeks to reach a set $F$ of states under imperfect monitoring. We reduce the synthesis problem for this game to one in a deviator detection game $G'$ among four players $1$, $2$, $3$, and $4$, in which $1$ and $2$ play the same role as in $G$ whereas $3$ and $4$ both take the role of player $3$. The new game contains two disjoint copies $G_3$, $G_4$ of $G$; the actions of $4$ are ignored in the former, and those of $3$ in the latter. The game starts in a fresh state at which it loops with a fixed action profile; the designated set $T$ consists only of this looping play. The actions of players $1$ and $2$ at this state are perfectly observable to all players, so any deviation of theirs from the loop is detected instantly. In contrast, the deviations of player $3$ or $4$ generate the same (fresh) observation to $1$ and $2$, and they lead to the initial state of $G_3$ and $G_4$, respectively. These two component games evolve in the same way with the only difference that, when switching to a target state of $F$ in $G_3$, an observation identifying player $3$ is sent to all players, whereas an observation identifying player $4$ is sent when reaching $F$ in $G_4$.
Thus, for any deviator detection strategy $s$, upon deviation of either $3$ or $4$ from the initial loop, players $1$ and $2$ must coordinate to reach the target set $F$ to identify the deviator. Hence, $s$ yields a solution of the coordination problem. Conversely, any solution of the coordination problem leads to a state in $F$ where the deviator is revealed, so it provides a deviator detection strategy. Since the synthesis problem for coordination problems is undecidable, according to Theorem 1, it follows that the synthesis problem for deviator detection strategies is also undecidable. ∎
4. Perfect monitoring of states
As we have seen in the previous section, the algorithmic intractability of games where the global state can be hidden from the players over an unbounded duration of time is preserved when we move from coordination to the deviator detection problem. However, our setting bears two sources of uncertainty: the global state and the played action. In this section we consider the case where the uncertainty comes only from the actions played by the other players. Indeed, this is a generalisation of the setting of infinitely repeated games, which can be seen as games with only one global state.
As an example, consider the following simple variant of a beeping model [6]. There are $n$ nodes in a network represented by an undirected graph. The nodes can communicate synchronously. In every round, a node can either beep or stay silent. A silent node can observe whether at least one of its neighbours beeped. We assume that the network is commonly known, and we are interested in distributed protocols under which some temporal condition is ensured, e.g., no more than a quarter of the nodes beep in the same round, and that are additionally deviator-proof, in the sense that whenever a node deviates from the protocol, the protocol followed by the remaining nodes allows them to reach a consensus on the identity of the deviator. This question can be represented as a deviator-detection problem among $n$ players, each with two actions (beep or stay silent) and two observations, telling whether any neighbour beeped or not in the current round. As the effect of an action profile is the same in any round, the game has only one global state. Still, the synthesis problem shows some complexity. Partly, this is due to the structure of the observation functions encoded by the network graph. For instance, one can observe that no deviator-detection strategy can exist on networks that are not two-connected: any deviation has to be detected by at least two witnesses, and every node that is not a direct witness needs to be finally informed via at least two disjoint paths. But the greater challenge comes from the dynamics of communicating the identity of the deviator: in contrast to the more traditional synthesis problems for temporal conditions, whether a play $\pi$ is successful is not determined only by the strategic choices taken along $\pi$ itself, but also depends on the choices taken on histories connected to $\pi$ via the players’ indistinguishability relations.
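The observation structure of this beeping example, and the necessary condition on the network, can be sketched as follows. The adjacency-list encoding and the function names are assumptions; following the game formulation above, every node (not only a silent one) is given the observation of whether some neighbour beeped. The two-connectivity test is a simple quadratic check that removes one node at a time, chosen for readability rather than efficiency.

```python
def beep_observations(neighbours, beeps):
    """For each node, tell whether at least one neighbour beeped in this round."""
    return {v: any(beeps[u] for u in neighbours[v]) for v in neighbours}


def is_connected(neighbours, removed=None):
    """Check connectivity of the graph after deleting the node `removed`."""
    nodes = [v for v in neighbours if v != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        v = stack.pop()
        for u in neighbours[v]:
            if u != removed and u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == len(nodes)


def is_two_connected(neighbours):
    """At least three nodes, connected, and no single node whose removal disconnects."""
    return (
        len(neighbours) >= 3
        and is_connected(neighbours)
        and all(is_connected(neighbours, removed=v) for v in neighbours)
    )


cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1]}
```

On the four-node cycle, a beep by node 0 is observed by its two neighbours 1 and 3 but not by node 2, which must therefore learn about it indirectly; the three-node path fails the two-connectivity test because removing the middle node separates the ends.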
Concretely, we consider games that allow perfect monitoring of the state in the sense that for every observation sequence $\beta$ received by any player along a history, there exists precisely one global state that is reachable by a history with observation $\beta$. In other words, all histories in an information set of a player end at the same global state. The condition is obviously met if we include the current global state in the observation of each player. Indeed, every finite game with perfect monitoring of the state can be transformed effectively into one where all the players can observe the current state.
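The condition can be illustrated on a finite sample of histories with the following sketch. It only tests the listed histories and is not a decision procedure for a whole game; the names `obs` (observation sequence of a player along a history) and `final_state` (last global state of a history) are hypothetical helpers supplied by the caller.

```python
def has_perfect_state_monitoring(histories, players, obs, final_state):
    """Check that, among the given histories, all histories in the same
    information set of a player end at the same global state."""
    for i in players:
        state_for = {}  # observation sequence -> the global state it pins down
        for h in histories:
            key = obs(i, h)
            if state_for.setdefault(key, final_state(h)) != final_state(h):
                return False
    return True


# Toy sample (an assumption): histories written as (initial state, action, final state).
sample = [("q0", "a", "q1"), ("q0", "b", "q1"), ("q0", "c", "q2")]
last = lambda h: h[-1]
sees_state = lambda i, h: h[-1]    # every player observes the current state
sees_nothing = lambda i, h: "same" # every player receives a constant observation
```

With `sees_state`, the two histories ending in `q1` share an information set but agree on the final state, so the check succeeds; with `sees_nothing`, histories ending in `q1` and `q2` are confused and the check fails.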
Our main technical result establishes that, under perfect monitoring of states, the deviator detection problem is algorithmically tractable in spite of imperfect private monitoring of actions.
Theorem 4.
The synthesis problem for deviator-detection strategies is effectively solvable for games with perfect monitoring of the state and imperfect monitoring of actions.
The proof relies on a more general result which states, informally, that the synthesis problem for coordination games with a bounded amount of uncertainty is algorithmically tractable. To formulate this more precisely, let us fix a set $N$ of players with their sets of actions, observations, and local states; the set includes a designated player Nature. Consider a finite collection $G_1, \dots, G_m$ of games with perfect monitoring of the state over the fixed action, observation and state spaces, together with a winning condition $W$ common to all these games. We define the sum game over the collection as a game with a fresh initial node at which Nature chooses the initial node of any of the games in the collection; no information about this move is delivered to the other players in $N$. Note that the sum may not allow perfect monitoring of the state. For this sum game, we consider the task to synthesise a coordination strategy for the coalition to ensure that the outcome is either winning with respect to $W$ or reveals the initial choice of Nature. That is, we require, for every play $\pi$ which follows the strategy, that at some history in $\pi$, the players attain common knowledge about the component game that has been chosen. We call this a revelation game over $G_1, \dots, G_m$ with condition $W$.
In game-theoretic terminology, the sum game constructed above is actually a game of incomplete information: the uncertainty about the global state of the game is due to not knowing which of the finitely many component games is being played. Nevertheless, as the component games deliver different observations, the players may be able to recover this missing information. It turns out that this restricted form of imperfect information is algorithmically tractable. The setting is similar to that of multi-environment Markov decision processes studied in [18].
Theorem 5.
The synthesis problem is effectively solvable for revelation games on components with perfect monitoring of the state. Moreover, the set of all winning strategies admits a regular representation.
The idea is to keep track of the knowledge that players have about the index of the actual component game while the play proceeds. This knowledge can be represented by epistemic structures similar to the ones used in [2]. Here it is sufficient to consider epistemic structures on a subset $K$ of component indices, with epistemic equivalence relations $\sim^i$ relating indices $k$, $l$ whenever player $i$ considers it possible to be in component $l$ if the actual history is in component $k$. The tracking construction associates to each history $\pi$ a structure that is strongly connected via these relations; we call this structure the epistemic state of $\pi$.
Intuitively, the construction represents the actions in the original game abstractly by their effect on the uncertainty about the component. In contrast to the concrete actions in the game, which can be monitored only imperfectly, the abstract updates on the epistemic structure can be monitored perfectly; the resulting game is thus solvable with standard methods as one with perfect information, where the winning condition asks to satisfy $W$ or to reach an epistemic structure over a singleton, representing that the players attain common knowledge about the actual component. The perfect-information solution yields a regular representation of the set of winning strategies over a product alphabet of global game states and epistemic states. In the full paper, we show that this abstraction can be maintained by a finite-state construction and that it allows one to represent a solution whenever one exists.
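The flavour of this tracking can be conveyed by a simplified sketch; it is not the construction of the full paper. It assumes that the observation each player would receive in each candidate component in the current round is given as input (the hypothetical table `round_obs[i][k]`), refines the players' equivalence relations accordingly, and computes the set of component indices connected to the actual one. Common knowledge of the component is attained when this set is a singleton.

```python
def refine(partitions, round_obs):
    """Refine each player's partition of component indices by the observation
    she would receive in each component this round.

    partitions: dict player -> dict index -> block label (hashable)
    round_obs:  dict player -> dict index -> observation (hashable)
    """
    return {
        i: {k: (block, round_obs[i][k]) for k, block in part.items()}
        for i, part in partitions.items()
    }


def epistemic_state(partitions, actual):
    """Indices connected to `actual` through chains of the players' relations."""
    state = {actual}
    frontier = [actual]
    while frontier:
        k = frontier.pop()
        for part in partitions.values():
            for l, block in part.items():
                if l not in state and block == part[k]:
                    state.add(l)
                    frontier.append(l)
    return frozenset(state)


# Toy run (an assumption): three components, two players, no initial information.
start = {i: {k: () for k in range(3)} for i in range(2)}
round1 = {0: {0: "x", 1: "x", 2: "y"}, 1: {0: "u", 1: "v", 2: "v"}}
round2 = {0: {0: "x", 1: "z", 2: "z"}, 1: {0: "u", 1: "v", 2: "v"}}
after1 = refine(start, round1)
after2 = refine(after1, round2)
```

After the first round, each player alone can rule out some component, but the indices remain connected through the other player's uncertainty; after the second round, component 0 is isolated, so if it is the actual one its identity has become common knowledge.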
To prove Theorem 4 using the result of Theorem 5, we view the deviator detection problem as a revelation game. Towards this, we adapt the exposure game constructed in Subsection 3.1 for a given deviator detection problem with target set $T$. The exposure game is already close to the setting of revelation games, but there is one twist: besides choosing the deviator, in the exposure game Nature can also choose the period in which to deviate. To account for this, we transform the exposure game by letting Nature pick a candidate deviator $i$ in the first move; this choice is hidden from the other players. In every later round, Nature can choose to either remain silent or to corrupt the action of player $i$. As a target condition for the new game, we fix the set of all plays in the target set $T$ where Nature remained silent. The obtained revelation game has the same set of solutions as the deviator detection problem at the outset.
5. Conclusion
Thus what we have here is a building block for constructing equilibria in games based on epistemic states. We are interested primarily in the issue of detecting a deviation from an agreed strategy profile. This task is more specific than constructing equilibria by detecting deviations from the set of distributed strategies that ensure a win, as done, for instance, in [3]. The crucial difference lies in the fact that, in the latter case, the deviation events can be described by a (regular) set of game histories, while in our setting, the notion of deviation is relative to a strategy profile that is not fixed within the game structure. To illustrate the difference, consider the example of a beeping model from Section 4 with the trivial target set that contains all possible plays. Obviously, every strategy profile is an equilibrium here, but deviator-detection strategies remain nevertheless intricate.
Our abstraction from imperfect monitoring of actions to games with perfect information is fairly generic. The central clue is that, if there is only a finite amount of information hidden in the beginning of a play, we can decide whether the coalition can recover it. In the context of distributed systems, there is a wide variety of situations that involve only imperfect observation of actions, and where uncertainty about system states may be bounded. Hence we reasonably expect that these techniques will be applicable, not only for deviator detection, but also for other algorithmic questions on inferring global information in such systems.
Acknowledgements
This work was supported by the Indo-French Joint Research Unit ReLaX (UMI CNRS 2000).
References
 [1] Massimiliano Amarante, Recursive structure and equilibria in games with private monitoring, Economic Theory 22 (2003), no. 2, 353–374.
 [2] Dietmar Berwanger, Łukasz Kaiser, and Bernd Puchala, A perfect-information construction for coordination in games, Proceedings of Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011), LIPIcs, vol. 13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2011, pp. 387–398.
 [3] Patricia Bouyer, Romain Brenguier, Nicolas Markey, and Michael Ummels, Concurrent games with ordered objectives, Foundations of Software Science and Computational Structures FOSSACS 2012. Proc., 2012, pp. 301–315.
 [4] Patricia Bouyer, Romain Brenguier, Nicolas Markey, and Michael Ummels, Pure Nash equilibria in concurrent games, Logical Methods in Computer Science 11 (2015), no. 2:9.
 [5] Krishnendu Chatterjee, Laurent Doyen, Emmanuel Filiot, and Jean-François Raskin, Doomsday equilibria for omega-regular games, Verification, Model Checking, and Abstract Interpretation, LNCS, vol. 8318, Springer, 2014, pp. 78–97.
 [6] Alejandro Cornejo and Fabian Kuhn, Deploying wireless networks with beeps, Distributed Computing: 24th International Symposium, DISC 2010. Proceedings (Nancy A. Lynch and Alexander A. Shvartsman, eds.), Springer, 2010, pp. 148–162.
 [7] Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi, Reasoning about knowledge, MIT Press, 1995.
 [8] B. Finkbeiner and S. Schewe, Uniform distributed synthesis, Proc. of Logic in Computer Science (LICS’05), IEEE, 2005, pp. 321–330.
 [9] Drew Fudenberg, David I Levine, and Eric Maskin, The Folk Theorem with Imperfect Public Information, Econometrica 62 (1994), no. 5, 997–1039.
 [10] Julian Gutierrez and Michael Wooldridge, Equilibria of concurrent games on event structures, Computer Science Logic (CSL) and Logic in Computer Science (LICS) (New York, NY, USA), CSL-LICS ’14, ACM, 2014, pp. 46:1–46:10.
 [11] Joseph Y. Halpern, Computer science and game theory: A brief survey, CoRR abs/cs/0703148 (2007).
 [12] Michihiro Kandori and Hitoshi Matsushima, Private observation, communication and collusion, Econometrica 66 (1998), no. 3, pp. 627–652.
 [13] Orna Kupferman and Moshe Y. Vardi, Synthesizing distributed systems, Proc. of LICS ’01, IEEE Computer Society Press, June 2001, pp. 389–398.
 [14] George Mailath and Larry Samuelson, Repeated games and reputations: Long-run relationships, Oxford University Press, 2006.
 [15] Yoram Moses, Relating knowledge and coordinated action: The knowledge of preconditions principle, Theoretical Aspects of Rationality and Knowledge, TARK 2015, Proc., EPTCS, vol. 215, 2015, pp. 231–245.
 [16] Gary L. Peterson and John H. Reif, Multiple-Person Alternation, Proc. 20th Annual Symposium on Foundations of Computer Science (FOCS 1979), IEEE, 1979, pp. 348–363.
 [17] Amir Pnueli and Roni Rosner, Distributed reactive systems are hard to synthesize, Proceedings of the 31st Annual Symposium on Foundations of Computer Science, (FoCS ’90), 1990, pp. 746–757.
 [18] Jean-François Raskin and Ocan Sankur, Multiple-Environment Markov Decision Processes, Foundation of Software Technology and Theoretical Computer Science (FSTTCS 2014), LIPIcs, vol. 29, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2014, pp. 531–543.