Model checking coalitional games in shortage resource scenarios

07/17/2013 ∙ by Dario Della Monica, et al. ∙ Reykjavik University, University of Salerno

Verification of multi-agent systems (MAS) has recently been studied taking into account the need to express resource bounds. Several logics for specifying properties of MAS have been presented in a variety of scenarios with bounded resources. In this paper, we study a different formalism, called Priced Resource-Bounded Alternating-time Temporal Logic (PRBATL), whose main novelty consists in moving the notion of resources from a syntactic level (part of the formula) to a semantic one (part of the model). This allows us to track the evolution of resource availability along the computations and provides us with a formalism capable of modeling a number of real-world scenarios. Two relevant aspects are the notion of global availability of the resources on the market, which are shared by the agents, and the notion of price of resources, which depends on their availability. In a previous work of ours, an initial step towards this new formalism was introduced, along with an EXPTIME algorithm for the model checking problem. In this paper we analyze the features of the proposed formalism in more depth, also in comparison with previous approaches. The main technical contribution is the proof of the EXPTIME-hardness of the model checking problem for PRBATL, based on a reduction from the acceptance problem for Linearly-Bounded Alternating Turing Machines. In particular, since the problem has multiple parameters, we show two fixed-parameter reductions.


1 Introduction

Verification of multi-agent systems (MAS) has been investigated by several research groups in computer science over the last ten years [Dastani:2010:SVM:1855030]. Most of the research is based on logical formalisms, the best known perhaps being the Alternating-time Temporal Logics (ATL) [AHK02] and the Coalition Logic (CL) [Pauly01, Pauly02], both oriented towards the description of collective behaviors and used as specification languages for open systems. These scenarios are hence naturally modeled as games. In [Goranko01] it has been shown that CL can be embedded into ATL. Recently, these two logics have been used for the verification of multi-agent systems enhanced with resource constraints [ALNR09, ALNR10, BNFN09, BF10, dnp11]. The intuitive idea is that agent actions consume and/or produce resources, thus the choice of a given action by an agent is subject to the availability of the resources. In [ALNR09], Alechina et al. introduce the logic Resource-Bounded Coalition Logic (RBCL), whose language extends that of CL with an explicit representation of resource bounds. In [ALNR10], the same authors propose an analogous extension for ATL, called Resource-Bounded Alternating-time Temporal Logics (RBATL), and give a model checking procedure whose running time is bounded in terms of the length of the formula to be checked, the size of the model, and the number of resources. However, the problem of determining a lower bound for the model checking problem is left open. In [BF10], Bulling and Farwer introduce two Resource-Bounded Agent Logics, called RAL and RAL*. The former represents a generalization of Alechina et al.’s RBATL, the latter is an analogous extension of ATL* (analogous extensions for, respectively, CTL and CTL* were presented by the same authors in [BNFN09]). The authors study several syntactic and semantic variants of RAL and RAL* with respect to the (un)decidability of the model checking problem.
In particular, while previous approaches only conceive actions consuming resources, they introduce the notion of actions producing resources. It turns out that this new notion makes the model checking problem undecidable. Formulae of the formalisms proposed in [ALNR09, ALNR10, BNFN09, BF10] allow one to assign an endowment of resources to the agents by means of the so-called team operators (borrowed from ATL). The problem is then to determine whether the agents in the proponent team have a strategy for the game to carry out the assigned goals with that bounded amount of resources, whatever the agents in the opponent team do.

In this paper we study a different formalism, called Priced Resource-Bounded Alternating-time Temporal Logic (PRBATL), introduced in [dnp11] in a much less mature version. The key features of this new approach toward the formalization of such complex systems can be summarized as follows.

  • Boundedness of the resources. This is a crucial point in our formalization. In order to model boundedness of the resources, a notion of global availability of resources on the market (or in nature), which evolves depending on both proponent and opponent behaviors, is introduced. Such a global availability is a semantic component (it is part of the structure where the logic is interpreted) and its evolution is tracked during the executions of the system. Agents’ moves are affected by the current global availability (e.g., agents cannot consume an unbounded amount of resources).

  • Resources are shared. Resources are global, that is, they are shared by all the agents. Thus, the agents either consume or produce resources out of a shared pool of bounded capacity, and acquisition (resp., release) of a resource by an agent (regardless of whether the agent belongs to the proponent or the opponent team) implies that the resource will be available in smaller (resp., greater) quantity. In this way, we can model several scenarios where shared resources are acquired at a cost that depends on the current availability of that resource (for example, in concurrent systems where there is competition for resources).

  • Money as a meta-resource. In addition to public shared resources, our setting also allows one to model private resources, that is, resources that are possessed by agents (public resources are present on the market and are acquired by the agents when they need them). The idea is to provide the agents with a unique private resource, money, that can be used to acquire the (public) resources needed to perform the tasks. In this sense, money represents several resource combinations and can be considered as a meta-resource. Unlike the other resources, it is a syntactic component (the money endowment is part of the formula), and it is the only (meta-)resource which is private to an agent.

    At this stage, our formalization only features the possibility of assigning one private resource to agents. Nevertheless, in principle, it is possible to extend the idea to admit a vector of private resources. Furthermore, one could think of including the same resource both in the pool of public resources and in the pool of private ones. For instance, in a car race one of the players (the cars) has some gasoline in the tank (private resource) but needs to acquire more gasoline at the gas station (public resource) to complete the race.

  • Resource production. Production of resources is allowed in a quantity that is not greater than a fixed amount. Thus, we extend the model while still preserving the decidability of the model checking problem. Observe that the constraint we impose still allows us to describe many interesting real-world scenarios, such as acquiring memory by a program, or leasing a car during a trip, or, in general, any release of resources previously acquired. A similar setting has also been considered in [BF10].

  • Opponent power. First observe that we use the standard terminology which separates the role of the agents in the proponent team from that of the agents in the opponent team. This distinction is not within the game structure, but is due to the formula under consideration. Agents of the opponent team are subject to resource availability in choosing the action to perform, in the same way as the agents of the proponent team, thus the opponent team cannot interfere with a proponent strategy by performing actions which either consume or produce too much (see Example 3 in Section 3). However, it is common practice to consider the opponent as having maximum power, in order to look for robust strategies. We give unlimited economic power to the agents in the opponent team, in the sense that at each moment they have enough money to acquire the resources they need for a move, provided that the resources are available.

Actually, in [dnp11] an EXPTIME algorithm for the model checking problem was given, along with a PSPACE lower bound. The main technical contribution here is to provide an EXPTIME lower bound for the model checking problem for PRBATL. This result shows that the model checking problem for this logic is EXPTIME-complete. The hardness proof is obtained by means of a reduction from the acceptance problem for Linearly-Bounded Alternating Turing Machines, known to be EXPTIME-complete [CKS81], to the model checking problem for PRBATL. More precisely, the algorithm given in [dnp11] runs in time exponential in the number of agents, the number of resources, and the size of the binary representation of the maximum component occurring in the initial resource availability vector. To prove here the inherent difficulty with respect to multiple input parameters, we show two reductions: one parametric in the digit size of that maximum component, which assumes that both the number of agents and the number of resources are constant, and another parametric in , which assumes that both and the value of the maximum component are constant.

2 Comparison with related works

In this section we compare our approach with the existing literature, underlining differences and similarities with respect to [ALNR10] and [BF10].

In the work by Alechina et al. [ALNR10], resource bounds only appear in the formulae and are applied solely to the proponent team, but they are not represented inside the model. Indeed, agents of the proponent team are endowed with new resources at the different steps of the system execution. This means that it is possible to ask whether a team can reach a goal with a given amount of resources, but it is not possible to keep track of the evolution of the global availability of resources. Moreover, resources are private to agents of the proponent team (not shared, as in our approach) and resource consumption due to the actions of the opponent is not controlled. Here instead, we keep track of the global availability of resources, whose evolution depends on both proponent and opponent moves. In this way, it is possible to avoid undesired/unrealistic computations of the system such as, for instance, computations that consume resources unboundedly. Let us see a very simple example. Consider the formula . Its semantics is that agents in team have together a strategy which can guarantee that always holds, whatever agents of the opponent team do (without consuming too many resources) and provided the expense of the agents in does not exceed . A loop in the structure where the joint actions of agents consume resources without producing them cannot be a model for . On the contrary, consider the formula , belonging to the formalism proposed in [ALNR10], expressing a similar property, with the only difference that the agents of use an amount of resources bounded by . A model for must contain a loop where the actions of agents in do not consume resources, but the actions of agents in the opponent team may possibly consume resources, leading to an unlimited consumption of resources.

As a further difference, recall that in [ALNR10] actions can only consume resources. Without resource production, the model for many formulae (for example those containing the global operator) must have a loop whose actions do not consume resources (do-nothing actions), and a run satisfying these formulae eventually consists only of such actions. On the contrary, by allowing resource production, we can model more complex situations when dealing with infinite games.

Finally, observe that a similarity with the cited paper is in the role of money, that could be seen as a private resource, endowed to the agents of the proponent team.

Bulling and Farwer [BF10] adopted a “horizontal” approach, in the sense that they explored a large number of variants of a formalism to model these complex systems. In particular, they explored the border between decidability and undecidability of the model checking problem for all such variants, and they showed how the status of a formalism (with respect to decidability of its model checking problem) is affected by (even small) changes in language, model, and semantics. Our work takes advantage of this analysis in order to propose a logic that captures several desirable properties (especially concerning the variety of natural real-world scenarios that it is possible to express), while still preserving decidability. However, our approach presents conceptual novelties that make it difficult to carry out a direct comparison between the formalism presented here and the ones proposed in [BF10]. We are referring here both to the above-mentioned idea of dealing with resources as global entities for which agents compete, and to the notion of cost of resource acquisition (price of the resources) that dynamically changes depending on the global availability of that resource (thus allowing one to model the classic market law that says that getting a resource is more expensive in a shortage scenario). In [BF10], there is no such notion, as resources are assigned to (teams of) agents and proponent and opponent do not compete for their acquisition.

As regards the complexity issue, in [BF10] no complexity analysis (for the model checking problem) is performed, while in [ALNR10] an upper bound is given for RBATL that matches the one given in [dnp11] for PRBATL. The algorithm for PRBATL runs in time exponential in the number of agents, the number of resources, and the digit size of the maximum component occurring in the initial resource availability vector (assuming a binary representation). Analogously, the model checking algorithm for RBATL runs in time exponential in the number of resources, in the digit size of the maximum component of the resource endowment vectors occurring in the team operators of the formula, and in the number of agents (this is implicit in the set of states of the model). Actually, both the number of agents and the number of resources are often treated as constant [ALNR10, AHK02] (without this assumption, the complexity of ATL model checking is shown to be exponential in the number of agents [DBLP:conf/ceemas/JamrogaD05]). However, no complexity lower bound has been exhibited so far. The aim of this paper is to fill this gap, by providing an EXPTIME lower bound for PRBATL.

3 A logical formalization: PRBATL

Syntax. We start by introducing some notation that we will use in the rest of the paper. The set of agents is and a team is any subset of . The integers and will be used throughout the paper to denote the number of agents and resource types (or simply resources), respectively. Let denote the set of global availabilities of resources on the market (or in nature) and let denote the set of money availabilities for the agents, where is the set of natural numbers (zero included). Given a money availability , its -th component is the money availability of agent . (Throughout the paper, symbols identifying vectors are denoted with an arrow on top.) Finally, the set is a finite set of atomic propositions.

The formulae of PRBATL are given by the following grammar:

where , , , and . Formulae of the kind test the current availability of resources on the market. As usual, other standard operators can be considered as abbreviation, e.g., the operator can be defined as , for every formula .

Priced game structure. Priced game structures are defined by extending the definitions of concurrent game structure and resource-bounded concurrent game structure given in, respectively, [AHK02] and [ALNR10].

Definition 1

A priced game structure is a tuple , where:

  • is the finite set of locations; is called initial location.

  • is the evaluation function, which determines the atomic propositions holding true in each location.

  • is the action function giving the number of actions available to an agent at a location . The actions available to at are identified with the numbers (no ambiguity will arise from the fact that actions of different agents are identified with the same numbers) and a generic action is usually denoted by . We assume that each agent has at least one available action at each location, which can be thought of as the do-nothing action, and we assume that it is always the first.

  • is a function that maps each location to the set of vectors . Each vector, called action profile and denoted by , identifies a choice among the actions available for each agent in the location . (The action of the agent in is .)

  • is a partial function, where , with , defines at location the amount of resources required by the ’s action . We define , that is the vector whose components are all equal to , for every , (doing nothing neither consumes nor produces resources).

  • is the transition function. For and , defines the next location reached from if the agents perform the actions in the action profile .

  • is the price function. It returns the price vector of the resources (a price for each resource), based on the current resource availability and location, and on the acting agent.

  • is the initial global availability of resources. It represents the resource availability on the market at the initial state of the system.

Note that a negative value in represents a resource consumption, while a positive one represents a resource production. We also consider the extension of the function , called again with the same name, to get the amount of resources required by a given team. Thus, for a location , a team and an action profile , . Moreover, we will use the function that for the tuple returns the vector of the resources which are consumed by an agent , being in state , for an action . This vector is obtained from by replacing the positive components, representing a resource production, with zeros, and the negative components, representing a resource consumption, with their absolute values.
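As an illustration only, the two auxiliary operations just described can be sketched in Python. The function names and the list-based encoding of vectors are our own assumptions, not notation from the paper, and the extension to teams is assumed here to be the componentwise sum of the vectors of the team members.

```python
from typing import Dict, List

Vector = List[int]


def consumption(res_vector: Vector) -> Vector:
    """Keep only what is consumed: negative components become their
    absolute values, positive components (productions) become zero."""
    return [-x if x < 0 else 0 for x in res_vector]


def team_resources(per_agent: Dict[str, Vector], team: List[str]) -> Vector:
    """Assumed team extension: componentwise sum of the resource vectors
    of the actions chosen by the agents of the team."""
    size = len(next(iter(per_agent.values())))
    total = [0] * size
    for agent in team:
        for i, x in enumerate(per_agent[agent]):
            total[i] += x
    return total
```

For instance, an action with resource vector (-2, 3, 0) consumes two units of the first resource and nothing else, whatever it produces.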

Example 1

A priced game structure with two agents and and one resource is depicted in Figure 1. The only atomic proposition is , labeling the locations , , . The action profiles, labeling the transitions in the graph and depicted with square brackets, are as follows. is due to the existence of two actions of and one action of at location , corresponds to a single action of and two actions of at location . In all the other locations the only action profile is corresponding to the existence of a single action of both the agents. The function is represented by parentheses. The price vector is not depicted.

, , ,
,
, , ,
,
, ,
, ,
,
Figure 1: Example of priced game structure .

Semantics. In the following, given a resource availability , by we denote the set . In order to give the formal semantics let us first define the following notions.

Definition 2

A configuration of a priced game graph is a pair . Given two configurations and , and an action profile , we say that if and . A computation over is an infinite sequence of configurations of , such that for each there is an action profile such that .

Let be a computation. We denote by the -th configuration in and by , with , the finite sequence of configurations in . Given a configuration and a team , a function is called A-feasible in c if there exists an action profile with for all and . In this case we say that extends .

Definition 3

A strategy of a team is a function which associates to each finite sequence of configurations , a function which is -feasible in .

In other words, a strategy returns a choice of the actions of the agents in the team , considering only those actions whose resource consumption does not exceed the available amount and whose resource production does not exceed the amount consumed so far. Clearly, this constraint will limit both proponent and opponent team.
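The constraint stated in words above can be sketched as a bounds check on the updated availability. This is a minimal sketch under our own assumptions: the function name is hypothetical, vectors are plain lists, and "production does not exceed the amount consumed so far" is read as "the availability never exceeds its initial value".

```python
from typing import List, Optional


def next_availability(current: List[int],
                      joint_effect: List[int],
                      initial: List[int]) -> Optional[List[int]]:
    """Return the availability after a joint move, or None if the move
    is not allowed. joint_effect holds negative entries for consumption
    and positive entries for production (summed over all agents)."""
    updated = [m + d for m, d in zip(current, joint_effect)]
    # Assumption: every component must stay between 0 and its initial value.
    allowed = all(0 <= u <= cap for u, cap in zip(updated, initial))
    return updated if allowed else None
```

With one resource and initial availability 1, consuming one unit is allowed once, after which a second consumption is rejected until a unit has been released.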

For each strategy of a team and for each sequence of configurations , there are several possibilities for the next configuration , depending on the different choices of the opponent team . Anyway, fixed a strategy of the opponent team, there is at most one action profile obtained according to both the strategies, that is the action profile extending both , given by the strategy , and , given by the strategy (i.e. is such that , for and ). A computation , is the outcome of the strategies and from the configuration if, for each , there is an action profile obtained according to both and , such that . Given a strategy and a configuration , denotes the set of the outcomes of and from , for all the strategies of the team . Observe that, given a finite sequence of configurations , if the action profile according to the two strategies is not such that , then there is no next configuration. Thus outcome of the strategies and from a given configuration may be undefined (recall that we consider only infinite computations).

Example 2

Consider the priced game structure in Figure 1, with teams and , one resource type and initial global availability . Let be a sequence of configurations (of length ). Team has two possible strategies in , one for each possible action of agent , and team has one strategy for the single available action of agent . Suppose that, according to the strategy , agent chooses to perform the action (), then the action profile is performed and one unit of the unique resource is consumed. In the obtained configuration the agent has one available action while the agent has two actions. Anyway cannot return the action for the agent , since this action would require an amount of the resource greater than , which is the current availability. Thus only the configuration can be reached and the computation is the only one that belongs to .

Now we introduce the concept of consistent strategy. Two properties have to be satisfied: first, the outcomes starting from are always defined; second, the agents of the proponent team have enough money to realize the chosen actions.

Definition 4

Let , be a configuration, be the proponent team, and be the opponent team. A strategy of is said to be consistent with respect to and (-strategy), if

  1. for any strategy of , the outcome of and from the configuration is defined,

  2. for every , with , for every and : .

In the above condition the dot operator denotes the usual scalar product of vectors. Observe that only the money availability of the team is tested. Actually, we suppose that the opponent team always has enough money to make its choice. Notice also that the actions producing resources do not cause a reimbursement of money to the agents. As is usual when dealing with temporal logics, we guarantee that priced game structures are non-blocking, in the sense that at least a -strategy exists for a given team . Indeed, agents of can always jointly choose the do-nothing action.
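The money condition can be illustrated with a short sketch. The names below are hypothetical, and we assume (this is our reading, not a statement of the definition) that the expense of an agent is accumulated along a finite prefix of a computation and compared with its money endowment. The cost of a single action is the scalar product of the price vector and the consumption vector, and productions are not reimbursed.

```python
from typing import List, Tuple


def action_cost(prices: List[int], res_vector: List[int]) -> int:
    """Scalar product of the price vector and the consumption vector."""
    consumed = [-x if x < 0 else 0 for x in res_vector]
    return sum(p * c for p, c in zip(prices, consumed))


def affordable(steps: List[Tuple[List[int], List[int]]], money: int) -> bool:
    """steps lists, for one agent, the (price vector, resource vector)
    pairs met along a finite prefix. Assumption: the total expense must
    never exceed the agent's money endowment."""
    spent = 0
    for prices, res_vector in steps:
        spent += action_cost(prices, res_vector)
        if spent > money:
            return False
    return True
```

Since prices may depend on the current availability, the same action can appear in `steps` with different price vectors at different points of the prefix.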

A formula of PRBATL is evaluated with respect to a priced game structure and a configuration . The definition of the semantics is completed by the definition of the satisfaction relation :

  • iff

  • iff

  • iff and

  • iff there exists a -strategy such that, for all , it holds that

  • iff there exists a -strategy such that, for all , there exists such that and, for all , it holds that

  • iff there exists a -strategy such that, for all , it holds that for all

  • iff where .

Given a formula and a priced game structure , we say that satisfies , , if where . The model checking problem for PRBATL consists in verifying whether .

Example 3

Consider the priced game structure in Figure 1, with teams and . A formula holds true in the configuration , provided that and are enough to make the move. Indeed, and together are able to force the computation to reach the (one unit of resource is consumed). From such a configuration, the opponent team cannot force the computation into , as the action is not allowed for (no resources are available to perform the action), and thus holds. Instead, is false in the configuration (actually in each configuration , with ), because is reached after the execution of the first transition, and in that configuration action for in is allowed, leading to . Finally, notice that the formula is false also when evaluated in , as the only possible transition is the one leading from to (no resources are available to perform action for agent ).

4 Complexity lower bounds for the model checking problem

In [dnp11], the authors presented an algorithm for model checking PRBATL, providing an exponential upper bound for the problem. In particular, the proposed algorithm runs in time exponential in the number of agents, the number of resources, and the size of the binary representation of the maximum component occurring in the initial resource availability vector. In this section we prove that an algorithm that behaves asymptotically better cannot exist, thus proving that the problem is EXPTIME-complete. To prove the inherent difficulty with respect to the multiple input parameters, we show two reductions: one parametric in the digit size of that maximum component, which assumes that both the number of agents and the number of resources are constant, and the other parametric in , this time assuming constant both and the value of the maximum component. We conjecture the existence of a third EXPTIME reduction, in which and are constant and the parameter is . In fact, if this were not the case, it would be possible to improve the proposed model checking algorithm so that its complexity would not be exponential in .

We first recall the formalism of linearly-bounded alternating Turing machines and the notion of hierarchical representation, a succinct way of representing priced game structures inspired by the work done in [AY01] for classical Kripke structures. Finally, we present the two reductions from the acceptance problem for linearly-bounded alternating Turing machines, known to be EXPTIME-complete [CKS81], to the model checking problem for PRBATL.

4.1 Linearly-bounded alternating Turing Machines

A linearly-bounded alternating Turing machine is a tuple , where is the set of states, partitioned into (universal states) and (existential states); is the set of tape symbols, including the ‘blank’ symbol , and two special symbols and , denoting the left and right tape delimiters; is the instruction set; is the initial state.

Symbols from are stored in the tape cells, and the first and the last cell of the tape store, respectively, the symbols and . A tape configuration is a sequence of the symbols stored in the tape cells, and keeps track of a head cell. A configuration is a pair of a state and a tape configuration , and is the set of the configurations. The initial configuration is , where contains the input, possibly followed by a sequence of blanks, and its head cell stores the first input symbol.

An instruction is also denoted , where is called a full state. Its intuitive meaning is as follows: “whenever the machine is in the state and the symbol in the head cell is , then the machine switches to state , the symbol in the head cell is replaced with , and the head position is moved to the left or to the right (according to )”. An execution step of the machine is denoted , where , and is the configuration reached from after the execution of the instruction . Let is an execution step, for some . All the tape configurations are linear in the length of the input and we follow the common practice of only considering machines whose tape length does not vary during the computation. We can also assume that these machines have no infinite computations, since any such machine can be transformed into another one that accepts the same language and halts in a finite number of steps. The transformed machine counts the number of execution steps and rejects any computation whose number of steps exceeds the number of possible configurations.

The acceptance condition is defined recursively. A configuration is said to be accepting if one of the following conditions holds: its state is universal and every configuration reachable from it in one execution step is accepting, or its state is existential and there exists a configuration reachable from it in one execution step that is accepting. Notice that a universal (existential) state always accepts (rejects) if no execution step is possible. A machine accepts on an initial input tape , if the initial configuration is accepting.
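The recursive acceptance condition can be sketched as follows. This is only an illustration: the function `accepts` and its two callback parameters are hypothetical names, and the sketch relies on the assumption made above that the machine has no infinite computations, otherwise the recursion would not terminate.

```python
from typing import Callable, Hashable, Iterable


def accepts(config: Hashable,
            is_universal: Callable[[Hashable], bool],
            successors: Callable[[Hashable], Iterable[Hashable]]) -> bool:
    """Recursive acceptance check for an alternating machine.

    A universal configuration accepts if all successors accept (hence it
    accepts when there are none); an existential one accepts if some
    successor accepts (hence it rejects when there are none)."""
    nxt = list(successors(config))
    if is_universal(config):
        return all(accepts(c, is_universal, successors) for c in nxt)
    return any(accepts(c, is_universal, successors) for c in nxt)
```

A call such as `accepts("e0", lambda c: c in universal, lambda c: graph[c])`, with `graph` a dictionary from configurations to successor lists, evaluates the condition on a small explicit configuration graph. The naive recursion may revisit configurations; memoization would avoid this but is omitted for brevity.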
Hierarchical representation. In order to exhibit our encoding proposal, we make use of a hierarchical representation analogous to the one described in [AY01, LNPP03, LNPP08] for model checking, and in [MNP2008] for module checking procedures. Given a finite state machine, the idea of hierarchical representation is to replace two or more substructures of the machine that are structurally equivalent by another (structurally equivalent) module, which is a finite state machine itself. The use of hierarchical representation results in an exponentially more succinct representation of the system, which leads (in most cases) to more efficient model checking procedures (in the other cases, this does not yield a more efficient behavior, as the analysis requires a flattening of the machine itself, thus incurring an exponential blow-up in its size).

In our context, this idea can be suitably adapted to deal with the presence of resources, as follows. Modules do not represent structurally equivalent substructures, but substructures that have the same impact on the values of resource variables. In principle, whenever the analysis is focused on the evolution of resource variables, it makes sense to consider as equivalent two substructures that can possibly differ in their structure but whose effect on the set of resource variables is exactly the same. This approach could be thought of as a hierarchical representation based on functional equivalence between substructures, as opposed to the classical notion of hierarchical representation based on structural equivalence.

4.2 A reduction from the acceptance problem for linearly-bounded alternating Turing machines

Given a linearly-bounded alternating Turing machine and an input tape configuration , we provide a priced game structure , with two agents and , and a formula such that if and only if the machine accepts on .

In the following, we exhibit the game structure by using a graphical (hierarchical) representation (Figures 2-7 in Appendix). Notice that only significant information is explicitly shown in the pictures. In particular, labels on transitions (arcs) represent consumptions/productions of resources due to the execution of the joint move (proponent and opponent moves) associated with that transition. For example, the label “” on the loop transition of Figure (b)b means that the actions associated with the transition will consume 1 unit of the (type) resource and 10 units of , and will produce 1 unit of the resource and 10 units of . The availability of the other resources is unchanged, so the corresponding information is omitted.

The reduction uses the three resource variables , , and to encode the tape configuration, plus three auxiliary resource variables , , and , that will be useful during the construction. Moreover, we associate with the above set of variables the set of counterbalanced variables . The idea behind counterbalanced variables, which is also the key idea of the reduction, is to design the game structure so that every consumption (resp., production) of a resource, say for instance , is matched by a corresponding production (resp., consumption) of its counterbalanced . In particular, this holds inside each module of the hierarchical structure. Thus, at every module’s entry and exit points throughout the computation, the sum of the availability of a resource variable and that of its counterbalanced variable is kept constant, equal to a value , which depends on the input of the . This allows us to force the execution of specific transitions at specific availabilities of resource variables. Consider, for example, the node of Figure (b)b with 2 outgoing transitions, one of which is a loop transition. The presence of 2 outgoing transitions means that either the proponent or the opponent can choose between 2 moves. However, this freedom is only potential, since at any moment of the computation the choice of the next move by the proponent/opponent is constrained by the resource availability: if the loop transition is enabled, then the availability of the resource is greater than 0, and thus the availability of its counterbalanced variable is less than , which means that the other transition, which consumes units of the resource , is disabled. Conversely, if the non-loop transition is enabled, there are units of the resource available, and thus the availability of the resource is 0, which means that the loop transition is disabled. Thus, by taking advantage of the features of counterbalanced variables, we can force the executions to behave in an essentially deterministic way.
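The effect of counterbalanced variables can be illustrated with a minimal Python sketch. It is not part of the construction itself: the names `Pair`, `total`, `enabled_moves`, and `drain` are hypothetical, and the sketch assumes a node whose loop transition consumes 1 unit of a resource (producing 1 unit of its counterbalanced variable) and whose other transition requires `total` units of the counterbalanced variable, where `total` stands for the constant sum discussed above.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    """A resource and its counterbalanced twin (hypothetical names)."""
    value: int    # availability of the resource
    counter: int  # availability of the counterbalanced resource


def enabled_moves(p: Pair, total: int) -> list:
    """Return the transitions enabled at the current availabilities."""
    moves = []
    if p.value >= 1:        # the loop consumes 1 unit of the resource
        moves.append("loop")
    if p.counter >= total:  # the exit needs `total` units of the twin
        moves.append("exit")
    return moves


def drain(p: Pair, total: int) -> int:
    """Follow the unique enabled move until the exit fires.

    Returns the number of loop iterations performed.
    """
    steps = 0
    while True:
        moves = enabled_moves(p, total)
        # The invariant value + counter == total leaves exactly one choice.
        assert len(moves) == 1
        if moves[0] == "exit":
            return steps
        p.value -= 1    # consumption of the resource ...
        p.counter += 1  # ... matched by production of its twin
        steps += 1
```

Under these assumptions, starting from `Pair(3, 7)` with `total = 10`, the loop is the only enabled move exactly three times, after which only the exit is enabled. The apparent choice between two moves never materializes.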

Encoding of the tape. Without loss of generality, we consider on input alphabet , thus is the set . Recall that the symbols , , and denote the ‘blank’ symbol, the left delimiter, and the right delimiter, respectively. Tape symbols are encoded by the digits and , in a natural way: encodes the ‘blank’ symbol, and encode the input symbols and , and and encode the left and right delimiters. The tape configuration is encoded by means of the three resource variables , , and . The value of ranges over the set and encodes the value stored in the cell currently read by the head (according to the above encoding of tape symbols into digits). The value of encodes the tape configuration to the left of the current head position in a forward fashion. The value of encodes the tape configuration to the right of the current head position in a reverse fashion, that is, encodes the reverse of the string corresponding to the tape configuration to the right of the current head position. As an example, consider the tape configuration , where the symbol read by the head is the underlined one. Such a configuration is encoded by means of the three resource variables as follows: , , and . Note that the length of the representation of the three variables , , and is proportional to the length of the tape configuration, which is at most linear in the size of the input, namely . Using such an encoding, the machine operation “shift the head to the left” can be represented by means of the following operations on resource variables:

  • the new value of is

  • the new value of is ,

  • the new value of is ( is the integer division),

The operation “shift the head to the right” can be encoded analogously.

Notice that, in order to encode in polynomial time the operations of shifting the head to the left and to the right, we encode the string to the right of the current head position in reverse order. Indeed, in this way the symbol stored in the cell immediately to the right of the head corresponds to the least significant digit of , and thus can be accessed by using the modulo operation ().
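The encoding described above can be sketched in Python as follows. This is an illustrative sketch, not the construction itself. The digit codes in `CODE`, the symbols `_`, `a`, `b`, `<`, `>`, and the names `left`, `cur`, `right` are all assumptions made for the example. In particular, the sketch assumes that the delimiters are encoded by nonzero digits, so that no leading digit is lost when a string of digits is read as a decimal number.

```python
# Hypothetical digit codes: blank, two input symbols, two delimiters.
CODE = {"_": 0, "a": 1, "b": 2, "<": 3, ">": 4}


def to_int(digits):
    """Read a sequence of digits as a decimal number (empty -> 0)."""
    n = 0
    for d in digits:
        n = n * 10 + d
    return n


def encode(tape: str, head: int):
    """Encode a tape configuration as the triple (left, cur, right).

    left  : cells to the left of the head, in forward order
    cur   : the cell under the head
    right : cells to the right of the head, in reverse order
    """
    digits = [CODE[s] for s in tape]
    left = to_int(digits[:head])
    cur = digits[head]
    right = to_int(reversed(digits[head + 1:]))
    return left, cur, right


def shift_left(left, cur, right):
    """Move the head one cell to the left."""
    # The old head cell becomes the least significant digit of right;
    # the new head cell is the least significant digit of left.
    return left // 10, left % 10, right * 10 + cur


def shift_right(left, cur, right):
    """Move the head one cell to the right (symmetric to shift_left)."""
    return left * 10 + cur, right % 10, right // 10
```

For instance, with the tape `<ab_>` and the head on `b`, `encode` returns `(31, 2, 40)`. Shifting left yields `(3, 1, 402)`, which is exactly the encoding of the same tape with the head on `a`. Each shift needs only a multiplication by 10, an addition, a division by 10, and a remainder, which is why the reverse encoding of the right part is convenient.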

Encoding of the instructions. The encoding of the instructions is depicted in Figure 2. Transitions starting from a node labeled represent all the instructions matching the full state of the , that is, all the instructions that can be performed at the full state .

In more detail, given a full state of the machine, with , the encoding of the set of matching instructions is shown in Figure 2(a) (recall that ). Analogously, the encoding of the set of instructions matching the full state , with , is shown in Figure 2(b). Note that the action profiles labeling transitions corresponding to an existential state are such that the first agent has the capability to force a specific transition (instruction) to be executed, depending on the choice of the for the next action, independently of the choice of the other agent . On the other hand, the action profiles labeling transitions corresponding to a universal state are such that the roles of the agents are exchanged.

Figure 2: Encoding of the set of instructions matching a full state of a . Panels: (a) Full state , with ; (b) Full state , with . Both panels are built from the modules write and move.

The representation of Figure 2 is hierarchical and involves the modules write and move. The former encodes the rewriting of the head cell performed by and, to this aim, makes use of one of the following modules (Figure 3), depending on the symbol read by the head, and on the symbol to be written:

  • inc, depicted in Figure 3(a), is used when the rewriting corresponds to an increment, for example, when the symbol has to be written in place of the symbol ;

  • double_inc, depicted in Figure 3(b), is used when the rewriting corresponds to a double increment, for example, when the symbol (encoded as ) has to be written in place of the symbol (encoded as );

  • dec, depicted in Figure 3(c), is used when the rewriting corresponds to a decrement, for example, when the symbol has to be written in place of the symbol ;

  • double_dec, depicted in Figure 3(d), is used when the rewriting corresponds to a double decrement, for example, when the symbol has to be written in place of the symbol .

Obviously, the module does nothing when the symbol to be written corresponds to the symbol currently stored in the head cell.
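The choice among the four sub-modules can be read as a function of the difference between the digit encoding the symbol to be written and the digit encoding the symbol currently read. The following Python sketch illustrates this reading. The function name `write_module` is hypothetical, and the sketch assumes that the digits involved in a rewriting differ by at most 2, which is what the four modules listed above suggest.

```python
from typing import Optional


def write_module(read_digit: int, write_digit: int) -> Optional[str]:
    """Pick the sub-module that turns read_digit into write_digit.

    Returns None when the head cell already holds the desired digit,
    in which case no rewriting is needed.
    """
    by_delta = {
        1: "inc",
        2: "double_inc",
        -1: "dec",
        -2: "double_dec",
        0: None,
    }
    delta = write_digit - read_digit
    if delta not in by_delta:
        # Assumption of this sketch: larger jumps never occur.
        raise ValueError(f"no module for a change of {delta}")
    return by_delta[delta]
```

For example, under this reading, replacing a digit 0 with a digit 2 selects `double_inc`, while replacing 2 with 1 selects `dec`.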

Figure 3: Encoding of the module write. Panels: (a) Module inc; (b) Module double_inc, built from two inc modules; (c) Module dec; (d) Module double_dec, built from two dec modules.

The module move encodes the shift of the head (to the right or to the left). It is designed so that the only location the game can reach next is the one consistent with the value stored in the new head cell (after the shift operation). Figures 4 and 5 depict the sub-modules encoding the operation “shift to right”. The encoding of the operation “shift to left” can be realized analogously.

(a) Module shift_right, built from the sub-modules times_10(), add(,), div_10(), assign(,), and choose_next_state().

(b) Module times_10(), built from the sub-modules assign(,) and to_zero().

to_zero()

to_zero()