# Equilibria in the Tangle

We analyse the Tangle, a DAG-valued stochastic process in which new vertices are attached to the graph at Poissonian times and the attachment locations are chosen by means of random walks on that graph. We prove the existence of ("almost symmetric") Nash equilibria for the system in which a subset of players tries to optimize their attachment strategies. We then present simulations showing that these "selfish" players will nevertheless cooperate with the network by choosing attachment strategies similar to the default one.


## 1 Introduction

In this paper we study the Tangle, a stochastic process on the space of (rooted) Directed Acyclic Graphs (DAGs). This process “grows” in time, in the sense that new vertices are attached to the graph according to a Poissonian clock, but no vertices/edges are ever deleted. When that clock rings, a new vertex appears and attaches itself to locations that are chosen with the help of certain random walks on the state of the process in the recent past (this is to model the network propagation delays); these random walks therefore play the key role in the model.

Random walks on random graphs can be thought of as a particular case of Random Walks in Random Environments: here, the transition probabilities are functions of the graph only, i.e., there are no additional variables (such as conductances, etc.) attached to the vertices and/or edges of the graph; this refers to the well-known relation between reversible Markov chains and electric networks, see e.g. the classical book [7]. Still, this subject is very broad, and one can find many related works in the literature. One can mention the internal DLA models (e.g. [13] and references therein), random walks on Erdős-Rényi graphs [5, 13], or random walks on preferential attachment graphs [4] (which most closely resemble the model of this paper).

The motivation for studying the particular model presented in this paper stems from the fact that it is applied in the IOTA cryptocurrency [1, 19], which uses (nontrivial) DAGs as the primary ledger for the transactions' data. This is different from "traditional" cryptocurrencies such as Bitcoin, where that data is stored in a sequence of blocks (that is, the underlying graph is essentially a chain, after discarding finite forks), also known as a blockchain. An important observation, which motivates the use of more general DAGs instead of blockchains, is that the latter scale poorly: when the network is large, it is difficult for it to achieve consensus on which blocks are "valid" when new blocks arrive too frequently. We also cite [2, 3, 16, 20], which deal with other approaches to using DAGs as distributed ledgers.

The main results of the present paper deal with the following question: what if some participants of the network try to minimize their costs by adopting a behavior different from the "default" one? How will the system behave in such circumstances? To address these kinds of questions, we first provide general arguments to prove the existence of ("almost symmetric") Nash equilibria for the system, see Section 2. Although one can hardly access the explicit form of these equilibria in a purely analytical way, the simulations presented in Section 3 show that the "selfish" players will typically still choose attachment strategies similar to the default one, meaning that they would prefer cooperating with the network rather than simply exploiting it.

Let us also stress that, in this paper, we consider only "selfish" players (those who care only about their own costs but still want to use the network in a legitimate way, i.e., want to issue valid transactions and have them confirmed by the rest of the network); we do not consider the case when there are "malicious" players (those who want to disrupt the network even at a cost to themselves). We are going to treat several types of attacks against the network in subsequent papers.

### 1.1 Description of the model

In the following we introduce the mathematical model describing the Tangle [19].

Let card(A) stand for the cardinality of a (multi)set A. For an oriented multigraph G = (V, E), where V is the set of vertices and E is the multiset of edges, and v ∈ V, we denote by

 deg_in(v) = card{e = (u1, u2) ∈ E : u2 = v},   deg_out(v) = card{e = (u1, u2) ∈ E : u1 = v}

the "incoming" and "outgoing" degrees of the vertex v (counting the multiple edges). In the following, we refer to multigraphs simply as graphs. For u, v ∈ V, we say that u approves v if (u, v) ∈ E. We use the notation A(v) for the set of the vertices approved by v. We say that u references v if there is a sequence of sites u = w0, w1, …, wk = v such that w_{j+1} ∈ A(w_j) for all j, i.e., there is a directed path from u to v. If deg_in(v) = 0 (i.e., there are no edges pointing to v), then we say that v is a tip.
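These notions are straightforward to express in code. Below is a minimal Python sketch of our own (the class and method names are illustrative, not from any IOTA implementation): a multigraph stored as a list of directed edges, with in/out degrees, the approval set A(·), the reference relation, and the tip test.

```python
class Dag:
    """Toy multigraph: a directed edge (u, v) means "u approves v"."""

    def __init__(self):
        self.edges = []                     # multiset of directed edges

    def add_edge(self, u, v):
        self.edges.append((u, v))

    def deg_in(self, v):
        return sum(1 for (_, u2) in self.edges if u2 == v)

    def deg_out(self, v):
        return sum(1 for (u1, _) in self.edges if u1 == v)

    def approves(self, x):
        """A(x): the vertices approved by x (with multiplicity)."""
        return [v for (u, v) in self.edges if u == x]

    def is_tip(self, v):
        """v is a tip iff no edge points to it, i.e. deg_in(v) = 0."""
        return self.deg_in(v) == 0

    def references(self, u, v):
        """True iff there is a directed path (of length >= 1) from u to v."""
        stack, seen = list(self.approves(u)), set()
        while stack:
            w = stack.pop()
            if w == v:
                return True
            if w not in seen:
                seen.add(w)
                stack.extend(self.approves(w))
        return False
```

Here `references` is a depth-first search along approval edges, so `u` references `v` exactly when a directed path of length at least one leads from `u` to `v`.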

Let 𝒢 be the set of all directed acyclic graphs (also known as DAGs, that is, oriented graphs without cycles) with the following properties:

• the graph is finite and the multiplicity of any edge is at most two (i.e., there are at most two edges linking the same vertices);

• there is a distinguished vertex g with deg_out(g) = 0 (this vertex is called the genesis);

• any vertex v ≠ g references g; that is, there is an oriented path (not necessarily unique) from v to g (one can say that the graph is connected towards g).

We now describe the tangle as a continuous-time Markov process on the space 𝒢. The state of the tangle at time t is a DAG T(t) = (V(t), E(t)), where V(t) is the set of vertices and E(t) is the multiset of directed edges at time t. The process's dynamics are described in the following way:

• The initial state of the process is defined by V(0) = {g}, E(0) = ∅.

• The tangle grows with time, that is, V(s) ⊆ V(t) and E(s) ⊆ E(t) whenever s ≤ t.

• For a fixed parameter λ > 0, there is a Poisson process (with rate λ) of incoming transactions; these transactions then become the vertices of the tangle.

• Each incoming transaction chooses two vertices x and y (which, in general, may coincide; the precise selection mechanism will be described below), and we add the edges (v, x) and (v, y). We say in this case that this new transaction v was attached to x and y (equivalently, v approves x and y).

• Specifically, if a new transaction v arrived at time t, then V(t) = V(t−) ∪ {v} and E(t) = E(t−) ∪ {(v, x), (v, y)}.

Let us write

 P^{(t)}(x) = {y ∈ T(t) : y is referenced by x},   F^{(t)}(x) = {z ∈ T(t) : z references x}

for the "past" and the "future" with respect to x (at time t). Note that these introduce a partial order structure on the tangle. Observe that, if t0 is the time moment when x was attached to the tangle, then P^{(t)}(x) = P^{(t0)}(x) for all t ≥ t0. We also define the cumulative weight of the vertex x at time t by

 H^{(t)}_x = 1 + card(F^{(t)}(x)); (1)

that is, the cumulative weight of x is one (its "own weight") plus the number of vertices that reference it. Observe that, for any t, if y approves x then H^{(t)}_x ≥ H^{(t)}_y + 1, and the inequality is strict if and only if there are vertices different from y which also approve x. Also note that the cumulative weight of any tip is equal to 1.
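As a quick illustration (using a toy dict-based representation of our own, not production code), the cumulative weight (1) can be computed directly from the definition:

```python
def cumulative_weight(approves, x):
    """H_x = 1 + card(F(x)), cf. (1): own weight plus the number of
    vertices with a directed path to x.  `approves` maps each vertex to
    the list of vertices it approves (an illustrative toy layout)."""
    def references(z):
        stack, seen = list(approves.get(z, [])), set()
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(approves.get(u, []))
        return x in seen
    return 1 + sum(1 for z in approves if z != x and references(z))
```

On a small example with genesis `g`, where `a` and `b` approve `g` and `c` approves `a` and `b`, the genesis has weight 4 while the tip `c` has weight 1, as it should.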

There is some data associated to each vertex (transaction), created at the moment when that transaction was attached to the tangle. The precise nature of that data is not relevant for the purposes of this paper, so we assume that it is an element of some (unspecified, but finite) set D; what is important, however, is that there is a natural way to say whether a set of vertices is consistent with respect to the data they contain (one may think that the data refers to value transactions between accounts, and consistency means that no account has a negative balance as a result, and/or the total balance has not increased). When it is necessary to emphasize that the vertices of G contain some data, we consider the marked DAG (G, d), where d is a function d : V → D. We define 𝒢[D] to be the set of all marked DAGs (G, d), where G ∈ 𝒢.

### 1.2 Attachment strategies

There is one very important detail that has not yet been explained, namely: how does a newly arrived transaction choose which two vertices in the tangle it will approve, i.e., what is the attachment strategy? Notice that, in principle, it would be good for the whole system if new transactions always preferred to select tips as attachment places, since this way more transactions would be "confirmed" (we discuss the exact meaning of this later; for now, think of "confirmed" as meaning "referenced by many other transactions"). In any case, it is quite clear that the appropriate choice of the attachment strategy is essential for the correct functioning (whatever this could mean) of the system.

It is also important to comment that the attachment strategy of a network node is something "internal" to it: what others can see are the attachment choices of the node, but the mechanism behind them need not be publicly known. For this reason, an attachment strategy cannot be imposed by the protocol.

We now describe a possible choice of the attachment strategy, used to determine where the incoming transaction will be attached. It is also known as the recommended tip selection algorithm since, for the reasons described above, the recommended behavior for nodes is always to try to approve tips. We stress again, however, that approving only tips is not imposed by the protocol, since there is usually no way to know whether a node "knew" that the transaction it approved had already been approved by someone else before (also, there is no way to know which approving transaction was the first).

Let us denote by 𝒯(t) the set of all vertices that are tips at time t, and fix a parameter h > 0. To model the network propagation delays, we assume that at time t only T(t−h) is known to the entity that issued the incoming transaction. We then define the tip-selecting random walk in the following way. It depends on a parameter q ∈ [0, 1) (the backtracking probability) and on a monotone non-increasing function f. The initial state of the random walk is the genesis g (although in practical implementations one may start it in some place closer to the tips), and it is stopped upon hitting the set 𝒯(t−h). It is important to observe that x ∈ 𝒯(t−h) does not necessarily mean that x is still a tip at time t. The transition probabilities of the walk are defined in the following way: the walk backtracks (i.e., jumps to a randomly chosen site it approves) with probability q; if y approves x, then the transition probability P_xy is proportional to f(H^{(t−h)}_x − H^{(t−h)}_y), that is,

 P^{(f)}_xy =
   q/2, if y ∈ A(x),
   (1−q) f(H^{(t−h)}_x − H^{(t−h)}_y) / ∑_{z : x ∈ A(z)} f(H^{(t−h)}_x − H^{(t−h)}_z), if x ∈ A(y),
   0, otherwise (2)

(for x = g we define the transition probabilities as above, but with q = 0). In what follows, we will mostly assume that f(s) = e^{−αs} for some α > 0. We use the notation P^{(α)}_xy for the transition probabilities in this case. Intuitively, the smaller the value of α, the more random the walk is (physicists would call the case of small α the high-temperature regime, and the case of large α the low-temperature regime; that is, α stands for the inverse temperature). It is worth observing that the case q = 0, α = +∞ corresponds to the GHOST protocol of [21] (more precisely, to the obvious generalization of the GHOST protocol for the case when the tree is substituted by a DAG).

Now, to select the two tips v1 and v2 where our transaction will be attached, just run two independent random walks as above, each stopped when it first hits 𝒯(t−h). One can also require that v2 be different from v1; for that, one may re-run the second random walk in case its exit point happens to be the same as that of the first random walk. Observe that (T(t), t ≥ 0) is a continuous-time transient Markov process on 𝒢; since the state space is quite large, it is difficult to analyse this process. In particular, for a fixed time t, it is not easy to study the above random walk since it takes place on a random graph, i.e., it can be viewed as a random walk in a random environment; it is common knowledge that random walks in random environments are notoriously hard to deal with.
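For concreteness, here is a small Python sketch of one such tip-selecting walk with f(s) = e^{−αs}. The function and variable names are ours, and the snippet ignores the time-delay bookkeeping: it simply takes precomputed cumulative weights as input.

```python
import math
import random

def tip_walk(approves, weight, genesis, q=0.1, alpha=0.5, rng=None):
    """One tip-selecting walk following the transition rule (2) with
    f(s) = exp(-alpha * s).  `approves` maps each vertex to the vertices
    it approves; `weight` holds cumulative weights as seen by the issuer
    (i.e. h time units in the past).  All names are illustrative."""
    rng = rng or random.Random()
    approvers = {}                       # x -> all y with x in A(y)
    for y, targets in approves.items():
        for x in targets:
            approvers.setdefault(x, []).append(y)
    x = genesis
    while approvers.get(x):              # stop on a tip: nobody approves it
        # backtrack with probability q (the genesis approves nothing,
        # so there the backtracking probability is effectively 0)
        if approves.get(x) and rng.random() < q:
            x = rng.choice(approves[x])
            continue
        cand = approvers[x]
        w = [math.exp(-alpha * (weight[x] - weight[y])) for y in cand]
        x = rng.choices(cand, weights=w)[0]   # forward step
    return x
```

Since f is decreasing and y approving x has H_y < H_x, the walk prefers approvers whose cumulative weight is close to that of the current vertex; larger α makes this preference sharper ("more deterministic").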

We say that a transaction is confirmed with confidence γ (where γ is some pre-defined number, close to 1) if, with probability at least γ, the large-α random walk (recall that the large-α random walk is "more deterministic") ends in a tip which references that transaction. It may happen that a transaction never gets confirmed (possibly, it does not even get approved a single time) and becomes orphaned forever. Let us define the event

 U = {every transaction eventually gets approved}.

We believe that the following statement holds true; however, we have only a heuristic argument in its favor, not a rigorous proof. In any case, it is only of theoretical interest since, as explained below, in practice we will find ourselves in the situation where P[U] = 0. We therefore state it as

###### Conjecture 1.1.

It holds that

 P[U] = { 0, if ∫_0^{+∞} f(s) ds < ∞;   1, if ∫_0^{+∞} f(s) ds = ∞. } (3)
###### Explanation.

First of all, it should be true that P[U] ∈ {0, 1}, since U is a tail event with respect to the natural filtration; however, it does not seem to be very easy to prove the 0-1 law in this context (recall that we are dealing with a transient Markov process on an infinite state space). Next, consider a tip v which got attached to the tangle at time t0, and assume that it is still a tip at some later time t; also, assume that, among all tips, v is "closest" (in some suitable sense) to the genesis.

Let us now think of the following question: what is the probability that v is still a tip at a (much) later time?

Look at Figure 1: during the time interval in question, new particles will arrive, and the corresponding walks will travel from the genesis g looking for tips. Each of these walks will have to cross the dotted vertical segment in the picture, and with positive probability at least one of them will pass through w, one of the vertices approved by v. Assume that w was already confirmed (i.e., connected to the right end of the tangle via some other transaction w′ that approves w). Then, it is clear (but not easy to prove!) that the cumulative weights of both w and w′ should grow linearly in time, and so, when in w, the walk will jump to the tip v with probability of order f applied to that (linearly growing) weight difference.

This suggests that the probability that v is still a tip at time t decays like f(ct) for some constant c > 0, and the Borel-Cantelli lemma (to be precise, a bit more refined argument is needed since the corresponding events are not independent) gives that the probability that v will eventually be approved is less than 1 or equal to 1 depending on whether the corresponding sum converges or diverges; the convergence (divergence) of the sum is equivalent to the convergence (divergence) of the integral in (3) due to the monotonicity of the function f. A standard probabilistic argument (which is also not so easy to formalize in these circumstances) would then imply that if the probability that a given tip remains orphaned forever is uniformly positive, then the probability that at least one tip remains orphaned forever is equal to 1. ∎

One may naturally think that it would be better to choose the function f in such a way that, almost surely, every tip eventually gets confirmed. However, as explained in Section 4.1 of [19], there is a good reason to choose a rapidly decreasing function f, because this defends the system against nodes' misbehavior and attacks. The idea is then to assume that a transaction which did not get confirmed during a sufficiently long period of time is "unlucky" and needs to be reattached to the tangle (in fact, the nodes of the network may adopt a rule that instructs them to delete from their databases the transactions that are too old and still are tips). Let us fix some K > 0: it stands for the time after which an unlucky transaction is reissued (because there is already very little hope that it would be confirmed "naturally"). We call a transaction issued less than K time units ago "unconfirmed", and if a transaction was issued more than K time units ago and was not confirmed, we call it "orphaned". In the following, we assume that the system is stable, in the sense that the "recent" unconfirmed transactions do not accumulate and the time until a transaction is confirmed (roughly) does not depend on the moment when it appeared in the system (simulations indicate that this is indeed the case when α is small; however, it is not guaranteed to happen for large values of α).

In that stable regime, let p be the probability that a transaction is confirmed within K time units after it was issued for the first time; the number of times a transaction should be issued to achieve confirmation is then a Geometric random variable with parameter p (and, therefore, with expected value 1/p); so, the mean time until the transaction is confirmed is K/p. Let us then recall the following remarkable fact belonging to queuing theory, known as Little's formula (sometimes also referred to as Little's theorem or Little's law):

###### Proposition 1.2.

Suppose that λ is the arrival rate, L is the mean number of customers in the system, and W is the mean time a customer spends in the system. Then L = λW.

###### Proof.

See e.g. Section 5.2 of [6]. To understand intuitively why this fact holds true, one may reason in the following way: assume that, while in the system, each customer pays money to the system at rate 1. Then, at a large time t, the total amount of money earned by the system would be (approximately) Lt on one hand, and λtW on the other hand. Dividing by t and then sending t to infinity, we obtain L = λW. ∎
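Little's formula is also easy to check numerically. The following toy infinite-server simulation (with made-up parameters, unrelated to the tangle itself) compares the time-average number of customers in the system with λ times the mean sojourn time:

```python
import random

def littles_law_check(lam=2.0, mean_stay=0.5, horizon=20000.0, seed=1):
    """Toy check of L = lambda * W: customers arrive at Poisson rate lam
    and stay an independent Exp time with mean mean_stay; compare the
    time-average number in the system with lam times the mean sojourn."""
    rng = random.Random(seed)
    t, stays = 0.0, []
    while t < horizon:
        t += rng.expovariate(lam)                  # next arrival time
        stays.append((t, t + rng.expovariate(1.0 / mean_stay)))
    occupied = sum(min(d, horizon) - a for (a, d) in stays if a < horizon)
    L = occupied / horizon                         # time-average occupancy
    W = sum(d - a for (a, d) in stays) / len(stays)
    return L, lam * W
```

With these parameters both quantities come out close to λ times the mean stay, i.e. about 1, up to Monte Carlo error.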

Little’s formula then implies (in the language of queuing systems, a reissued transaction is a customer which goes back to the server after an unsuccessful service attempt) the following

###### Proposition 1.3.

The average number of unconfirmed transactions in the system is equal to λK/p.

###### Proof.

Indeed, apply Proposition 1.2 with arrival rate λ (think of a transaction which was reattached as a customer which returns to the server after an unsuccessful service attempt; this way, the incoming flow of customers still has rate λ). As observed before, the mean time spent by a customer in the system is equal to K/p. ∎
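The geometric picture behind this computation can be sanity-checked by simulation. In the toy model below (an illustration of ours, not the tangle dynamics), each issue of a transaction independently succeeds with probability p, so the number of issues until confirmation should have mean 1/p:

```python
import random

def mean_issues_until_confirmed(p=0.3, trials=100000, seed=7):
    """Monte Carlo check that the number of (re)issues needed until
    confirmation is Geometric(p), with mean 1/p (toy model: each issue
    independently succeeds with probability p; parameters are made up)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        issues = 1
        while rng.random() >= p:                  # this issue failed
            issues += 1
        total += issues
    return total / trials
```

For p = 0.3 the empirical mean is close to 1/0.3 ≈ 3.33, matching the expected value used above.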

In the case when the tangle contains data (which, in principle, can make transactions incompatible with each other), one may choose more sophisticated methods of tip selection. As we already mentioned, selecting tips with larger values of α provides better defense against attacks and misbehavior; however, smaller values of α make the system more stable with respect to the transactions' confirmation times. An example of a "mixed-α" strategy is the following. Define the "model tip" w0 as the result of a random walk with large α, then select the two tips w1 and w2 with random walks with small α, but check that

 P^{(t−h)}(w0) ∪ P^{(t−h)}(w1) ∪ P^{(t−h)}(w2)

is consistent.

## 2 Selfish nodes and Nash equilibria

Now, we are going to study the situation when some participants of the network are “selfish” and want to use a customized attachment strategy, in order to improve the confirmation time of their transactions (possibly at the expense of the others).

For a finite set A, let us denote by M(A) the set of all probability measures on A, that is,

 M(A) = {μ : A → ℝ such that μ(a) ≥ 0 for all a ∈ A and ∑_{a∈A} μ(a) = 1}.

Let

 M = ⋃_{G=(V,E)∈𝒢} M(V × V)

be the union of the sets of all probability measures on the pairs of (not necessarily distinct) vertices of DAGs belonging to 𝒢. Then, an attachment strategy is a map

 S : 𝒢[D] → M

with the property S(G, d) ∈ M(V × V) for any (G, d) with G = (V, E); that is, for any DAG with data attached to the vertices (which corresponds to the state of the tangle at a given time) there is a corresponding probability measure on the set of pairs of the vertices. Note also that in the above we considered ordered pairs of vertices, which, of course, does not restrict the generality.

Let γ ∈ (0, 1) be a fixed number. We now assume that, for a (very) large N, there are nodes that follow the default tip selection algorithm, and N "selfish" nodes that try to minimize their "cost", whatever this could mean (for example, the cost may be the expected confirmation time of a transaction, conditioned on it eventually being confirmed; the probability that it was not approved during a certain fixed time interval; etc.). Assume that all nodes issue transactions independently and at the same rate, in such a way that the overall rate of "honest" transactions in the system is equal to (1−γ)λ, and the overall rate of transactions issued by the selfish nodes equals γλ.

Let S1, …, SN be the attachment strategies used by the selfish nodes. To evaluate the "goodness" of a strategy, one has to choose and then optimize some suitable observable (that stands for the "cost"); as usual, there are several "reasonable" ways to do this. We decided to choose the following one, for definiteness and also for technical reasons (to guarantee the continuity of some function used below); one can probably extend our arguments to other reasonable cost functions. Assume that a transaction v was attached to the tangle at time t0, so v ∈ T(t) for all t ≥ t0. Fix some (typically large) M0 ∈ ℕ. Let t1 < t2 < ⋯ be the moments when the subsequent (after v) transactions were attached to the tangle. For j = 1, …, M0, let R(v)_j be the event that the default tip-selecting walk (i.e., the one used by nodes following the default attachment strategy) on T(t_j) stops in a tip that does not reference v. We then define

 W(v) = 1_{R(v)_1} + ⋯ + 1_{R(v)_{M0}} (4)

to be the number of times that the M0 "subsequent" tip selection random walks do not reference v (in the above, 1_A is the indicator function of an event A). Intuitively, the smaller the value of W(v), the bigger the chance that v is quickly confirmed.

Next, assume that v(k)_1, v(k)_2, … are the transactions issued by the k-th (selfish) node. We define

 C(k)(S1, …, SN) = M0^{−1} lim_{n→∞} (W(v(k)_1) + ⋯ + W(v(k)_n)) / n, (5)

the mean cost of the k-th node given that S1, …, SN are the attachment strategies of the selfish nodes.

###### Definition 2.1.

We say that a set of strategies (S1, …, SN) is a Nash equilibrium if

 C(k)(S1, …, Sk−1, Sk, Sk+1, …, SN) ≤ C(k)(S1, …, Sk−1, S′, Sk+1, …, SN)

for any k ∈ {1, …, N} and any strategy S′.

Observe that, since the nodes are indistinguishable, the fact that (S1, …, SN) is a Nash equilibrium implies that so is (S_{σ(1)}, …, S_{σ(N)}) for any permutation σ.

Naturally, we would like to prove that Nash equilibria exist. Unfortunately, we could not obtain the proof of this fact in the general case, since the space of all possible strategies is huge. Therefore, we consider the following simplifying assumption (which is, by the way, also quite reasonable since, in practice, one would hardly use the genesis as the starting vertex for the random walks due to runtime issues):

Assumption L. There is m ∈ ℕ such that the attachment strategies of all nodes (including those that use the default attachment strategy) only depend on the restriction of the tangle to the last m transactions that they see.

Observe that, under the above assumption, the set of all such strategies can be thought of as a compact convex subset of ℝ^n for a sufficiently large n. Additionally, we assume that the actual set of strategies that can be used by the nodes may be further restricted to some compact convex subset of the above subset. Also, observe that the set of all possible restrictions of elements of 𝒢 to a subset of m vertices is finite; we denote that set by 𝒢_m. The set of all such attachment strategies will then be denoted by 𝒮.

In this section we use a different approach to model the network propagation delays: instead of assuming that an incoming transaction does not have information about the state of the tangle during the last h units of time, we rather assume that it does not have information about the last n0 transactions attached to the tangle, where n0 is some fixed positive number (so, effectively, the strategies would depend on subgraphs induced by transactions, although the results of this section do not rely on this assumption). Clearly, these two approaches are quite similar in spirit; however, the second one permits us to avoid certain technical difficulties related to the randomness of the number of unseen transactions in the first case (also, it will be more natural and convenient to pass from continuous to discrete time).

From now on, we assume that the vertices contain no data, i.e., the set D is empty; this is not absolutely necessary because, with the data, the proof would be essentially the same; however, the notations would become much more cumbersome. Also, there will be no reattachments; again, allowing them would unnecessarily complicate the proofs (one would have to work with decorated Poisson processes). In fact, we are dealing with a so-called random-turn game here (see e.g. Chapter 9 of [15] for other examples).

To proceed, we need the following

###### Lemma 2.2.

Let P be the transition matrix of an irreducible and aperiodic discrete-time Markov chain on a finite state space Σ. Let Q be a continuous map from a compact set K to the set of all stochastic matrices on Σ (equipped with the distance inherited from the usual matrix norm on the space of all matrices on Σ). Fix θ ∈ (0, 1], denote P̃(s) = θP + (1−θ)Q(s) for s ∈ K, and let πs be the (unique) stationary measure of P̃(s). Then, the map s ↦ πs is also continuous.

###### Proof.

In the following we give a (rather) probabilistic proof of this fact via Kac's lemma, although, of course, a purely analytic proof is also possible. Irreducibility and aperiodicity of P imply that, for some m0 ∈ ℕ and ε0 > 0,

 P^{m0}_{xy} ≥ ε0 (6)

for all x, y ∈ Σ. Now, (6) implies that

 P̃^{m0}_{xy}(s) ≥ θ^{m0} ε0 (7)

for all x, y ∈ Σ and all s ∈ K.

For a stochastic process X = (X1, X2, …) on Σ, let us define

 τ(x) = min{k ≥ 1 : Xk = x}

(with the convention min ∅ = ∞) to be the hitting time of the site x by the stochastic process X. Now, let P(s)_x and E(s)_x be the probability and the expectation with respect to the Markov chain with transition matrix P̃(s) starting from x. We now recall Kac's lemma (cf. e.g. Theorem 1.22 of [8]): for all x ∈ Σ it holds that

 πs(x) = 1 / E(s)_x τ(x). (8)

Now, (7) readily implies that, for all x ∈ Σ and s ∈ K,

 P(s)_x[τ(x) ≥ n] ≤ c1 e^{−c2 n} (9)

for some positive constants c1, c2 which do not depend on s. This in its turn implies that the series

 E(s)_x τ(x) = ∑_{n=1}^{∞} P(s)_x[τ(x) ≥ n]

converges uniformly in s and so is uniformly bounded from above (and, of course, it is also bounded from below by 1); also, the Uniform Limit Theorem implies that E(s)_x τ(x) is continuous in s. Therefore, for any x ∈ Σ, (8) implies that πs(x) is also a continuous function of s. ∎
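Kac's lemma (8) is easy to verify numerically on a small chain. The pure-Python sketch below (illustrative only) computes the stationary distribution by power iteration and the mean return time E_x τ(x) by iterating the first-step equations; by Kac's lemma their product should equal 1 for every state:

```python
def stationary(P, iters=2000):
    """Stationary distribution of a finite stochastic matrix P (a list of
    rows), obtained by power iteration from the uniform vector."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def mean_return_time(P, x, iters=2000):
    """E_x tau(x): iterate the first-step equations
    h_y = 1 + sum_{z != x} P[y][z] h_z to their fixed point; h[x] is then
    the mean return time to x, which by Kac's lemma equals 1 / pi(x)."""
    n = len(P)
    h = [0.0] * n
    for _ in range(iters):
        h = [1.0 + sum(P[y][z] * h[z] for z in range(n) if z != x)
             for y in range(n)]
    return h[x]
```

Both iterations converge geometrically for an irreducible, aperiodic chain, which is exactly what the bound (7) guarantees for the chains P̃(s) appearing in the lemma.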

Consider, for the moment, the situation when all nodes use the same (default) attachment strategy (i.e., there are no selfish nodes). The restriction of the tangle to the last m transactions then becomes a Markov chain on the state space 𝒢_m. We now make the following technical assumption on that Markov chain:

Assumption D. The above Markov chain is irreducible and aperiodic.

It is important to observe that Assumption D is not guaranteed to hold for every natural attachment strategy; still, it is not a very restrictive assumption in practice, because every finite Markov chain may be turned into an irreducible and aperiodic one by an arbitrarily small perturbation of the transition matrix.

Then, we are able to prove the following

###### Theorem 2.3.

Under Assumptions L and D, the system has at least one Nash equilibrium.

###### Proof.

The authors were unable to find a result available in the literature that implies Theorem 2.3 directly; nevertheless, its proof is quite standard and essentially follows Nash’s original paper [17] (see also [10]). There is only one technical difficulty, which we intend to address via the above preparatory steps: one needs to prove the continuity of the cost function.

Denote by πS the invariant measure of the Markov chain given that the (selfish) nodes use the strategy vector S = (S1, …, SN). Then, the idea is to use Lemma 2.2 with θ = 1 − γ, with P the transition matrix obtained from the default attachment strategy, and with Q(S) the transition matrix obtained from the averaged strategy S̄ = N^{−1}(S1 + ⋯ + SN) (observe that N nodes using the strategies S1, …, SN is the same as one node with the strategy S̄ issuing transactions N times faster). Assumption D together with Lemma 2.2 then implies that πS is a continuous function of S.

Let E^{(S,S′)}_μ be the expectation with respect to the following procedure: take the "starting" graph according to the measure μ, then attach to it a transaction according to the strategy S, and then keep attaching subsequent transactions according to the strategy S′ (instead of S and S′ we may also use strategy vectors; S and S′ would then be their averages). Let also W(k) be the random variable defined as in (4) for a transaction issued by the k-th node. Then, the Ergodic Theorem for Markov chains (see e.g. Theorem 1.23 of [8]) implies that

 C(k)(S) = M0^{−1} E^{(Sk, S̄)}_{πS} W(k). (10)

It is not difficult to see that the above expression is a polynomial in the coefficients of the Sj's (i.e., the corresponding probabilities) and in the values of f, and hence it is a continuous function on the space of strategies. Using this, the rest of the proof is standard: it is obtained as a consequence of Kakutani's fixed point theorem [14] (also with the help of Berge's Maximum Theorem, see e.g. Chapter E.3 of [18]). ∎

Symmetric games do not always have symmetric Nash equilibria, as shown in [9]. Also, even when such equilibria exist in the class of mixed strategies, they may be “inferior” to asymmetric pure equilibria; for example, this happens in the classical “Battle of the sexes” game (see e.g. Section 7.2 of [15]).

Now, the goal is to prove that, if the number of selfish nodes N is large, then for any equilibrium state the costs of distinct nodes cannot be very different. Namely, we have the following

###### Theorem 2.4.

For any ε > 0 there exists N0 (depending on the default attachment strategy) such that, for all N ≥ N0 and any Nash equilibrium (S1, …, SN), it holds that

 |C(k)(S1, …, SN) − C(j)(S1, …, SN)| < ε (11)

for all k, j ∈ {1, …, N}.

###### Proof.

Without restricting generality we may assume that

 C(1)(S1, …, SN) = max_{k=1,…,N} C(k)(S1, …, SN),   C(2)(S1, …, SN) = min_{k=1,…,N} C(k)(S1, …, SN),

so we then need to prove that C(1)(S) − C(2)(S) < ε, where S = (S1, …, SN). Now, the main idea of the proof is the following: if C(1)(S) is considerably larger than C(2)(S), then the owner of the first node may decide to adopt the strategy used by the second one. This would not necessarily decrease his costs to the (former) costs of the second node, since a change in an individual strategy leads to changes in all the costs; however, when N is large, the effect of changing the strategy of only one node is small, and (if the difference between C(1)(S) and C(2)(S) were not small) this would lead to a contradiction with the assumption that S was a Nash equilibrium.

So, let us denote by S′ = (S2, S2, S3, …, SN) the strategy vector after the first node adopted the strategy of its "more successful" friend. Let S̄ and S̄′ be the two corresponding "averaged" strategies. In the following, we are going to compare C(2)(S) (the "old" cost of the second node) with C(1)(S′) (the "new" cost of the first node, after it adopted the second node's strategy). We need the following

###### Lemma 2.5.

For any measure  on  and any strategy vectors and such that for all , we have

 ∣∣E(Sj,S)πW(j)−E(S′j,S′)πW(j)∣∣≤M0N (12)

for all $j\in\{2,\ldots,N\}$, where $\bar S$ and $\bar S'$ denote the averaged strategies corresponding to $s$ and $s'$.

###### Proof.

Let us define the event

 A=\{\text{among the } M_0 \text{ transactions there is at least one issued by the first node}\},

and observe that, by the union bound, the probability that it occurs is at most $M_0/N$. Then, using the fact that $\mathbb{E}^{\pi}_{(S_j,\bar S)}(W^{(j)}\mathbf{1}_{A^c})=\mathbb{E}^{\pi}_{(S'_j,\bar S')}(W^{(j)}\mathbf{1}_{A^c})$ (since, on $A^c$, the first node does not “contribute” to $W^{(j)}$), write

 \begin{aligned}
 \bigl|\mathbb{E}^{\pi}_{(S_j,\bar S)}W^{(j)}-\mathbb{E}^{\pi}_{(S'_j,\bar S')}W^{(j)}\bigr|
 &=\bigl|\mathbb{E}^{\pi}_{(S_j,\bar S)}(W^{(j)}\mathbf{1}_A)+\mathbb{E}^{\pi}_{(S_j,\bar S)}(W^{(j)}\mathbf{1}_{A^c})-\mathbb{E}^{\pi}_{(S'_j,\bar S')}(W^{(j)}\mathbf{1}_A)-\mathbb{E}^{\pi}_{(S'_j,\bar S')}(W^{(j)}\mathbf{1}_{A^c})\bigr| \\
 &=\bigl|\mathbb{E}^{\pi}_{(S_j,\bar S)}(W^{(j)}\mathbf{1}_A)-\mathbb{E}^{\pi}_{(S'_j,\bar S')}(W^{(j)}\mathbf{1}_A)\bigr| \\
 &\le \frac{M_0}{N},
 \end{aligned}

where we also used that $0\le W^{(j)}\le 1$. This concludes the proof of Lemma 2.5. ∎
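The union bound at the heart of Lemma 2.5 is easy to check numerically. The sketch below uses a hypothetical uniform-issuer model (each of the $M_0$ transactions is issued by a node chosen uniformly among $N$; this simplifies the Poissonian arrivals of the paper) and verifies by Monte Carlo that $\mathbb{P}(A)$ indeed stays below $M_0/N$.

```python
import random

def prob_first_node_issues(num_nodes, m0, trials=200_000, seed=1):
    """Estimate P(A) = P(at least one of m0 transactions is issued by
    node 1), with each issuer uniform over num_nodes nodes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(rng.randrange(num_nodes) == 0 for _ in range(m0)):
            hits += 1
    return hits / trials

N, M0 = 50, 5
p_hat = prob_first_node_issues(N, M0)
# Union bound from the proof: P(A) <= M0 / N (here 0.1);
# the exact value is 1 - (1 - 1/N)**M0, approximately 0.096.
assert p_hat <= M0 / N + 0.01
```

The gap between the estimate and the bound is small here because the events “transaction $i$ is issued by node 1” are independent and individually rare, so the union bound is nearly tight.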

We continue proving Theorem 2.4. First, by symmetry (in $s'$ the first two coordinates are equal), we have

 \mathbb{E}^{\pi_{\bar S'}}_{(S_2,\bar S')}W^{(1)}=\mathbb{E}^{\pi_{\bar S'}}_{(S_2,\bar S')}W^{(2)}. \qquad (13)

Also, it holds that

 \bigl|\mathbb{E}^{\pi_{\bar S'}}_{(S_2,\bar S')}W^{(2)}-\mathbb{E}^{\pi_{\bar S'}}_{(S_2,\bar S)}W^{(2)}\bigr|\le \frac{M_0}{N} \qquad (14)

by Lemma 2.5 (applied with $j=2$ and the fixed measure $\pi_{\bar S'}$). Then, similarly to the proof of Theorem 2.3, we can obtain that the function

 (S,S',S'')\mapsto \mathbb{E}^{\pi_{S''}}_{(S,S')}W^{(2)}

is continuous; since it is defined on a compact set, it is also uniformly continuous. That is, for any $\varepsilon'>0$ there exists $\delta>0$ such that if the distance between $(S,S',S'')$ and $(\tilde S,\tilde S',\tilde S'')$ is less than $\delta$, then

 \bigl|\mathbb{E}^{\pi_{S''}}_{(S,S')}W^{(2)}-\mathbb{E}^{\pi_{\tilde S''}}_{(\tilde S,\tilde S')}W^{(2)}\bigr|<\varepsilon'.

Choose $N$ large enough so that the distance between $\bar S'$ and $\bar S$ (which is of order $1/N$, since $\bar S'-\bar S=(S_2-S_1)/N$) is less than $\delta$. We then obtain from the above that

 \bigl|\mathbb{E}^{\pi_{\bar S'}}_{(S_2,\bar S)}W^{(2)}-\mathbb{E}^{\pi_{\bar S}}_{(S_2,\bar S)}W^{(2)}\bigr|<\varepsilon'. \qquad (15)

The relations (13), (14), and (15) imply that

 \bigl|\mathbb{E}^{\pi_{\bar S'}}_{(S_2,\bar S')}W^{(1)}-\mathbb{E}^{\pi_{\bar S}}_{(S_2,\bar S)}W^{(2)}\bigr|\le \varepsilon'+\frac{M_0}{N}.

On the other hand, since we assumed that $s$ is a Nash equilibrium, it holds that

 \mathbb{E}^{\pi_{\bar S'}}_{(S_2,\bar S')}W^{(1)}=C^{(1)}(s')\ge C^{(1)}(s)=\mathbb{E}^{\pi_{\bar S}}_{(S_1,\bar S)}W^{(1)}, \qquad (16)

which implies that

 \mathbb{E}^{\pi_{\bar S}}_{(S_1,\bar S)}W^{(1)}-\mathbb{E}^{\pi_{\bar S}}_{(S_2,\bar S)}W^{(2)}\le \varepsilon'+\frac{M_0}{N}.

Choosing $\varepsilon'=\varepsilon/2$ and then $N_0$ large enough that $M_0/N\le \varepsilon/2$ for all $N\ge N_0$, this concludes the proof of Theorem 2.4. ∎

Now, let us define the notion of approximate Nash equilibrium:

###### Definition 2.6.

For a fixed $\varepsilon>0$, we say that a vector of strategies $(S_1,\ldots,S_N)$ is an $\varepsilon$-equilibrium if

 C^{(k)}(S_1,\ldots,S_{k-1},S_k,S_{k+1},\ldots,S_N)\le C^{(k)}(S_1,\ldots,S_{k-1},S',S_{k+1},\ldots,S_N)+\varepsilon

for any $k\in\{1,\ldots,N\}$ and any strategy $S'$.

The motivation for introducing this notion is that, if $\varepsilon$ is very small, then, in practice, $\varepsilon$-equilibria are essentially indistinguishable from the “true” Nash equilibria.
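For a finite game, Definition 2.6 can be checked directly by enumerating unilateral deviations. The sketch below does this for a toy two-player congestion game (the cost function is invented for illustration and is not the tangle cost of this paper).

```python
def is_eps_equilibrium(costs, strategies, options, eps):
    """Check Definition 2.6 for a finite game: no player can lower
    their cost by more than eps via a unilateral deviation.
    costs(k, profile) -> cost of player k under the given profile."""
    for k in range(len(strategies)):
        current = costs(k, strategies)
        for s_alt in options:
            deviated = strategies[:k] + [s_alt] + strategies[k + 1:]
            if costs(k, deviated) + eps < current:
                return False  # profitable deviation found
    return True

# Toy cost: own "load" plus a congestion penalty when both players
# pick the same option (values are made up).
def toy_cost(k, profile):
    return profile[k] + (1 if profile[0] == profile[1] else 0)

# The profile [0, 1] is even a 0-equilibrium of this toy game:
assert is_eps_equilibrium(toy_cost, [0, 1], [0, 1], eps=0.0)
```

Note that any exact Nash equilibrium is an $\varepsilon$-equilibrium for every $\varepsilon\ge 0$, so the checker accepts it with `eps=0.0`.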

###### Theorem 2.7.

For any $\varepsilon>0$ there exists $N_0$ (depending on the default attachment strategy) such that, for all $N\ge N_0$ and any Nash equilibrium $(S_1,\ldots,S_N)$ it holds that $(\bar S,\ldots,\bar S)$ is an $\varepsilon$-equilibrium, where

 \bar S=\frac{1}{N}\sum_{k=1}^{N}S_k \qquad (17)

(that is, all selfish nodes use the same “averaged” strategy defined above). The costs of all selfish nodes are then equal to

 \frac{1}{N}\sum_{k=1}^{N}C^{(k)}(S_1,\ldots,S_N),

that is, the average cost in the Nash equilibrium.

In other words, for large $N$ one can essentially assume that all selfish nodes follow the same attachment strategy.

###### Proof.

To begin, we observe that the proof of the second part is immediate since, as already noted, for an external observer the situation where there are $N$ nodes with strategies $S_1,\ldots,S_N$ is indistinguishable from the situation with one node using the averaged strategy $\bar S$.

Now, we need to prove that, for any fixed strategy $\tilde S$, it holds that

 C^{(1)}(\bar S,\ldots,\bar S)\le C^{(1)}(\tilde S,\bar S,\ldots,\bar S)+\varepsilon' \qquad (18)

for all large enough $N$ (the claim would then follow by symmetry). Recall that we have

 C^{(1)}(\bar S,\ldots,\bar S)=\mathbb{E}^{\pi_{\bar S}}_{(\bar S,\bar S)}W^{(1)}, \qquad (19)
 C^{(1)}(S_1,\ldots,S_N)=\mathbb{E}^{\pi_{\bar S}}_{(S_1,\bar S)}W^{(1)}, \qquad (20)
 \text{and}\quad C^{(1)}(\tilde S,\bar S,\ldots,\bar S)=\mathbb{E}^{\pi_{\bar S'}}_{(\tilde S,\bar S')}W^{(1)}, \qquad (21)

where

 \bar S'=\frac{1}{N}\bigl(\tilde S+(N-1)\bar S\bigr)=\frac{1}{N}\Bigl(\tilde S+\frac{N-1}{N}(S_1+\cdots+S_N)\Bigr).

Now, the second part of this theorem together with Theorem 2.4 imply (note that Theorem 2.4 shows that, when $N$ is large, the nodes already have “almost” the same cost in the Nash equilibrium) that, for any fixed $\varepsilon>0$,

 \bigl|\mathbb{E}^{\pi_{\bar S}}_{(\bar S,\bar S)}W^{(1)}-\mathbb{E}^{\pi_{\bar S}}_{(S_1,\bar S)}W^{(1)}\bigr|<\varepsilon \qquad (22)

for all large enough $N$.

Next, let us denote

 \bar S''=\frac{1}{N}(\tilde S+S_2+\cdots+S_N).

Then, again using the uniform continuity argument (as in the proof of Theorem 2.4), we obtain that, for any $\varepsilon''>0$,

 \bigl|\mathbb{E}^{\pi_{\bar S'}}_{(\tilde S,\bar S')}W^{(1)}-\mathbb{E}^{\pi_{\bar S''}}_{(\tilde S,\bar S'')}W^{(1)}\bigr|<\varepsilon'' \qquad (23)

for all large enough $N$. However,

 \mathbb{E}^{\pi_{\bar S''}}_{(\tilde S,\bar S'')}W^{(1)}=C^{(1)}(\tilde S,S_2,\ldots,S_N)\ge C^{(1)}(S_1,S_2,\ldots,S_N)=\mathbb{E}^{\pi_{\bar S}}_{(S_1,\bar S)}W^{(1)},

since $(S_1,\ldots,S_N)$ is a Nash equilibrium. Then, (22)–(23) imply that

 \mathbb{E}^{\pi_{\bar S}}_{(\bar S,\bar S)}W^{(1)}\le \mathbb{E}^{\pi_{\bar S'}}_{(\tilde S,\bar S')}W^{(1)}+\varepsilon+\varepsilon'',

and, recalling (19) and (21), we conclude the proof of Theorem 2.7. ∎
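The “external observer” argument at the start of the proof can be illustrated numerically: if each transaction’s issuer is chosen uniformly among $N$ nodes that mix the selfish rule with node-specific probabilities, the observed law of the attachment rule coincides with the one obtained when every node uses the averaged probability. A minimal Monte Carlo sketch (the per-node probabilities below are illustrative, not from the paper):

```python
import random

rng = random.Random(0)
# Illustrative per-node probabilities of using the "greedy" rule.
thetas = [0.1, 0.5, 0.25, 0.9]
N = len(thetas)
theta_bar = sum(thetas) / N  # the averaged strategy of Theorem 2.7

def sample_two_stage():
    """A uniformly chosen node issues the transaction using its own
    mixing probability; returns True if the greedy rule was used."""
    return rng.random() < thetas[rng.randrange(N)]

trials = 200_000
freq = sum(sample_two_stage() for _ in range(trials)) / trials
# The observer cannot tell this apart from all nodes using theta_bar:
assert abs(freq - theta_bar) < 0.01
```

This is exactly the mixture identity $\mathbb{P}(\text{greedy})=\frac{1}{N}\sum_k \theta_k=\bar\theta$; the simulation only confirms it empirically.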

## 3 Simulations

In this section we investigate Nash equilibria between selfish nodes via simulations. This is motivated by the following important question: since the choice of an attachment strategy is not enforced, there may indeed be nodes that prefer to “optimise” their strategies in order to decrease the mean confirmation time of their transactions. Can this lead to a situation where the corresponding Nash equilibrium is “bad for everybody”, effectively leading to the system’s malfunctioning (again, we do not specify the exact meaning of that)?

Due to Theorem 2.7 we may assume that all selfish nodes use the same attachment strategy. Even then, it is probably infeasible to calculate that strategy exactly; instead, we resort to simulations, which indeed show that the equilibrium strategy of the selfish nodes is not much different from the (suitably chosen) default strategy. But before doing that, let us explain the intuition behind this fact. Naively, a natural strategy for a selfish node would be the following:

• Calculate the exit distribution of the tip-selecting random walk.

• Find the two tips where this distribution attains its “best” values (i.e., the maximum and the second-largest values).

• Approve these two tips.

However, this strategy fails when other selfish nodes are present. To understand why, look at Figure 2: many selfish nodes attach their transactions to the two “best” tips. As a result, the “neighbourhood” of these two tips becomes “overcrowded”: there is so much competition between the transactions issued by the selfish nodes that their chances of being approved soon actually decrease (the “new” best tips are not among them, as shown on Figure 2 on the right).
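For intuition, the naive strategy from the bullet list above can be sketched on a toy DAG. The walk here is unweighted (each approver equally likely), which corresponds to dropping the weight dependence of the transition probabilities (2); the DAG itself is made up for illustration.

```python
# Toy tangle: approvers[v] lists the transactions that directly
# approve v; a transaction with no approvers is a tip.
approvers = {
    "genesis": ["a", "b"],
    "a": ["c", "d"],
    "b": ["d", "f"],
    "c": [],          # tip
    "d": ["e"],
    "e": [],          # tip
    "f": [],          # tip
}

def exit_distribution(start):
    """Exit (absorption) probabilities of a walk moving from a
    transaction to a uniformly chosen approver: the unweighted
    analogue of the tip-selecting walk."""
    dist = {}

    def walk(v, p):
        if not approvers[v]:                  # reached a tip
            dist[v] = dist.get(v, 0.0) + p
            return
        for w in approvers[v]:                # split mass uniformly
            walk(w, p / len(approvers[v]))

    walk(start, 1.0)
    return dist

dist = exit_distribution("genesis")
# The naive selfish strategy approves the two most likely exit tips.
best_two = sorted(dist, key=dist.get, reverse=True)[:2]
assert best_two[0] == "e" and abs(dist["e"] - 0.5) < 1e-12
```

On this DAG the tip `e` collects half of the exit mass (it is reachable through `d` from both branches), so every greedy node would target it, which is precisely the overcrowding effect described above.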

To illustrate this fact, several simulations have been carried out. All the results depicted here were generated using (2) as the transition probabilities, with a network delay of one second. Also, a transaction will be reattached if the two following criteria are met:

• the transaction is older than 20 seconds (even though this is the first mention of a time variable in this paper, the simulation compares actual times at this point);

• the transaction is not referenced by the tip selected by a random walk with a high value of α (here, when the random walk must choose among transactions with the same weight, it chooses among them uniformly at random).

This way, we guarantee not only that the unconfirmed transactions will eventually be confirmed, but also that all transactions that were never reattached are referenced by most of the tips. Note that when reattachment is allowed in the simulations, if a new transaction references an old, already reattached transaction together with its newly reissued counterpart, there will be a double spend. Even though the odds of that are low (since by the time a transaction is re-emitted, it is old enough to almost never be chosen by the random walk algorithm), a specific procedure was included in the simulations in order to disallow double spends.
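The two reattachment criteria above can be encoded as a simple predicate. The function below is a sketch of the simulation rule only; the argument names are ours, not from the authors’ simulator.

```python
def needs_reattachment(tx_age_seconds, referenced_by_selected_tip):
    """Both simulation criteria from the text must hold: the
    transaction is older than 20 s AND the tip chosen by the
    high-alpha random walk does not reference it."""
    return tx_age_seconds > 20 and not referenced_by_selected_tip

assert needs_reattachment(25, False)       # old and orphaned: reissue
assert not needs_reattachment(25, True)    # still referenced: leave it
assert not needs_reattachment(5, False)    # too young to reattach
```

Requiring both conditions avoids reissuing transactions that are merely young or that are already on the heaviest branch, which keeps the double-spend scenario described above rare.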

Figure 3 depicts the typical cumulative distribution of the time of the first approval, for the two studied parameter choices. Note that roughly 95% of the transactions are approved within a short time, and almost the totality of transactions shortly thereafter; this behaviour is similar for all studied parameters. The average cost defined in equations (5) and (4) has a concrete meaning depending on the choice made there: it is related to the average time of approval of a transaction. So, in both cases, the mean cost was calculated over the transactions attached in an interval of time of approximately 10 s, which makes this cost a reasonable quantity to optimise.

### 3.1 One dimensional Nash equilibria

In this section, we will study the Nash equilibria of the tangle problem, considering the following strategy subspace:

 S_i=S=(1-\theta)S_0+\theta S_1\quad\text{for each } i=1,\ldots,N,

where $S_0$ is the default tip selection strategy, $S_1$ is the selfish strategy defined at the beginning of this section, and $\theta\in[0,1]$. The goal is to find the Nash equilibria relative to the costs defined in the last section (equations (5) and (4)). The selfish nodes will try to optimise their transaction cost with respect to $\theta$.

Now, suppose that we have a fixed fraction of selfish nodes, each choosing a strategy among the possible mixtures $S$. The non-selfish nodes are not able to choose their strategy, so they are restricted, as expected, to $S_0$. Note that, since they cannot choose their strategy, they will not “play” the game. Since the costs are linear in $\theta$, such a mixed-strategy game is equivalent to a second game where only a (smaller) fraction of the nodes chooses the pure strategy $S_1$, and the rest of the nodes choose the pure strategy $S_0$.

Figure 4(a) represents a typical graph of the average costs of transactions issued under $S_0$ and $S_1$, as a function of the fraction of transactions issued under $S_1$, for two different parameter choices. As already demonstrated, in equilibrium the selfish nodes should issue transactions with the same average cost. That means that the system should reach equilibrium in one of the following states:

1. some selfish nodes choose $S_0$ and the rest choose $S_1$;

2. all selfish nodes choose $S_0$;

3. all selfish nodes choose $S_1$.

If the two curves on the graphs do not intersect, the equilibrium should clearly be at state (2) or (3), depending on which of the average costs is larger. If the two curves intersect each other, the intersection point is also a Nash equilibrium candidate. Denote by $s^*$ the vector of strategies in equilibrium and by $p^*$ the fraction of nodes that issue transactions under $S_1$ when the system is in $s^*$. We also consider the deviations from $s^*$ that result from one node switching strategies, from $S_0$ to $S_1$ and from $S_1$ to $S_0$, respectively, together with the corresponding strategy vectors. Note on Figure 5 that this kind of Nash equilibrium candidate may not be a real equilibrium. In the first example (5(a)), when the system is at the intersection point and a node switches strategies from $S_0$ to $S_1$, the cost actually decreases, so this point cannot be a Nash equilibrium. On the other hand, the second example (5(b)) shows a Nash equilibrium at the intersection point, since deviations in either direction increase costs.

Now, let us re-examine Figure 4(a). Here, the Nash equilibrium occurs at the intersection point, since we have a situation as in Figure 5(b). That point is easily found in Figure 4(b). Note that the Nash equilibrium for the larger parameter value lies at a smaller fraction of $S_1$-transactions than the Nash equilibrium for the smaller one. This was already expected since, for a larger arrival rate, the tips will naturally be more “overcrowded”, so the effect depicted in Figure 2 is amplified. Thus, the Nash equilibrium in the higher-rate cases must occur with a smaller proportion of transactions issued under the pure strategy $S_1$.
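The intersection-point candidate discussed above can be located numerically by bisecting the difference of the two average-cost curves. The curves below are invented placeholders (linear, with the $S_1$ cost rising with crowding, as in Figure 5(b)); in the actual simulations they would be estimated empirically.

```python
# Hypothetical average-cost curves as functions of the fraction p of
# transactions issued under the pure strategy S1 (shapes are made up).
def cost_s1(p):
    return 1.0 + 3.0 * p      # crowding penalty grows with p

def cost_s0(p):
    return 2.5 - 0.5 * p

def equilibrium_fraction(f, g, tol=1e-9):
    """Bisect for p in [0, 1] with f(p) = g(p): at such an interior
    point, switching to either pure strategy gives the same cost."""
    lo, hi = 0.0, 1.0
    if (f(lo) - g(lo)) * (f(hi) - g(hi)) > 0:
        return None  # no crossing: equilibrium is at a pure state
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(lo) - g(lo)) * (f(mid) - g(mid)) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

p_star = equilibrium_fraction(cost_s1, cost_s0)
assert p_star is not None
assert abs(cost_s1(p_star) - cost_s0(p_star)) < 1e-6
```

As discussed around Figure 5, equal costs at the crossing are necessary but not sufficient: whether the crossing is a true equilibrium depends on the sign of the cost change under single-node deviations.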

Reconsider now the mixed strategy game. In the case when all the nodes are allowed to choose between the two pure strategies ( and