I Introduction
Bitcoin has attracted tremendous attention as the first fully decentralized cryptocurrency since its advent in 2008. All historical transactions between Bitcoin clients are recorded in a global and public data structure known as the blockchain. The security of the blockchain is established by a chain of cryptographic Hash puzzles, which are solved by a large-scale network of pseudonymous participants called miners [1]. Solving a Hash puzzle generates a Proof-of-Work (PoW) on which global consensus is built. The PoW of Bitcoin demands intensive computation and thus consumes a lot of energy. Each miner competes in this “game” and is rewarded with cryptocurrency (i.e. bitcoins) if he is the first acknowledged miner to find a valid block. When the population of miners is large, the aggregate Hash power is sufficiently high that a malicious miner can hardly accumulate enough Hash power to perform Sybil attacks. The PoW consensus of Bitcoin has been employed in the majority of public blockchains, serving as the cornerstone of current cryptocurrencies.
The security of PoW is challenged by the trend of centralization of Hash power. Mining a Bitcoin block is a random process, and it takes more than 10 years on average with a latest-generation ASIC chip. Therefore, blockchain miners strategically form pools that have a much larger chance of solving the puzzle in each round. By splitting the mining reward appropriately, they acquire a stable income rate. As a side effect, a small number of mining pools occupy a vast majority of the global Hash power, placing blockchain systems at risk of being overthrown by a gigantic pool or by colluding pools. The conventional wisdom holds that PoW is secure as long as no miner controls 51% of the total Hash power. However, a miner can choose a selfish mining scheme instead of conforming to the standard Bitcoin protocol. Eyal and Sirer pointed out that selfish mining is profitable (i.e. yields more rewards than honest mining) if the Hash power of a miner is larger than 25% [2]. A more intelligent selfish miner using a Markov Decision Process (MDP) can lower this threshold to around 23.21% [8]. Note that both studies assume the existence of a single selfish miner, while multiple (colluding) pools might be close to this profitable threshold.
In this paper, we study a fundamental question regarding blockchain security: will selfish mining become profitable more easily when there is more than one selfish miner, and how many rounds must a selfish miner wait before becoming profitable? The former subquestion aims to unravel whether each selfish miner needs a smaller Hashrate threshold to gain more rewards than by mining honestly. The latter pays attention to the transient behavior of selfish mining, taking into account the mining difficulty adjustment. The transient analysis is also crucial because a selfish miner has to wait for a long period to gain more rewards, especially when the global Hashrate increases rapidly. We establish the selfish mining model for an honest pool that represents all honest miners, and two selfish mining pools that are not aware of each other’s misbehaving role. By dissecting all the possible events that trigger a change of the private and public chains, we formulate a set of Markov chains to capture all the state transitions. In contrast to a very recent experimental study [3] that analyzes the profitable threshold of selfish mining with two miners, our work presents a mathematical model that yields a closed-form expression of such a threshold. In the transient analysis, selfish mining is found to waste computing power, and it is therefore not worthwhile without the subsequent difficulty adjustment of puzzle solving.
The major contributions and observations are summarized as follows.

We establish a set of Markov chain models to characterize the state transition of public and private chains in selfish mining and compute the steady state distributions.

The minimum profitable Hashrate threshold is symmetric, at around 21.48%, if both selfish miners are profitable. Profitable selfish mining becomes more difficult for one selfish miner when the other increases his Hashrate, which gives rise to a fiercer competition.

Selfish mining becomes profitable after 51 rounds of difficulty adjustment (i.e. 714 days in Bitcoin) if the Hashrates of the selfish miners are both 22% (slightly higher than the profitable threshold). This delay decreases to 5 rounds (i.e. 70 days in Bitcoin) as their Hashrates increase to 33%, which is still very long.
II System Model
In this section, we describe the basic model of blockchain mining in the presence of two adversarial pools.
II-A System Description
Consider a blockchain mining system with two misbehaving mining pools, Alice and Bob, as well as an honest mining pool, Henry (multiple honest miners can be reduced to a single miner because their Hashrates add linearly). They compete to solve cryptographic puzzles in order to mine a valid block and acquire bitcoin-like rewards. The proof-of-work (PoW) consensus is adopted and the mining of blocks is stateless: the probability that a miner discovers a block is proportional to his current Hashrate and inversely proportional to the current aggregate Hashrate of the entire blockchain network. The blockchain system dynamically adjusts the difficulty of the cryptographic puzzles such that new blocks are generated at a fixed average rate (e.g. one block per 10 minutes on average in Bitcoin). We define a “round” as the time to process one attack. The miners maintain a globally agreed, ordered set of transactions by adopting and mining on the longest chain. The revenue of a miner is the expected fraction of blocks mined by him out of all the blocks in the longest chain.
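The difficulty adjustment mentioned above can be illustrated with a minimal sketch. It assumes Bitcoin's retargeting rule (one adjustment every 2016 blocks, with the measured timespan bounded to a factor of four in either direction); the function name `retarget` and the use of a floating-point difficulty are illustrative simplifications and are not taken from this paper.

```python
RETARGET_BLOCKS = 2016       # blocks per difficulty adjustment period in Bitcoin
TARGET_SPACING_MIN = 10.0    # intended average minutes per block


def retarget(difficulty, actual_minutes):
    """Return the difficulty for the next period (simplified sketch).

    actual_minutes is the time the last RETARGET_BLOCKS blocks actually took.
    """
    expected_minutes = RETARGET_BLOCKS * TARGET_SPACING_MIN
    # Bound the measured timespan so one adjustment changes difficulty by at most 4x.
    bounded = min(max(actual_minutes, expected_minutes / 4), expected_minutes * 4)
    # Blocks arrived too slowly -> bounded > expected -> difficulty goes down.
    return difficulty * expected_minutes / bounded
```

For instance, if a period takes twice the intended two weeks, the sketch halves the difficulty so that the block rate returns to one block per 10 minutes under an unchanged Hashrate.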
For simplicity, we make the following assumptions, which are consistent with the literature [2].

The total Hashrate of the blockchain system is normalized as a unit. Then, the Hashrate of a mining pool is represented as a fraction of the total.

The block discovery time of a mining pool is exponentially distributed when its Hashrate is large.

The reward of each valid block is normalized as one cryptographic coin.
Denote by $\alpha_A$, $\alpha_B$ and $\alpha_H$ the Hashrates of Alice, Bob and Henry respectively, i.e. $\alpha_A + \alpha_B + \alpha_H = 1$. Denote by $\gamma_A$ (resp. $\gamma_B$) the probability that honest miners mine after Alice’s (resp. Bob’s) released chain in the tie-breaking between Alice (resp. Bob) and Henry. Denote by $\theta_A$ and $\theta_B$ the probabilities that honest miners choose to mine after Alice’s and Bob’s chains in the three-party tie-breaking, respectively. When the blockchain system creates a new block, it is mined by pool $i$ with probability $\alpha_i$, $i \in \{A, B, H\}$, owing to the memorylessness of exponentially distributed mining intervals.
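The last statement, that the next block belongs to a pool with probability equal to its normalized Hashrate, can be checked numerically. The sketch below is only an illustration under the exponential assumption stated above: each pool draws an exponential discovery time whose rate equals its Hashrate, and the pool with the smallest time wins. The function name, the trial count and the example Hashrates are hypothetical.

```python
import random


def race_winner_frequencies(hashrates, trials=100_000, seed=1):
    """Simulate independent exponential block-discovery times and count winners.

    hashrates maps a pool name to its (positive) normalized Hashrate.
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in hashrates}
    for _ in range(trials):
        # Each pool's discovery time is exponential with rate equal to its Hashrate.
        times = {name: rng.expovariate(rate) for name, rate in hashrates.items()}
        wins[min(times, key=times.get)] += 1
    return {name: count / trials for name, count in wins.items()}


if __name__ == "__main__":
    print(race_winner_frequencies({"Alice": 0.2, "Bob": 0.3, "Henry": 0.5}))
```

The empirical winning frequencies should be close to the Hashrate fractions themselves, which is why the state machine only needs the Hashrates and not the actual mining times.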
Alice (resp. Bob) may release her (resp. his) blocks strategically to force Henry into wasting his computations. When Alice and Bob are both selfish miners, the interaction between the two private chains becomes more complicated because neither of them knows the other’s behaviour. In what follows, we capture all the different states that each miner may encounter.
II-B Selfish Mining Model
Alice maintains a private chain, and so does Bob, while Henry operates on the public chain. Alice and Bob are not aware of each other’s role. We suppose that all the miners work on the same public chain in the beginning, where the starting point is denoted by “0”. The length of each private chain is kept private by its owner, while the length of the public chain is observed by all of them. We consider the selfish mining method proposed in [2], and our analytical approach can be generalized to a variety of other methods.
The mining procedure consists of two cases as follows.

(Public-chain mining case) Henry always mines after the public chain. Alice or Bob also mines on the public chain if it is longer than her or his private chain.

(Private-chain mining case) Alice (resp. Bob) continues to mine on her (resp. his) private chain if she (resp. he) discovers a new block and the private chain is now longer than the public chain.
The release procedure is more complicated than the mining procedure. Henry broadcasts his mined block as soon as it is discovered, while Alice and Bob will decide whether to release their mined blocks depending on the length of the public chain.

(Forfeit case) Alice (resp. Bob) abandons her (resp. his) private chain and conforms to mining after the public chain if the latter is longer. Henry also abandons his public chain if Alice or Bob publishes a longer chain.

(Risk-avoiding release case) Alice (resp. Bob) releases her (resp. his) privately mined blocks to the public for fear of losing them, if the new block is mined by another miner and the lead of her (resp. his) private chain is no more than two blocks.

(Chain reaction case) When Alice (resp. Bob) releases her (resp. his) blocks to the public chain and updates its length, the release of Bob’s (resp. Alice’s) private blocks is triggered immediately.
The chain reaction case is a combination of the forfeit and the risk-avoiding cases, and its existence complicates the evolution of the public chain. Suppose that Alice publishes her private blocks to obsolete the current public chain. Once the new public chain is constructed, Bob may immediately release his private chain to override it in turn.
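The release rules above can be sketched as a small decision function. This is only one reading of the text, not the authors' implementation: it assumes that the "no more than two blocks" lead is measured before the other miner's block arrives, so that the lead afterwards is at most one. The function name and the action labels are hypothetical.

```python
def react_to_public_block(private_len, public_len):
    """Action of a selfish pool right after another miner extends the public chain.

    private_len: length of the pool's private chain.
    public_len:  length of the public chain, already including the new block.
    Both lengths are counted from the common starting point "0".
    """
    lead = private_len - public_len
    if lead < 0:
        return "forfeit"    # public chain is longer: abandon the private chain
    if lead <= 1:
        return "release"    # lead was at most two before the new block: publish
    return "withhold"       # comfortable lead: keep mining privately


if __name__ == "__main__":
    for private_len in (0, 1, 2, 4):
        print(private_len, react_to_public_block(private_len, 1))
```

A chain reaction then corresponds to calling this rule again for the other selfish pool whenever a release changes the length of the public chain.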
II-C Release Procedure and Tie-breaking Logic
The consensus on the public chain requires that it is the longest. A crucial question is how the public chain evolves when it is of the same length as the private chain of Alice or Bob. In general, each miner works on his own chain, and the release behavior of Alice and Bob is triggered when Henry mines a new block. We hereby illustrate the evolution of the private and public chains, where the block labels indicate whether a block belongs to Alice, Bob or Henry. The blocks of private chains are in grey and those of public chains are in white.
Risk-avoiding release case. We show the risk-avoiding release of Alice’s private chain in Figure 1. Alice is only one block ahead of Henry after the latter mines a new block for the public chain. Because Alice fears losing the competition, she publishes her private blocks, obsoleting Henry’s public chain, so that both Alice and Henry mine on the new longest chain afterwards.
Tie-breaking resolution. If Alice’s private chain is only one block ahead of Henry’s, Henry may catch up with her. When this happens, Alice publishes her private blocks immediately to compete with Henry. Thus, two public chains of the same length exist in Figure 2. Since only one public chain prevails, a tie-breaking rule needs to be taken into account. The first case is that the public chains of Alice and Henry have the same length, and Bob’s private chain is either of length 0 or very long. Hence, we only need to resolve the tie between Alice and Henry. All the miners may mine after Alice’s block, while Bob and Henry may mine after Henry’s block. There are five possibilities of extending the longest public chain, and the shorter one will be obsoleted. We omit the tie-breaking between Bob and Henry because it can be analyzed in the same way.
In the situation where Alice and Bob each hide one private block, they will publish their private chains instantly after Henry finds a new block. As shown in Figure 3, there exist three competing public chains. Alice will mine after her own block and Bob will mine after his own block for sure; Henry is not aware of which chain is maliciously forked, so he may mine on any of the public chains. There are again five possible situations. The risk-avoiding release, together with the two tie-breaking resolutions, constitutes all the dynamics of the private and public chains.
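The five situations of the three-party tie can be enumerated explicitly. The sketch below is an illustration under the definitions of Section II-A, assuming that the chain receiving the next block prevails; the parameter names (`alpha_a`, `alpha_b`, `alpha_h` for the Hashrates, `theta_a`, `theta_b` for Henry's three-party tie-breaking probabilities) and the example values are hypothetical.

```python
def three_party_tie_outcomes(alpha_a, alpha_b, alpha_h, theta_a, theta_b):
    """Probability of each (miner, extended chain) pair in a three-way tie."""
    return {
        ("Alice", "Alice's chain"): alpha_a,     # Alice always extends her own chain
        ("Bob", "Bob's chain"): alpha_b,         # Bob always extends his own chain
        ("Henry", "Alice's chain"): alpha_h * theta_a,
        ("Henry", "Bob's chain"): alpha_h * theta_b,
        ("Henry", "Henry's chain"): alpha_h * (1.0 - theta_a - theta_b),
    }


def winning_chain_probabilities(outcomes):
    """Aggregate the outcome probabilities by the chain that gets extended."""
    totals = {}
    for (_miner, chain), prob in outcomes.items():
        totals[chain] = totals.get(chain, 0.0) + prob
    return totals


if __name__ == "__main__":
    outcomes = three_party_tie_outcomes(0.2, 0.2, 0.6, 1 / 3, 1 / 3)
    print(winning_chain_probabilities(outcomes))
```

With equal attacker Hashrates of 0.2 and a uniform choice by Henry, each attacker's chain prevails with probability 0.4 and Henry's own chain with probability 0.2 in this sketch.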
Chain reaction release. We next introduce the chain reaction release, which complicates the evolution of the private and public chains. Note that the chain reaction release consists of a sequence of risk-avoiding releases and tie-breaking resolutions. Figure 4 illustrates an example of how the chain reaction phenomenon is triggered. At stage 1, Alice’s private chain contains four blocks while the lengths of Bob’s private chain and Henry’s public chain are 0. After a tie-breaking resolution at stage 2, the longer public chain contains two blocks and the shorter one is orphaned. Bob then constructs a new private chain, while Henry continues to mine one block on the public chain at stage 4. From Alice’s perspective, her private chain is now merely one block ahead of the public chain. She releases her private blocks in order to avoid the risk of losing the race with Henry, which forms a new public chain. Next, stages 5 and 6 constitute a new round of tie-breaking resolution between Alice and Henry, which extends the public chain. However, this release triggers Bob to release all of his private blocks. Looking back over all the mining stages, we observe that the winning branch switches back and forth, making the analysis of selfish mining extremely complicated.
III Finite State Machine
In this section, we construct the state machine of blockchain selfish mining and present the steady-state and transient analysis of the profitable threshold.
III-A Steady-state Analysis
We hereby formulate a finite state machine to characterize the evolution of the private and public chains. Figure 5 illustrates the state machine when the maximum length of the private chain is two (i.e. $N = 2$). We define the state as a three-tuple consisting of the chain lengths of Alice, Bob and Henry. The arrows indicate the corresponding state transitions and the associated values represent the transition probabilities. For instance, all the transitions to $(0,0,0)$ mean that the forked chains collapse to a unanimous public chain and a new round of selfish mining starts. Denote by $\pi_s$ the steady-state probability of state $s$. Denote by $r_A$ (resp. $r_B$, $r_H$) the average number of valid blocks mined by Alice (resp. Bob, Henry). Using the standard approach, we obtain them as follows [5].
(1) 
(2)  
(3)  
(4)  
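As a generic illustration of the standard approach, the steady-state distribution of a finite Markov chain can be computed by repeatedly applying the transition matrix until the distribution stops changing. The sketch below is not the state machine of Figure 5: the three-state transition matrix is a toy example with an assumed Hashrate of 0.3, and the function name is hypothetical.

```python
def steady_state(P, tol=1e-12, max_iter=100_000):
    """Return the stationary distribution of a row-stochastic matrix P.

    Uses power iteration, which converges for irreducible aperiodic chains.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Toy chain over private leads 0, 1, 2 (illustrative only): the attacker
# extends its lead with probability 0.3, and any other event resets the lead.
TOY_P = [
    [0.7, 0.3, 0.0],
    [0.7, 0.0, 0.3],
    [1.0, 0.0, 0.0],
]

if __name__ == "__main__":
    print(steady_state(TOY_P))
```

The expected number of blocks credited to each miner is then a weighted sum over states, with the stationary probabilities as weights.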
When $N$ is large (e.g. three or four), the finite state machine becomes more complicated. Due to limited space, we leave the detailed analysis to the technical report [6] and present only the closed-form results here.
(5)  
(6)  
(8)  
Note that the cases with $N > 4$, where $N$ is the maximum length of the private chain, are not considered in the modeling. Apart from their complexity, a large $N$ may cause many consecutively orphaned blocks, so that the selfish mining can be easily detected. Later on, our simulation confirms the convergence of the profitable threshold at $N = 4$, i.e. the difference between $N = 4$ and a sufficiently large $N$ is very small.
III-B Transient State Analysis
According to the data from [4], the Hashrate of the Bitcoin system grows exponentially. It is therefore necessary to study the transient behavior of an attack. We model the actions during one difficulty adjustment period and explore the relationship between the number of periods and the attackers’ Hashrate.
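For a better description, we define the concepts of absolute revenue and relative revenue. First, the relative revenues of Alice, Bob and Henry are the proportions of their revenues to the total revenue. Writing $r_A$, $r_B$ and $r_H$ for the average numbers of valid blocks mined by Alice, Bob and Henry, and $R_A$, $R_B$ and $R_H$ for their relative revenues, these are
$$R_A = \frac{r_A}{r_A + r_B + r_H}, \tag{9}$$
$$R_B = \frac{r_B}{r_A + r_B + r_H}, \tag{10}$$
$$R_H = \frac{r_H}{r_A + r_B + r_H}. \tag{11}$$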
Since we ignore the influence of transaction fees and other factors, miners can only get revenue from published blocks. Based on this, we define the absolute revenue as the number of valid blocks obtained per unit of time. In the Bitcoin system, we take 10 minutes as the unit of time.
Within one adjustment interval (i.e. one difficulty adjustment period), the state machine determines how many blocks appear on the longest public chain and how many blocks are mined in total during one attack round (i.e. from state (0,0,0) back to state (0,0,0)). The analysis further involves the total time spent in the adjustment interval, the Hashrate of the whole system (to account for changes in computing power), and the theoretical and actual times spent mining one block during the adjustment interval. Taking Alice as an example, we obtain the following equations:
(12) 
After the first period, the system adjusts the difficulty so that one block is mined per ten minutes on average. We can then obtain the new average block generation time during the period. Alice’s absolute revenue can be expressed as Eq. (15):
(15) 
(16) 
(17) 
(18) 
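The role of the difficulty adjustment can be illustrated with a deliberately simplified sketch; it does not reproduce Eqs. (12) to (18). It assumes a constant total Hashrate, so that the first period is stretched by the orphaned blocks and every later period lasts the nominal 14 days (2016 blocks at 10 minutes each). The inputs `relative_revenue` (the attacker's share of main-chain blocks) and `main_chain_fraction` (the share of all mined blocks that end up on the main chain before the first adjustment) are hypothetical quantities that would come from the steady-state analysis.

```python
import math

DAYS_PER_PERIOD = 14  # nominal length of one Bitcoin difficulty period


def periods_until_profitable(alpha, relative_revenue, main_chain_fraction):
    """Smallest number of difficulty periods after which attacking beats honest mining.

    After k periods the attacker holds k * relative_revenue (in units of 2016 blocks),
    while honest mining over the same time would have earned
    alpha * (k + extra_time). Returns None if the attack never pays off.
    """
    if relative_revenue <= alpha:
        return None
    # Extra duration of the first period, in units of one nominal period.
    extra_time = 1.0 / main_chain_fraction - 1.0
    deficit = alpha * extra_time
    gain_per_period = relative_revenue - alpha
    return math.floor(deficit / gain_per_period) + 1


if __name__ == "__main__":
    k = periods_until_profitable(0.30, 0.35, 0.80)
    print(k, "periods, i.e.", k * DAYS_PER_PERIOD, "days")
```

In this simplified model the attacker can never be ahead after the first period alone, because the blocks credited to the attacker cannot exceed the blocks it actually mined; the gain only accumulates once the difficulty has been lowered.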
IV Simulation Results
In order to verify the validity of the theoretical analysis, we compare its results with those of a Bitcoin system simulator in this section. We set the block generation process to be exponentially distributed and run the simulator a million times. Based on the simulation results and the theoretical results, we make the following observations:
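As a rough illustration of how such a simulator can be organized, the sketch below simulates the simpler single-attacker strategy of [2] rather than the two-attacker system studied here. Because mining is memoryless, each step only decides who finds the next block. The function name, the number of simulated blocks and the random seed are hypothetical choices.

```python
import random


def simulate_single_selfish_miner(alpha, gamma, n_blocks=200_000, seed=7):
    """Estimate the selfish pool's relative revenue by Monte Carlo.

    alpha: Hashrate fraction of the selfish pool.
    gamma: fraction of honest miners that mine on the pool's branch in a tie.
    """
    rng = random.Random(seed)
    lead = 0             # unpublished advantage of the private chain
    tie = False          # True while two equal-length public branches compete
    pool = others = 0    # blocks that end up on the longest chain
    for _ in range(n_blocks):
        pool_finds = rng.random() < alpha
        if tie:
            if pool_finds:
                pool += 2            # pool extends its own branch and wins both
            elif rng.random() < gamma:
                pool += 1            # honest block lands on the pool's branch
                others += 1
            else:
                others += 2          # honest branch wins
            tie = False
        elif pool_finds:
            lead += 1                # keep the new block private
        elif lead == 0:
            others += 1              # nothing hidden: honest block is adopted
        elif lead == 1:
            tie, lead = True, 0      # publish the hidden block and race
        elif lead == 2:
            pool += 2                # publish both blocks and override
            lead = 0
        else:
            pool += 1                # reveal one block and stay ahead
            lead -= 1
    return pool / (pool + others)


if __name__ == "__main__":
    print(simulate_single_selfish_miner(0.3, 0.5))
```

Comparing the estimated relative revenue with the pool's Hashrate indicates whether the attack is profitable for a given parameter pair.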
Observation 1
When there are multiple attackers in the Bitcoin system, the attackers’ minimum profitable thresholds decrease and the system security is degraded.
For a system with a single attacker, [2] showed that when there are branches, if $\gamma = 1/2$ (the fraction of honest miners that mine on the attacker’s branch), the profitable threshold for the attacker is 25%. [3] shows that when there are two attackers with the same Hashrate, the profitable threshold is lower than 25% and it is easier to launch selfish mining. We model this process with the state machine presented in Section III, and the mathematical model confirms this conclusion.
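For reference, the single-attacker baseline can be evaluated directly. The sketch below codes the closed-form relative revenue and profitable threshold for one selfish miner as given in [2]; it does not cover the two-attacker expressions of this paper. Here `alpha` is the attacker's Hashrate and `gamma` the fraction of honest miners that mine on the attacker's branch in a tie; the function names are hypothetical.

```python
def selfish_revenue(alpha, gamma):
    """Relative revenue of a single selfish miner, closed form from [2]."""
    num = alpha * (1 - alpha) ** 2 * (4 * alpha + gamma * (1 - 2 * alpha)) - alpha ** 3
    den = 1 - alpha * (1 + (2 - alpha) * alpha)
    return num / den


def profit_threshold(gamma):
    """Smallest Hashrate above which selfish mining beats honest mining in [2]."""
    return (1 - gamma) / (3 - 2 * gamma)


if __name__ == "__main__":
    print(profit_threshold(0.5))        # threshold for an even tie split
    print(selfish_revenue(0.3, 0.5))    # relative revenue above the threshold
```

At the threshold the relative revenue equals the Hashrate itself, which is what "profitable" means in this comparison.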
Under the parameter setting considered here, differentiating Eq. (10) shows that Bob’s profitable threshold reaches its minimum of 21.06% when Alice’s Hashrate is 16%. When Alice’s Hashrate is less than 16%, the derivative of Eq. (10) is greater than 0, which means that Bob’s threshold is monotonically decreasing in Alice’s Hashrate. When Alice’s Hashrate is more than 16%, the derivative of Eq. (10) is less than 0, which means that Bob’s threshold is monotonically increasing in Alice’s Hashrate. In Figure 7, the blue curves represent the theoretical results and the red dots represent the simulation results. The three blue curves represent three cases, in which the maximum private-chain length $N$ is 2, 3 and 4. We can observe that Bob’s threshold attains its minimum when Alice’s Hashrate is around 16%. Through calculation and simulation, the attackers’ profitable threshold is 27%, 23% and 22% when $N$ is 2, 3 and 4 respectively, if Alice and Bob own the same Hashrate. This shows that two attackers can adopt strategies to attack successfully with less than 25% of the total Hashrate.
Figure 13 also supports this result. The blue curve and the red curve show the relationship between $N$ and the threshold when there is only one attacker (situation 1) and when there are two attackers (situation 2) in the Bitcoin system, respectively. Under the same conditions, the threshold in situation 1 is always higher than that in situation 2.
After [2] was published, people realized that a mining pool with more than 25% of the Hashrate can attack successfully, so the system constrains the Hashrate of mining pools to defend against the attack. Through the state machine model, we show that this is not enough. In fact, an attack is much easier than currently believed: the Bitcoin system is easier to attack and its security is more fragile.
Observation 2
If the maximum private-chain length $N$ is no larger than 4, there is a negative correlation between Bob’s lowest profitable threshold and $N$, while his revenue and $N$ are positively correlated. In the Bitcoin system, whether there is one attacker or two attackers, the profitable threshold converges as $N$ grows.
The lowest threshold decreases as $N$ becomes larger. We use Figure 10 to describe the revenue situations. The three images represent Bob’s revenue when the two attackers’ Hashrates change separately. Under the parameter setting considered in this situation, Alice’s and Bob’s revenues are symmetric. In these figures, the blue part is the revenue and the purple part highlights where Bob can gain additional income from the attack; in other words, the intersection of the blue and the purple parts is the threshold curve in Figure 7.
In Figure 13, situation 1 shows that when there is only one attacker, the threshold converges to 25% as $N$ increases. The convergence becomes smooth once the attacker can own more than 5 private blocks. Situation 2 shows that when there are two attackers in the system, the relationship between $N$ and the threshold is consistent with the single-attacker case: it is also a convergence process, but the convergence is much faster. When $N$ is 4, the threshold has converged, at 21.48%. Situation 3 and situation 4 show that when Alice owns 25% and 30% of the Hashrate, Bob’s threshold also converges.
This is because, without destroying the normal operation of the system, Henry’s Hashrate is dominant (the rationale for this premise is explained in the next part). Based on this premise, in a real-world situation attackers have only a small probability of owning many private blocks and always staying in the lead. Hiding more private blocks can indeed increase the attackers’ revenue. However, a long private chain easily exposes the identity of the attacker, since the blocks of a longer private chain are easier to distinguish from normal blocks when they are published. On the other hand, without knowing about the existence of another attacker, the risk of losing all of one’s private blocks gets higher if $N$ is large. To be safe, the attacker might choose to disclose a certain number of private blocks to secure the corresponding income. In addition, this strategy can also rule out the impact of double-spending. For the above reasons, it is better to publish all private blocks once the length of the private chain reaches 4, and then start the next round of attack. [7] proposes that setting up the timeliness of blocks can effectively resist selfish mining attacks. The convergence of the threshold shows that this method is ineffective in the current Bitcoin system, because by default a transaction requires 6 valid blocks to be confirmed. Unfortunately, the threshold converges before six blocks.
Observation 3
In order to ensure that the attack can proceed normally, Henry must hold the highest Hashrate among the three pools.
As a counterexample, if Alice has the highest Hashrate and there is no limit on the length of the private chain, Alice can keep her private chain hidden for as long as possible. She stays in the lead in most cases during the attack, so that Alice’s private chain becomes the only valid chain. In this case, Bob and Henry will choose to stop mining to reduce their losses. We can speculate that under this circumstance Alice’s revenue can be close to 100% and her attack actually becomes meaningless. This kind of attack is similar to a 51% attack. Simulation results also support this. In Figure 13, situation 1 is obtained when Alice’s Hashrate is 45%, Bob’s Hashrate is 25% and Henry’s Hashrate is 35%, and situation 2 is obtained when Alice’s Hashrate is 45%, Bob’s Hashrate is 35% and Henry’s Hashrate is 25%. It shows that the longer the attacker’s private chain is, the more he can gain. As long as one attacker has the highest Hashrate, this situation can happen, regardless of how much Hashrate the other miners have. According to this analysis, it is meaningful to stipulate that Henry should have the highest Hashrate in the attack model.
Observation 4
The attackers will fail during the first difficulty adjustment period regardless of their Hashrate. However, they might gain profit after several periods, and the number of periods is related to their Hashrate.
Assuming that the two attackers have the same Hashrate, we ran the simulation and obtained Figure 6. The relative revenue and the absolute revenue are equal within the allowable range of error. Therefore, the relative revenue and the absolute revenue play the same role in representing the benefit.
As Eq. (15) shows, when Alice has more Hashrate, she can obtain illegal revenue earlier. Figure 13 shows that the simulation results match the theoretical results well. The horizontal axis represents the attack round and the vertical axis represents the attackers’ revenue; the blue curve is the theoretical result and the red dots are the simulation results. It shows that when the attackers’ Hashrate is relatively small, it takes a rather long period to gain profit. This means that in a real system it is somewhat hard to perform the attack. If the global Hashrate increases, this formula can also be used to calculate when to stop the attack in order to benefit the most.
V Conclusion
In this paper, we study how the existence of multiple misbehaving pools influences the profitability of selfish mining. By establishing a Markov chain model to describe the actions of attackers and honest miners, we find that the minimum profitable threshold is symmetric at 21.48%. Considering the difficulty adjustment, we model the transient process and discover a negative correlation between the time needed to become profitable and the attackers’ mining power.
References
 [1] S. Nakamoto. “Bitcoin: A peer-to-peer electronic cash system”, 2008.
 [2] I. Eyal and E. G. Sirer. “Majority is not enough: Bitcoin mining is vulnerable”. In Financial Cryptography and Data Security. Springer, 2014, pp. 436-454.
 [3] Q.H. Liu, N. Ruan, et al. “On the Strategy and Behavior of Bitcoin Mining with N-attackers”. Proc. of the Asia Conference on Computer and Communications Security, pp. 357-368, 2018.
 [4] https://bitinfocharts.com/comparison/bitcoinhashrate.html
 [5] A. Papoulis, S. U. Pillai. Probability, random variables, and stochastic processes. Tata McGraw-Hill Education, 2002.
 [6] Technical report. http://medianet.azurewebsites.net/newpage/
 [7] R. Pass, E. Shi. “Fruitchains: A fair blockchain”. Proc. of the ACM Symposium on Principles of Distributed Computing, pp. 315-324, 2017.
 [8] A. Sapirshtein, Y. Sompolinsky, A. Zohar. “Optimal selfish mining strategies in bitcoin”. International Conference on Financial Cryptography and Data Security, pp. 515-532, 2016.