On the Performance of Pipelined HotStuff

07/11/2021 ∙ by Jianyu Niu, et al. ∙ The University of British Columbia

HotStuff is a state-of-the-art Byzantine fault-tolerant consensus protocol. It can be pipelined to build large-scale blockchains. One of its variants called LibraBFT is adopted in Facebook's Libra blockchain. Although it is well known that pipelined HotStuff is secure against up to 1/3 of Byzantine nodes, its performance in terms of throughput and delay is still under-explored. In this paper, we develop a multi-metric evaluation framework to quantitatively analyze pipelined HotStuff with respect to its chain growth rate, chain quality, and latency. We then propose two attack strategies and evaluate their effects on the performance of pipelined HotStuff. Our analysis shows that the chain growth rate (resp. chain quality) of pipelined HotStuff under our attacks can drop to as low as 4/9 (resp. 12/17) of that without attacks when 1/3 of the nodes are Byzantine. As another application, we use our framework to evaluate certain engineering optimizations adopted by LibraBFT. We find that these optimizations make the system more vulnerable to our attacks than the original pipelined HotStuff. Finally, we provide two countermeasures to thwart these attacks. We hope that our studies can shed light on the rigorous understanding of the state-of-the-art pipelined HotStuff protocol as well as its variants.


I Introduction

In 2008, Nakamoto invented the concept of blockchain, a mechanism to maintain a distributed ledger for the cryptocurrency Bitcoin [23]. The core novelty behind blockchain is Nakamoto Consensus (NC), an unconventional (at that time) synchronous Byzantine fault-tolerant (BFT) consensus [12, 29]. Despite the huge impact of Bitcoin, NC suffers from long confirmation latency and low transaction throughput, both of which prevent the original blockchain from supporting Internet-scale applications. For example, Bitcoin today can only process up to seven transactions per second with a confirmation latency of hours. On one hand, the long latency is a result of the probabilistic safety guarantee and lack of finality: a short latency cannot guarantee high confidence that a transaction has been confirmed. On the other hand, the low throughput is mainly due to the speed-security tradeoff: a higher transaction throughput leads to more severe forking, which greatly reduces the honest computation power against adversaries, making the system less secure [32].

One promising approach to addressing these dilemmas is to leverage classical BFT consensus [31, 7], which is also referred to as BFT state machine replication (SMR) and has been extensively studied for the last few decades. Unlike NC, classical BFT protocols provide a strong safety guarantee: once a transaction is confirmed, it stays confirmed forever. Hence, clients do not need long waiting periods to confirm transactions (which means a shorter transaction latency), and transaction processing does not need to be traded off against security (which implies a higher throughput). For example, experiments have demonstrated that PBFT [7], a pioneering BFT protocol, can process tens of thousands of transactions per second in a LAN [4] and achieve latencies of only hundreds of milliseconds in a WAN [33]. Despite all of these advantages, it is technically challenging to apply classical BFT protocols in a blockchain setup. First, classical BFT protocols have a high message complexity (e.g., O(n^2) messages for committing one block), so the number of participants is usually limited to a few dozen [1, 14]. In other words, classical BFT protocols suffer from scalability issues and cannot support a large-scale blockchain. Second, classical BFT protocols are notoriously difficult to develop, test, and prove correct [21, 14, 35]. Finally, classical BFT protocols rarely consider fairness among leaders. (Blockchain systems usually reward leaders with self-issued tokens to incentivize protocol participation [23]; leadership fairness is thus the foundation of such an incentive mechanism for fairly rewarding nodes.) In most leader-based BFT protocols [7], a node can serve as a leader as long as it behaves well. This is also called stability-favoring leader rotation [9], since it avoids the message overhead of frequent leader rotation.

HotStuff, proposed by Yin et al. [37], is a state-of-the-art BFT consensus protocol which leverages the community's advances of the last several decades and achieves strong scalability, prominent simplicity, and good practicality for large-scale applications like blockchains. HotStuff creatively adopts a three-phase commit rule (rather than the two-phase commit rule used in classical BFT [7]) to enable the protocol to reach consensus at the pace of the actual network delay (a property called responsiveness in the literature) and leverages threshold signatures to realize linear message complexity. HotStuff can be further pipelined, which enables frequent leader rotation and leads to a simple and practical approach to building large-scale blockchains. (Pipelined HotStuff is also referred to as chained HotStuff [37].) Due to these salient properties, Facebook adopts a variant of pipelined HotStuff called LibraBFT [2] for its global payment system, the Libra blockchain, which aims to foster fintech innovation and to enable billions of consumers and businesses to conduct instantaneous, low-cost, highly secure transactions. In addition, Dapper Labs describes how to deploy HotStuff in its Flow platform [17], and Cypherium Blockchain [15] combines HotStuff with NC to build a permissionless blockchain. Although it is well known that pipelined HotStuff is secure against up to 1/3 of Byzantine nodes, its performance in terms of throughput and delay is still under-explored.

In this paper, we first develop a multi-metric evaluation framework to quantitatively analyze pipelined HotStuff’s performance with respect to its chain growth rate, chain quality, and latency. We then propose several attack strategies and evaluate their effects on the performance by using our framework. In addition, we use our framework to evaluate some engineering optimizations adopted by LibraBFT. We find that these optimizations make the system more vulnerable to certain attacks compared with the original pipelined HotStuff. Finally, we provide two countermeasures to thwart these attacks. We hope that our studies can shed light on the rigorous understanding of the state-of-the-art pipelined HotStuff protocol as well as its variants. Our contributions can be summarized as follows:


  • We develop a multi-metric evaluation framework and leverage it to evaluate the impact of several new attacks. Our analysis shows that the chain growth rate (resp. chain quality) of pipelined HotStuff under these attacks can drop to 4/9 (resp. 12/17) of that without attacks when 1/3 of the nodes are Byzantine.

  • We use our framework to evaluate some engineering optimizations adopted by LibraBFT. We find that an adversary controlling the maximum tolerable fraction of corrupted nodes can noticeably increase the expected commit latency of pipelined HotStuff, and that, under the same condition, the adversary can increase the latency of LibraBFT even further.

  • We propose two countermeasures against our attacks, which reduce the latency and improve both the chain growth rate and the chain quality under these attacks.

  • We develop a proof-of-concept implementation of pipelined HotStuff to validate our theoretical findings.

II System Model and Preliminaries

II-A System Model

We consider a system with n nodes denoted by the set N = {1, 2, …, n}. We assume a public-key infrastructure (PKI), and each node has a pair of keys for signing messages (e.g., blocks and votes). We assume that a subset F ⊆ N of f nodes is Byzantine and can behave arbitrarily; the other nodes in N \ F are honest and strictly follow the protocol. In order to ensure security, we require n ≥ 3f + 1. We use β (resp. α) to denote the fraction of Byzantine (resp. honest) nodes; that is, β = f/n and α = 1 − β. For simplicity, all the Byzantine nodes are assumed to be controlled by a single adversary, which is computationally bounded and cannot (except with negligible probability) forge honest nodes' messages.

We assume honest nodes are fully and reliably connected, i.e., every pair of honest nodes is connected by an authenticated and reliable communication link. We adopt the partial synchrony model of Dwork et al. [10]: there is a known bound Δ and an unknown Global Stabilization Time (GST) such that, after GST, every message transmission between two honest nodes arrives within Δ. Hence, the system runs in synchronous mode after GST and in asynchronous mode before GST.

II-B Preliminaries

Quorum Certificate. A block's quorum certificate (QC) is proof that more than 2n/3 nodes (out of n) have signed this block. Here, a QC could be implemented as a simple set of individual signatures or as a threshold signature. We say a block is certified when its QC is received, and certified blocks' freshness is ranked by their round numbers. In particular, we refer to the certified block with the highest round number that a node knows as its newest certified block. Each node keeps track of all signatures for all blocks and keeps updating the newest certified block to its knowledge.

Block and Block Tree. Clients send transactions to leaders, who then batch transactions into blocks. A block is a four-tuple ⟨r, txs, qc, σ⟩, where r denotes the round number at which the block is proposed, txs is a batch of transactions, qc is the QC for the parent block, and σ is the block owner's signature over the first three fields. Every block except the genesis block must specify its parent block and include a QC for the parent block. In this way, blocks are chained. (Note that in Bitcoin [23], blocks are chained through hash references rather than QCs.) As there may be forks, each node maintains a block tree of received blocks.
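To make these data structures concrete, the following Go sketch mirrors the four-tuple block format and the block tree; all type and field names are ours (illustrative only), not taken from any particular HotStuff codebase.

```go
package hotstuff

// QuorumCertificate (QC) proves that a quorum of nodes signed a block.
type QuorumCertificate struct {
	BlockHash []byte   // hash of the certified block
	Round     uint64   // round of the certified block
	Sigs      [][]byte // aggregated or individual signatures from a quorum
}

// Block is the four-tuple of Sec. II-B: round, transactions,
// the QC of its parent, and the proposer's signature.
type Block struct {
	Round    uint64
	Txs      [][]byte           // batch of client transactions
	ParentQC *QuorumCertificate // QC for the parent block (nil only for genesis)
	Sig      []byte             // proposer's signature over (Round, Txs, ParentQC)
}

// BlockTree keeps every received block, since forks are possible.
type BlockTree struct {
	blocks map[string]*Block // keyed by block hash
}
```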

Fig. 1: Overview of a sequence of leader-based round (LBR) instances. The leader-based phase drives progress, while the round-change phase synchronizes nodes to the same round.

II-C Leader-based Round Abstraction

Pipelined HotStuff is executed in a sequence of rounds, and each round has a designated leader. (Rounds are also referred to as views [7, 37], terms [28], instance values/epochs [18], or ballot numbers in the literature.) Each round can be further divided into two phases: (i) a leader-based phase, in which the designated leader proposes a new block and collects votes from other nodes to form a quorum certificate for this block (see Sec. II-B), and (ii) a round-change phase, in which nodes safely advance to the next round if the current-round leader is faulty or no certified block is generated before the timeout, as shown in Fig. 1. Following recent work [34], we assume that each round can be encapsulated in a leader-based round (LBR) abstraction, which provides two important modules: a pacemaker module and a leader-election module. The pacemaker module guarantees that honest validators are synchronized to execute the same rounds with sufficient overlap, so that when the network is synchronous a leader can propose a block that will be supported by honest nodes [37]. That is, an honest leader can send its block proposal to all the other honest nodes and receive their votes within one round. The leader-election module guarantees that nodes are fairly elected as leaders; that is, during synchronous periods, each node has the same chance of winning the leadership for a round. (During asynchronous periods, some honest nodes may not be synchronized to the highest round and so may not participate in the leader election; thus, the adversary can have a higher chance than β of being elected as leader.) For convenience, a leader elected from the honest nodes (resp. Byzantine nodes) is referred to as an honest (resp. adversarial) leader. In the same way, a block proposed by an honest (resp. adversarial) leader is referred to as an honest (resp. adversarial) block.
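A minimal Go sketch of the LBR abstraction, assuming hypothetical interface names; it captures only the two guarantees described above (round synchronization and fair leader election).

```go
package hotstuff

// Pacemaker keeps honest nodes synchronized on the current round: it advances
// the round when a QC for the current round is seen or when the round times out.
type Pacemaker interface {
	CurrentRound() uint64
	AdvanceOnQC(qcRound uint64) // move to qcRound + 1
	AdvanceOnTimeout()          // move to the next round after a timeout
}

// LeaderElection maps each round to a leader; during synchronous periods every
// node should be chosen with equal probability.
type LeaderElection interface {
	LeaderOf(round uint64) (nodeID uint64)
}
```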

III Pipelined HotStuff and LibraBFT Algorithms

III-A Pipelined HotStuff Algorithm

We describe the leader-based phase of pipelined HotStuff at round r. In the beginning, a unique leader for round r is randomly elected by the leader-election module. The leader's identity is known and can be verified by all nodes (see Sec. II-C). Then, the leader proposes a block to extend the newest certified block it has seen and broadcasts this block to all other nodes. (For simplicity, we follow LibraBFT here, i.e., leaders extend the predecessor block with a direct child; in pipelined HotStuff, a leader has to include dummy blocks in its proposal if no certified blocks were generated in previous rounds.) Every node votes for the first block it receives from the leader (recall that an adversarial leader can propose multiple blocks), as long as the block satisfies certain conditions introduced shortly. A vote for a block is a signature on the block. When the leader receives a quorum of unique signatures (including its own), it aggregates them into a QC and sends the QC to the next-round leader, i.e., the leader of round r + 1.
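The leader-side flow of one leader-based phase might look as follows; this is a sketch in the same illustrative package as the earlier types, with networking reduced to assumed callbacks (broadcast, votes, sendToNextLeader), not the authors' implementation.

```go
// runLeaderPhase sketches one leader-based phase at round r: propose a block
// extending the newest certified block, collect a quorum of votes, form a QC,
// and forward it to the next-round leader.
func runLeaderPhase(r uint64, newestQC *QuorumCertificate, txs [][]byte,
	quorum int, broadcast func(*Block), votes <-chan []byte,
	sendToNextLeader func(*QuorumCertificate)) {

	proposal := &Block{Round: r, Txs: txs, ParentQC: newestQC}
	// (Signing is omitted; proposal.Sig would cover Round, Txs, and ParentQC.)
	broadcast(proposal)

	collected := make([][]byte, 0, quorum)
	for sig := range votes { // votes on this proposal, already verified
		collected = append(collected, sig)
		if len(collected) >= quorum {
			qc := &QuorumCertificate{Round: r, Sigs: collected}
			sendToNextLeader(qc) // in pipelined HotStuff the QC is relayed, not broadcast
			return
		}
	}
}
```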

Fig. 2: The chained blocks in pipelined HotStuff. Blocks B_k, B_{k+1}, and B_{k+2} are three consecutive blocks, and nodes will commit block B_k when receiving block B_{k+3}.

We are now ready to describe the condition for voting. Every node maintains two local parameters: (i) last_voted_round, the last round for which the node has voted, and (ii) preferred_round, the highest known round number of a grandparent block among the blocks it has received. For example, in Fig. 2, when a node first receives block B_{k+2} and votes for it, its last_voted_round is updated to k + 2; the newest grandparent block is B_k, and so preferred_round is k. After receiving block B_{k+3}, the node updates last_voted_round to k + 3 and preferred_round to k + 1. A node will vote for the first block proposed by the current-round leader if the block extends the block generated at preferred_round (regardless of round numbers) or the round number of the received block's parent block is greater than preferred_round. Concretely, a block satisfies the above condition if and only if: (i) its round number is greater than last_voted_round, and (ii) the round number of its parent block is greater than or equal to preferred_round. (This condition is adopted by LibraBFT and is shown to be equivalent to the previous condition [2, 3].)
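The two conditions can be expressed as a small predicate over the last_voted_round / preferred_round bookkeeping; a sketch reusing the illustrative Block type from above.

```go
// shouldVote returns true iff a node with the given last_voted_round and
// preferred_round may vote for the proposed block: (i) the block's round is
// higher than any round it has voted in, and (ii) the block's parent (certified
// by block.ParentQC) is at least as fresh as the node's preferred round.
func shouldVote(block *Block, lastVotedRound, preferredRound uint64) bool {
	if block.Round <= lastVotedRound {
		return false
	}
	return block.ParentQC != nil && block.ParentQC.Round >= preferredRound
}
```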

After voting for a new block, a node will insert the block into its block tree and then update its state as follows: (i) update last_voted_round to the round number of this block, and (ii) update preferred_round to the round number of this block's grandparent block if the latter is higher. Meanwhile, the node checks whether there are newly committed blocks in its block tree. Specifically, if there are three blocks B_k, B_{k+1}, and B_{k+2} proposed in three consecutive rounds k, k + 1, and k + 2, and an additional block extends block B_{k+2}, the node will commit block B_k and all its predecessor blocks. The three consecutive blocks are referred to as a 3-direct chain. (Committing via a 3-direct chain requires the additional block extending the three consecutive blocks; three consecutive blocks alone do not trigger a commit. Note also that, for simplicity of analysis, the 3-direct chain is different from the Three-Chain defined in [37], but the differences do not affect our results.) A simple case, in which block B_k is committed, is shown in Fig. 2. Note that nodes maintain a chain containing committed blocks, and this chain is referred to as the main chain in our later analysis.
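A sketch of the post-vote bookkeeping and of the 3-direct-chain commit check, assuming a hypothetical parentOf helper that looks a block's parent up in the node's block tree; again illustrative, not the authors' code.

```go
// onVoted updates the node's voting state after voting for block b.
func onVoted(b *Block, lastVotedRound, preferredRound *uint64, parentOf func(*Block) *Block) {
	*lastVotedRound = b.Round
	if p := parentOf(b); p != nil {
		if gp := parentOf(p); gp != nil && gp.Round > *preferredRound {
			*preferredRound = gp.Round // round of b's grandparent
		}
	}
}

// canCommitGreatGrandparent checks the commit rule when block b is received:
// if b's parent, grandparent, and great-grandparent were proposed in consecutive
// rounds (a 3-direct chain), the great-grandparent (and its ancestors) commit.
func canCommitGreatGrandparent(b *Block, parentOf func(*Block) *Block) (*Block, bool) {
	p := parentOf(b)
	if p == nil {
		return nil, false
	}
	gp := parentOf(p)
	if gp == nil || gp.Round+1 != p.Round {
		return nil, false
	}
	ggp := parentOf(gp)
	if ggp == nil || ggp.Round+1 != gp.Round {
		return nil, false
	}
	return ggp, true // commit ggp and all its predecessors
}
```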

To sum up, in pipelined HotStuff, leaders propose new blocks, every node votes for a new block according to the voting conditions, and a node commits blocks whenever a 3-direct chain followed by another block appears in its block tree. Also, every node starts a timer to track progress in each round. Whenever a timeout happens or a block's QC is received, a node moves to the next round. Such round synchronization is provided by the pacemaker module. In fact, the pacemaker module also guarantees that a new leader has the newest certified block and/or its parent block, which ensures that the leader can propose a block that honest nodes will vote for (see Sec. II-C).

III-B LibraBFT Algorithm

LibraBFT is a variant of pipelined HotStuff with two subtle differences. First, in LibraBFT, a node sends its vote directly to the next-round leader (rather than to the current-round leader) so that the next leader can form a QC and embed the QC in its own block. In this way, the current leader does not need to relay the QC to the next leader; this optimization cuts the delay of the relay operation. (The latest version of HotStuff also adopts this optimization [36].) Second, LibraBFT introduces a new block type called the Nil block. That is, when the timer expires and nodes have not received a proposal for the round, they broadcast a vote on a Nil block (in a predetermined format). If a quorum of nodes has voted, the aggregated signatures serve as a QC for the Nil block. The certified Nil block guarantees that blocks are produced in consecutive rounds despite faulty leaders, which can further accelerate block commitment. For example, in Fig. 2, if the leader of round k + 2 is faulty and no block is produced in that round, nodes cannot commit block B_k even after receiving block B_{k+3} in pipelined HotStuff. By contrast, in LibraBFT, there will be a certified Nil block at round k + 2, and nodes will commit block B_k after receiving block B_{k+3}.
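The two LibraBFT tweaks, seen from a node's side, can be sketched as follows (hypothetical helper callbacks; the Nil-block format and signing details are abstracted away).

```go
// onProposal sends the vote for an accepted proposal to the NEXT round's leader
// (LibraBFT) instead of back to the current leader (pipelined HotStuff).
func onProposal(b *Block, sign func(*Block) []byte, sendVote func(toRound uint64, sig []byte)) {
	sendVote(b.Round+1, sign(b))
}

// onRoundTimeout broadcasts a vote on the round's predetermined Nil block, so
// that a certified (Nil) block can still appear in this round despite a faulty leader.
func onRoundTimeout(round uint64, signNil func(round uint64) []byte,
	broadcastVote func(round uint64, sig []byte)) {
	broadcastVote(round, signNil(round))
}
```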

IV Performance Metrics and Attacks

In this section, we first introduce a multi-metric framework to evaluate the impact of various attacks and then propose several attack strategies.

Iv-a Performance Metrics

We focus on three performance metrics, namely, chain growth rate, chain quality, and latency. All of these metrics are meaningful only after the GST, which implies that the network is in synchronous mode. (Before the GST, there may be no certified blocks at all, in which case the three metrics become meaningless.)

IV-A1 Chain Growth Rate

For a given adversarial strategy that controls a fraction β of the total nodes, the chain growth rate is defined as the rate at which honest blocks are appended to the main chain over the long run. Let H_k denote the total number of honest blocks appended to the main chain in k rounds. (Note that H_k can be a random variable because of the randomness in the leader selection.) We have:

    chain growth rate = lim_{k→∞} H_k / k.    (1)

The chain growth rate corresponds to the liveness in the context of blockchains.

IV-A2 Chain Quality

For a given adversarial strategy that controls a fraction β of the total nodes, the chain quality is defined as the fraction of honest blocks included in the main chain over the long run. Let A_k denote the total number of adversarial blocks appended to the main chain in k rounds. We have:

    chain quality = lim_{k→∞} H_k / (H_k + A_k).    (2)

This metric affects the reward distribution. In blockchains, each block in the main chain brings its proposer a reward [23]. This reward incentivizes nodes to participate in the consensus and to compete for the leadership. Additionally, nodes are expected to receive rewards proportional to their devoted resources (e.g., hash power in Proof-of-Work [23] and stakes in Proof-of-Stake [13]). Intuitively, a chain quality less than the honest fraction 1 − β implies that the adversary can win a higher fraction of rewards than what it deserves, which ruins the incentive compatibility [30, 11, 24, 27].

IV-A3 Latency

For a given adversarial strategy that controls a fraction β of the total nodes, the latency is defined as the average number of rounds that honest blocks take from being included in the main chain until being committed, over the long run. Let T_i denote the number of rounds that the i-th honest block takes to be committed by all honest nodes (rather than by some honest leaders) during the k rounds. (Leaders first commit blocks locally and then send out the proofs of the 3-direct chain to convince other nodes to commit these blocks; it is trivial to extend our analysis to this model.) We have:

    latency = lim_{k→∞} (1/H_k) Σ_{i=1}^{H_k} T_i.    (3)
Remark 1.

Note that the chain growth rate and latency are measured in terms of rounds rather than in time. This round abstraction allows us to ignore the specific implementation of the pacemaker module and focus on the core of pipelined HotStuff.

IV-B Attack Strategies

We introduce two attacks here. The forking attack aims to minimize the chain growth rate and the chain quality by overriding honest blocks. The delay attack aims to maximize the latency by delaying the commitment of honest blocks. Both attacks, which are inspired by the selfish mining attack on Bitcoin, are new in the context of pipelined HotStuff. The optimality of these attacks will be discussed in a journal version of this work. Here, we emphasize that since the performance metrics are measured after the GST, we do not need to consider network-level attacks, such as eclipse attacks [16] and Distributed Denial of Service (DDoS) attacks, which may cause a network partition (i.e., an asynchronous network) and block progress.

Fig. 3: The forking attack on pipelined HotStuff. The adversary is elected as the leader in round k + 3. It proposes a block after block B_k (or B_{k+1}) to override blocks B_{k+1} and B_{k+2} (or block B_{k+2}).

IV-B1 Forking Attack

In pipelined HotStuff, an adversarial leader can create forking blocks on purpose to override honest blocks without any loss. For example, in Fig. 3, if blocks B_{k+1} and B_{k+2} are both honest blocks and the adversarial leader at round k + 3 has no adversarial certified block with a round number larger than k, the adversarial leader will build a block on block B_k. As this block satisfies the voting condition (i.e., the round number of its parent block, k, is no less than honest nodes' preferred_round), nodes will vote for it and all subsequent leaders will extend it. Similarly, if only block B_{k+2} is an honest block, the adversary can build on block B_{k+1} to override this block. In both cases, the adversarial leader overrides some honest blocks and suffers no loss of adversarial blocks. Also, note that the adversarial leader cannot override block B_k and its predecessor blocks, since a block conflicting with B_k cannot reference a certified parent block whose round number is no less than the honest nodes' preferred_round k. In general, if there exist adversarial certified blocks with round numbers no less than preferred_round, the adversarial leader extends the newest adversarial certified block; otherwise, the adversarial leader extends the block produced at round preferred_round.
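The adversarial proposal rule of the forking attack fits in a few lines; a sketch over the illustrative Block type, where adversarialCertified holds the adversary's own certified blocks and blockAt returns the certified block of a given round (both hypothetical helpers).

```go
// forkingAttackParent picks the parent that an adversarial leader extends so as
// to override trailing honest blocks without losing its own: prefer the newest
// adversarial certified block whose round is still acceptable to honest voters;
// otherwise extend the block produced at preferredRound itself.
func forkingAttackParent(adversarialCertified []*Block, preferredRound uint64,
	blockAt func(round uint64) *Block) *Block {

	var best *Block
	for _, b := range adversarialCertified {
		if b.Round >= preferredRound && (best == nil || b.Round > best.Round) {
			best = b
		}
	}
	if best != nil {
		return best
	}
	return blockAt(preferredRound)
}
```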

IV-B2 Delay Attack

The main goal of the delay attack is to break the block commitment condition in order to increase the average delay of honest blocks in the main chain. More specifically, the delay attack prevents honest blocks from forming a 3-direct chain. Recall that a block is committed if and only if three blocks extend it and the first two of these blocks are produced in consecutive rounds after the block's round (i.e., the 3-direct-chain structure). Note that once a block is committed, all its predecessor blocks are also committed.

Delay Attack in pipelined HotStuff. Recall that an honest leader always proposes a block on the newest certified block. By contrast, an adversarial leader can propose a block on a non-newest certified block or propose no block at all. An example is provided in Case A of Fig. 4, in which the adversarial leader observes that the three newest blocks were not proposed in consecutive rounds. In this case, the leader proposes no block at all (leading to a timeout). As a result, subsequent honest leaders have to restart building a 3-direct chain. Another example is illustrated in Case B of Fig. 4, in which the adversarial leader observes three consecutive blocks B_k, B_{k+1}, and B_{k+2}. In this case, the leader proposes a block on top of B_{k+1}, which overrides block B_{k+2} as explained before. In general, if there exist three consecutive blocks ending with the newest certified block, the subsequent adversarial leader overrides the newest certified block; otherwise, the adversarial leader proposes no block.
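The pipelined-HotStuff delay-attack rule reduces to one decision per adversarial round; a sketch with the same hypothetical parentOf helper, returning either the parent to extend or nil for "propose nothing and let the round time out".

```go
// delayAttackParent implements the adversarial rule of Sec. IV-B2 for pipelined
// HotStuff: if the three newest certified blocks were proposed in consecutive
// rounds, override the newest one by extending its parent (Case B); otherwise
// propose no block at all and force a timeout (Case A).
func delayAttackParent(newestCertified *Block, parentOf func(*Block) *Block) *Block {
	p := parentOf(newestCertified)
	if p == nil {
		return nil
	}
	gp := parentOf(p)
	if gp == nil {
		return nil
	}
	if gp.Round+1 == p.Round && p.Round+1 == newestCertified.Round {
		return p // extending p overrides newestCertified
	}
	return nil // no three consecutive blocks: stay silent (timeout)
}
```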

Fig. 4: The delay attack on pipelined HotStuff. The adversary proposes no block at all (Case A) or overrides the newest honest block (Case B), according to whether there exist three consecutive blocks.

Delay Attack in LibraBFT. The delay attack in LibraBFT is slightly different from that in pipelined HotStuff. First, due to the Nil block, the adversary cannot simply propose no block; otherwise, a certified Nil block will be produced that can be part of a 3-direct chain. Instead, the adversarial leader can create a block and send it to only half of the honest nodes. In this way, half of the honest nodes will vote for this block, while the other half will vote for the Nil block. As a result, neither a certified block nor a certified Nil block will be produced in this round. Second, in LibraBFT, as nodes send block votes to the next leader, an adversarial leader can hide the collected QC of a block proposed in the previous round. So, when there are three consecutive blocks, the adversarial leader can hide the QC for the last one. In this way, subsequent honest leaders cannot form a 3-direct chain based on the three consecutive blocks.

V Performance Analysis under Forking and Delay Attacks

In this section, we first analyze the performance of pipelined HotStuff in terms of chain growth rate, chain quality, and latency under the forking and delay attacks. Then, we evaluate some optimizations adopted by LibraBFT. Specifically, we consider a sequence of k rounds (when the network is in synchronous mode) and number these rounds as 1, 2, …, k. For each round, the probability that the elected leader is honest (resp. adversarial) is α (resp. β). Now, let X_i (1 ≤ i ≤ k) denote an indicator random variable that equals one if the leader of the i-th round is honest and equals zero otherwise. As the network is synchronous, nodes can receive a block within Δ time after an honest leader sends it. For simplicity, nodes are assumed to receive the block by the end of the round. In addition, honest leaders are assumed to be able to get the newest honest certified blocks from other nodes before proposing new blocks.

V-A Performance Analysis of Pipelined HotStuff

V-A1 Chain Growth Rate

Recall that an honest block proposed at round i will be overridden by an adversarial leader in round i + 1 or i + 2 under the forking attack (see Sec. IV-B for details). In other words, an honest block can be kept in the main chain if and only if the subsequent two blocks are honest blocks. Let Y_i denote an indicator random variable that equals one if X_i = X_{i+1} = X_{i+2} = 1 and equals zero otherwise. Next, let Y = Σ_i Y_i denote the number of block fragments over the k rounds. The following lemma bounds the value of Y.

Lemma 1.

For k consecutive rounds, the number of block fragments Y has the following Chernoff-type bound: for any 0 < δ < 1,

    Pr[ (1 − δ) α³ k ≤ Y ≤ (1 + δ) α³ k ] ≥ 1 − exp(−Ω(δ² α³ k)).    (4)
Proof.

Without loss of generality, we assume that k is a multiple of 3. Let Z_j = Y_{3j−2} (1 ≤ j ≤ k/3) and Z = Σ_j Z_j. It is easy to show that E[Z] = α³ k / 3, since Pr[Z_j = 1] = α³. Note that Z_1, …, Z_{k/3} are independent random variables, because Z_j is a function of (X_{3j−2}, X_{3j−1}, X_{3j}). Hence, Z is a sum of i.i.d. random variables and, by the Chernoff bound (see [26]), concentrates around its mean. Similarly, the two analogous subsums with offsets 1 and 2 concentrate around their means, and combining the three bounds yields the claimed bound for Y. ∎

This lemma shows that, as k increases, the number of block fragments Y is between (1 − δ)α³k and (1 + δ)α³k with high probability. Moreover, each block fragment corresponds to one honest block included in the main chain. This leads to the following theorem for the chain growth rate.

Theorem 1.

The chain growth rate of pipelined HotStuff under the forking attack converges to α³ with high probability as k → ∞.

Proof.

By Lemma 1 and the definition of the chain growth rate in (1), H_k/k = Y/k converges to α³ with high probability. ∎

Note that without the forking attack, the chain growth rate is α, since the probability that an honest node is elected as a leader is α. In other words, this theorem states that the forking attack reduces the chain growth rate from α to α³. For instance, if β = 1/3, the chain growth rate is reduced from 2/3 to 8/27, i.e., to 4/9 of its value without attacks.
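Theorem 1 can be sanity-checked with a small Monte-Carlo simulation of the leader sequence: under the forking attack, the honest block of a round survives exactly when the next two rounds also have honest leaders, so the surviving rate should approach α³ (8/27 at β = 1/3). The standalone Go sketch below is ours and is independent of the authors' evaluation code.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Simulate k rounds: round i is honest with probability alpha. Under the
// forking attack, the honest block of round i stays in the main chain iff
// rounds i+1 and i+2 are also led by honest nodes.
func chainGrowthRate(k int, alpha float64, rng *rand.Rand) float64 {
	honest := make([]bool, k)
	for i := range honest {
		honest[i] = rng.Float64() < alpha
	}
	kept := 0
	for i := 0; i+2 < k; i++ {
		if honest[i] && honest[i+1] && honest[i+2] {
			kept++
		}
	}
	return float64(kept) / float64(k)
}

func main() {
	rng := rand.New(rand.NewSource(1))
	alpha := 2.0 / 3.0
	got := chainGrowthRate(1_000_000, alpha, rng)
	fmt.Printf("simulated %.4f vs analytical %.4f\n", got, alpha*alpha*alpha)
}
```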

Remark 2.

The chain growth rate is measured in terms of rounds. If it is measured in time, the forking attack can reduce the chain growth rate even more. This is because an adversarial leader can push the duration of its round close to the timeout value, which is usually much longer than the actual network delay for producing a certified block.

V-A2 Chain Quality

Recall that the adversary suffers no loss of blocks when launching the forking attack; that is, every adversarial block can be kept in the main chain. Therefore, the adversary produces βk adversarial blocks in expectation over k rounds. This observation, together with Theorem 1, allows us to derive the following chain quality theorem for pipelined HotStuff.

Theorem 2.

The chain quality of pipelined HotStuff under the forking attack converges to α³/(α³ + β) with high probability as k → ∞.

Proof.

As k → ∞, A_k/k (the number of adversarial blocks in the main chain divided by k) converges to β by Lemma 3 in [26], and H_k/k (the number of honest blocks in the main chain divided by k) converges to α³ by Lemma 1. Hence, the chain quality H_k/(H_k + A_k) converges to α³/(α³ + β). Note that α = 1 − β. ∎

For example, if β = 1/3, the chain quality under the forking attack is 8/17 ≈ 0.47, whereas it should be 2/3 without the forking attack. If each block brings its owner a reward, the adversary can obtain a fraction β/(α³ + β) = 9/17 of the rewards by launching the forking attack, which is always higher than the deserved fraction β, since α³ < α. In other words, the incentive compatibility of pipelined HotStuff no longer holds under the forking attack.
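For concreteness, here is a worked check (ours, using the α = 1 − β notation above) that Theorems 1 and 2 reproduce the ratios quoted in the abstract at β = 1/3, i.e., α = 2/3:

```latex
\[
\frac{\text{chain growth under attack}}{\text{chain growth without attack}}
  = \frac{\alpha^{3}}{\alpha} = \alpha^{2} = \left(\tfrac{2}{3}\right)^{2} = \tfrac{4}{9},
\qquad
\frac{\text{chain quality under attack}}{\text{chain quality without attack}}
  = \frac{\alpha^{3}/(\alpha^{3}+\beta)}{\alpha}
  = \frac{8/17}{2/3} = \tfrac{12}{17}.
\]
```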

Fig. 5: The state transition of the delay attack on pipelined HotStuff.

V-A3 Latency

Here, we compute the latency of honest blocks in the main chain. To achieve this goal, we need to track each honest block included in the main chain and obtain the delay it incurs before being committed. More precisely, both the chance that an honest block is kept in the main chain and its delay are affected by the delay attack strategy, which in turn depends on the chain structure. For example, in Case A of Fig. 4, since the latest blocks do not form a chain of consecutive rounds, the adversary proposes no block and the latest honest blocks are kept in the main chain; in Case B, however, the adversary overrides the newest honest block as described above. Therefore, we need to track the block structure preceding every honest block. To this end, we define four states as follows:


  • S_0: the state where the previous round timed out and no certified block was produced;

  • S_i for i ∈ {1, 2, 3}: the state where there exist i consecutive blocks that are not yet committed.

Under the delay attack described in Sec. IV-B, we can develop a Markov model of the state transitions, shown in Fig. 5. Recall that α (resp. β) is the probability that an honest (resp. adversarial) leader proposes a new block in a round. Each transition corresponds to a new round with a designated leader. An honest leader always proposes a new block that extends the newest certified block (denoted by the red line in Fig. 5). This Markov model allows us to track the chain structure preceding any new honest block, as well as to obtain its chance of being included in the main chain and the associated delay. We have the following theorem on the latency of pipelined HotStuff.

Theorem 3.

The latency of pipelined HotStuff under the delay attack converges, with high probability as k → ∞, to the closed-form expression given in (5) below.

Proof.

First, by solving the above Markov model, we can obtain the steady-state distribution of the four states.

Next, we can analyze the latency of honest blocks in each state transition. In particular, we focus on the blocks that eventually end up in the main chain.

  • [leftmargin=*]

  • Case : . All proposed honest blocks will be kept in the main chain, and their average delay is .

  • Case : . The honest blocks have probability to be committed with an average delay , probability to be committed with an average delay , and probability to be committed with an average delay .

  • Case : and . The honest blocks have probability to be committed with an average delay and probability to be committed with an average delay .

Due to space constraints, the detailed proofs of these cases are provided in Appendix B1 of our technical report [26]. With these results, it is easy to obtain the latency as:

(5)

This completes the proof. ∎

The theorem shows that when the adversary controls 1/3 of the nodes (i.e., α = 2/3 and β = 1/3), the average latency for committing one block under the delay attack grows substantially. By contrast, without the delay attack, a block is committed once the next three consecutive blocks extend it, i.e., with a latency of three rounds.

V-B Performance Analysis of LibraBFT

We now analyze the performance of LibraBFT under the same attack strategies. In particular, we evaluate the differences between pipelined HotStuff and LibraBFT. These differences aim to cut the delay of relaying blocks' QCs or to speed up block commitment (see Sec. III).

On the one hand, as the forking attack strategies are the same, LibraBFT has the same chain growth rate and chain quality as pipelined HotStuff. In other words, these changes do not affect these two metrics. On the other hand, we can develop a similar Markov model to analyze the delay attack in LibraBFT as shown in Fig. 6. This allows us to obtain the following theorem on the latency for LibraBFT.

Fig. 6: The state transition of the delay attack on LibraBFT. The red line denotes a different action adopted by the adversarial leader in LibraBFT.
Theorem 4.

The latency of LibraBFT under the delay attack converges, with high probability as k → ∞, to the closed-form expression given in (6) below.

Proof.

First, by solving the above Markov model, we can obtain the steady-state distribution of the states.

Next, we can analyze the latency of honest blocks in each state transition; we detail the honest blocks for each case below.


  • Case : and . All honest blocks produced after a previous timeout round or just one consecutive block will be kept in the main chain, and their average delay is .

  • Case : and . The honest blocks have probability to be committed with an average delay and probability to be committed with an average delay .

Due to space constraints, the detailed proofs of these cases are provided in Appendix B2 of our technical report [26]. With these results, it is easy to obtain the latency as:

(6)

This completes the proof. ∎

The theorem shows that, for the same fraction of Byzantine nodes, the average latency for committing one block under the delay attack is even larger in LibraBFT than in pipelined HotStuff. This suggests that the mechanism of sending votes to the next-round leader makes the system more vulnerable to the delay attack.

VI Countermeasures

VI-A Broadcasting QCs

The first countermeasure is to have current-round leaders broadcast QCs to all nodes (rather than just relaying QCs to next-round leaders); an earlier version of LibraBFT adopted this mechanism. Broadcasting QCs provides two benefits. First, when nodes receive QCs, they can update their preferred_round, which can effectively thwart the forking attack. For example, in Fig. 7, when honest nodes receive the QC for block B_{k+2}, they can update preferred_round from k to k + 1. As a result, the adversarial leader of round k + 3 can only override block B_{k+2}; without this mechanism, the adversarial leader could override both blocks B_{k+1} and B_{k+2}. Second, broadcasting QCs speeds up block commitment. Specifically, when nodes observe a 3-direct chain of B_k, B_{k+1}, and B_{k+2}, as well as block B_{k+2}'s QC, they can commit block B_k, as shown in Fig. 7. By contrast, in pipelined HotStuff without broadcasting QCs, nodes need to wait until a subsequent block carrying block B_{k+2}'s QC is received before committing block B_k. As discussed in Sec. V-A3, this enables an adversarial leader to hide the QC and break the 3-direct chain to increase block delay. Additionally, even in the ideal case, broadcasting QCs accelerates block commitment: the latency for committing one block is reduced to two rounds. In the following, we provide a formal analysis of the improvement in chain growth rate, chain quality, and delay.
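With QCs broadcast to everyone, the commit check no longer waits for a fourth block: it can fire as soon as a node holds the QC of the third block of a 3-direct chain. A sketch reusing the earlier illustrative helpers (hasQC is a hypothetical lookup of locally known QCs).

```go
// canCommitWithBroadcastQC checks the relaxed rule of Sec. VI-A: given a block b
// whose QC has just been received, commit b's grandparent if b, its parent, and
// its grandparent were proposed in consecutive rounds (a 3-direct chain).
func canCommitWithBroadcastQC(b *Block, hasQC func(*Block) bool,
	parentOf func(*Block) *Block) (*Block, bool) {
	if !hasQC(b) {
		return nil, false
	}
	p := parentOf(b)
	if p == nil || p.Round+1 != b.Round {
		return nil, false
	}
	gp := parentOf(p)
	if gp == nil || gp.Round+1 != p.Round {
		return nil, false
	}
	return gp, true // commit gp and all its predecessors
}
```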

Fig. 7: The committing rule with broadcasting QCs in Pipelined HotStuff.

VI-A1 Chain Growth Rate and Chain Quality

With broadcasting QCs, an honest block proposed at round i can only be overridden by an adversarial leader at round i + 1. In other words, an honest block can be kept in the main chain if and only if the subsequent leader is honest. Let W_i denote an indicator random variable that equals one if X_i = X_{i+1} = 1 and equals zero otherwise. Next, let W = Σ_i W_i. Following our previous analysis, we can bound the value of W:

Lemma 2.

For k consecutive rounds, the number of block fragments W has the following Chernoff-type bound: for any 0 < δ < 1,

    Pr[ (1 − δ) α² k ≤ W ≤ (1 + δ) α² k ] ≥ 1 − exp(−Ω(δ² α² k)).    (7)
Proof.

The analysis is very similar to that for Lemma 1. ∎

With this lemma, we can easily derive the following theorems on chain growth rate and chain quality, respectively.

Theorem 5.

The chain growth rate of pipelined HotStuff with broadcasting QCs under the forking attack converges to α² with high probability as k → ∞.

Theorem 6.

The chain quality of pipelined HotStuff with broadcasting QCs under the forking attack converges to α²/(α² + β) with high probability as k → ∞.

These two theorems can be easily proved by following our previous analysis for pipelined HotStuff. Due to space constraints, we do not provide the proofs here. When β = 1/3, broadcasting QCs improves the chain growth rate from 8/27 to 4/9 (a 1.5x improvement) and the chain quality from 8/17 to 4/7 (roughly a 1.2x improvement).

VI-A2 Latency

Following our previous analysis, we can develop a Markov model of the delay attack in Fig. 8. Note that as the adversarial leader always chooses to propose no blocks, all honest blocks can be kept in the main chain. Moreover, by solving the above Markov model, we have the following theorem.

Theorem 7.

The latency of pipelined HotStuff with broadcasting QCs under the delay attack converges, with high probability as k → ∞, to a closed-form expression in α and β (derived in Appendix B3 of our technical report [26]).

Proof.

All produced honest blocks will be kept in the main chain, and their average delay can be computed from the steady-state distribution of the Markov model. Due to space constraints, the detailed proof is provided in Appendix B3 of our technical report [26]. ∎

This theorem shows that broadcasting QCs substantially reduces the average latency for committing one block under the delay attack. Note that broadcasting QCs also brings additional communication delay. Therefore, it is a design tradeoff, which should be evaluated in real settings in order to decide whether to adopt it.

Fig. 8: The state transition of the delay attack on pipelined HotStuff with broadcasting QCs.

VI-B Longest Chain Rule

Our second countermeasure is to change the block proposing rule. In pipelined HotStuff, an honest node always extends the newest certified block. This deterministic proposing rule enables the adversary to override at most two previous honest blocks with a higher certified block at no cost (decreasing the chain growth rate and chain quality) and to break the 3-direct chain (increasing latency). Therefore, we suggest that nodes extend the longest certified chain instead; in particular, when there are two forking branches of the same length, a node randomly chooses one to extend. This randomized block proposing rule is inspired by the longest chain rule in NC. A detailed analysis of this countermeasure will be provided in a journal version of this work.
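The randomized proposing rule can be sketched as follows (ours, not the paper's specification; assumes math/rand is imported and a hypothetical length helper that returns the number of ancestors of a certified tip): among the certified tips, extend one of maximal length, breaking ties uniformly at random.

```go
// chooseParentLongestChain implements the suggested proposing rule: extend a
// certified tip of maximal chain length, breaking ties uniformly at random.
func chooseParentLongestChain(certifiedTips []*Block, length func(*Block) int,
	rng *rand.Rand) *Block {
	var best []*Block
	bestLen := -1
	for _, tip := range certifiedTips {
		l := length(tip)
		if l > bestLen {
			bestLen = l
			best = []*Block{tip}
		} else if l == bestLen {
			best = append(best, tip)
		}
	}
	if len(best) == 0 {
		return nil
	}
	return best[rng.Intn(len(best))]
}
```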

VII Evaluation

We implement a proof-of-concept of pipelined HotStuff to evaluate its performance in terms of chain growth rate, chain quality, and latency under the forking and delay attacks.

VII-A Testnet Setup

We consider a system of n nodes, in which the fraction of Byzantine nodes is varied up to 1/3. For simplicity, nodes are set to have synchronized clocks, and so the protocol proceeds in synchronized rounds. (In a partially synchronous network, nodes can establish a synchronized clock as long as they have clocks with bounded drift [10].) We build a full-fledged implementation of pipelined HotStuff using Golang. We run the simulation on a late-2013 Apple MacBook Pro with a 2.7 GHz Intel Core i7. In our experiments, the adversary runs the attack strategies in Sec. IV-B. Our simulation results are averaged over multiple runs, where each run generates 100,000 blocks.

(a) The chain growth rate under the forking attack.
(b) The chain quality under the forking attack.
(c) The average latency of honest blocks under the delay attack.
Fig. 9: The performance of pipelined HotStuff as well as LibraBFT under the forking and delay attacks.

VII-B Pipelined HotStuff and LibraBFT

We evaluate the performance of pipelined HotStuff as well as LibraBFT through extensive experiments. Since LibraBFT has the same chain growth rate and chain quality as pipelined HotStuff, these two performance metrics are only given for pipelined HotStuff.

VII-B1 Chain Growth Rate

Fig. 9(a) shows the chain growth rate of pipelined HotStuff for different fractions of Byzantine nodes. First, we observe that the simulation results match the analytical results well. Second, the results show that as the fraction of Byzantine nodes increases, the gap between the chain growth rates with and without the forking attack also increases. When β is close to 1/3, the chain growth rate under the attack drops to almost half of that without attacks.

VII-B2 Chain Quality

Fig. 9(b) shows the chain quality of pipelined HotStuff for different fractions of Byzantine nodes. First, the evaluation results match our previous analysis. Second, the results show that, through the forking attack, the adversary can lower the chain quality and obtain a higher fraction of blocks in the main chain than it deserves. If each block in the main chain brings its owner a reward, this implies that the adversary always gains a disproportionately high fraction of rewards; in other words, incentive compatibility no longer holds under the attack. For example, when β is close to 1/3, the chain quality drops to almost one half. This implies that 1/3 of Byzantine nodes can produce almost half of the blocks in the main chain (and obtain half of the rewards).

VII-B3 Latency

Fig. 9(c) shows the average latency of honest blocks in the main chain for pipelined HotStuff and LibraBFT. First, the evaluation results once again validate our analysis. Second, the results show that as the fraction of Byzantine nodes increases, the latencies of both pipelined HotStuff and LibraBFT increase. In addition, the latency of LibraBFT is larger than that of pipelined HotStuff, which implies that the engineering optimizations adopted by LibraBFT may make it more vulnerable to the delay attack. Note that in both pipelined HotStuff and LibraBFT, the latency for committing one block is three rounds without attacks. Therefore, when β is close to 1/3, the average latency of honest blocks under the delay attack is significantly larger than that without attacks.

VII-C Countermeasures

We evaluate the performance of pipelined HotStuff with broadcasting QCs. Figs. 10(a) and 10(b) show that as the fraction of Byzantine nodes increases, the gap between the chain growth rates (and chain qualities) of pipelined HotStuff with and without broadcasting QCs also increases. This implies that the higher β is, the greater the performance improvement brought by broadcasting QCs. Fig. 10(c) shows that the latency of pipelined HotStuff with broadcasting QCs is at least one round shorter than that of the original protocol; when β is close to 1/3, the reduction in the average latency of honest blocks is even larger. As previously explained, broadcasting QCs also brings additional delay, so it is a design tradeoff that should be evaluated in real settings. Finally, we would like to point out that the longest chain rule can also significantly enhance the performance of pipelined HotStuff; moreover, it brings no additional overhead and can be combined with broadcasting QCs. We will present these results in a journal version of this work.

(a) The chain growth rate under the forking attack.
(b) The chain quality under the forking attack.
(c) The average latency under the delay attack.
Fig. 10: The performance of pipelined HotStuff under the forking and delay attacks with and without broadcasting QCs.

VIII Related Work

Reaching consensus in the face of Byzantine failures was formulated as the Byzantine agreement problem by Lamport et al. [20] and has been studied for several decades. Various BFT consensus protocols such as PBFT [7], Zyzzyva [19], and Q/U [1] have been proposed. However, these classical BFT protocols suffer from poor scalability, notorious complexity, and leader-fairness issues (see Sec. I), and are hard to use in large-scale blockchains. To address these issues, several state-of-the-art BFT protocols [5, 6, 37, 9] have been proposed for building large-scale blockchains.

Tendermint. Tendermint [5] features continuous leader rotation (also called democracy-favoring leader rotation [9]) on top of the PBFT protocol. Specifically, Tendermint embeds the round-change mechanism into the common-case pattern, and the leader is re-elected from all the nodes by some desired policy after every block, resulting in better leadership fairness.

Casper FFG. Buterin and Griffith [6] proposed a protocol called Casper FFG, which works as an overlay atop NC to provide a “finality gadget”. Casper FFG applies an elegant pipelining idea to the classical BFT protocol: if each block requires two rounds of voting, one can piggyback the second round on the next block's voting. This pipelining idea enables the system to have a single, identical round structure (rather than multiple rounds with different functionalities and names; in PBFT, committing one proposal involves a prepare phase and a commit phase, each with a different functionality), and so significantly simplifies the protocol design.

Pala and Streamlet. Pala [9] is a simple BFT consensus protocol that also adopts the pipelining idea. However, for high throughput, it uses a stability-favoring leader rotation policy. Based on this work, Chan et al.  [8] proposed Streamlet, which further simplifies the voting rule. Streamlet aims to provide a unified, simple protocol for both teaching and implementation.

HotStuff. HotStuff proposed by Yin et al.  [37] creatively adopts a three-phase commit rule (rather than the two-phase commit rule used in Casper FFG, Pala, and Streamlet) to enable the protocol to reach consensus at the pace of actual network delay. In addition, HotStuff adopts the threshold signature to realize linear message complexity, and can also be pipelined into a practical protocol for building large-scale blockchains.

Fast-HotStuff. Fast-HotStuff [22] has lower latency than HotStuff and is resilient to the forking attack. Unlike HotStuff, however, Fast-HotStuff adds a small overhead to blocks during the unhappy path (i.e., when the primary fails).

IX Conclusion

The state-of-the-art pipelined HotStuff not only provides linear message complexity and responsiveness but also is efficient for building large-scale blockchains. Thus, pipelined HotStuff has been adopted in many blockchain projects such as Libra, Flow, and Cypherium. In this paper, we propose a multi-metric evaluation framework including chain growth rate, chain quality, and latency. We also propose two attacks, namely the forking attack and delay attack, and systematically study the impacts of these two attacks on the performance of pipelined HotStuff. Also, we leverage the framework to evaluate some engineering designs in LibraBFT. Finally, we propose some countermeasures to enhance the performance of pipelined HotStuff against these attacks. We hope that our framework can contribute to proposing new variants of HotStuff as well as making HotStuff more understandable for developers and practitioners in terms of performance.

References

  • [1] M. Abd-El-Malek, G. R. Ganger, G. R. Goodson, M. K. Reiter, and J. J. Wylie (2005-10) Fault-scalable Byzantine fault-tolerant services. SIGOPS Oper. Syst. Rev. 39 (5), pp. 59–74. Cited by: §I, §VIII.
  • [2] S. Bano, M. Baudet, A. Ching, A. Chursin, G. Danezis, F. Garillot, Z. Li, D. Malkhi, O. Naor, D. Perelman, et al. (2020-05) State machine replication in the Libra blockchain. External Links: Link Cited by: §I, footnote 8.
  • [3] S. Bano, A. Sonnino, A. Chursin, D. Perelman, and D. Malkhi (2020) Twins: white-glove approach for BFT testing. Cited by: footnote 8.
  • [4] A. Bessani, J. Sousa, and E. E. P. Alchieri (2014) State machine replication for the masses with BFT-SMART. In 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, Vol. , pp. 355–362. Cited by: §I.
  • [5] E. Buchman (2016-06) Tendermint: Byzantine fault tolerance in the age of blockchains.. M. Eng. thesis, The University of Guelph, Ontario, Canada. Cited by: §VIII, §VIII.
  • [6] V. Buterin and V. Griffith (2017) Casper the friendly finality gadget. Vol. abs/1710.09437. Cited by: §VIII, §VIII.
  • [7] M. Castro and B. Liskov (1999) Practical Byzantine fault tolerance. In Proceedings of the Third Symposium on Operating Systems Design and Implementation, OSDI ’99, Berkeley, CA, USA, pp. 173–186. External Links: ISBN 1-880446-39-1 Cited by: §I, §I, §VIII, footnote 4.
  • [8] B. Y. Chan and E. Shi (2020) Streamlet: textbook streamlined blockchains. Vol. 2020. Cited by: §VIII.
  • [9] T. H. Chan, R. Pass, and E. Shi (2018) PaLa: a simple partially synchronous blockchain.. Vol. 2018. Cited by: §I, §VIII, §VIII, §VIII.
  • [10] C. Dwork, N. Lynch, and L. Stockmeyer (1988) Consensus in the presence of partial synchrony. Journal of the ACM (JACM) 35 (2), pp. 288–323. Cited by: §II-A, footnote 14.
  • [11] I. Eyal and E. G. Sirer (2018-06) Majority is not enough: Bitcoin mining is vulnerable. Commun. ACM 61 (7), pp. 95–102. Cited by: §IV-A2.
  • [12] J. Garay, A. Kiayias, and N. Leonardos (2015) The Bitcoin backbone protocol: Analysis and applications. In Advances in Cryptology - EUROCRYPT 2015, Berlin Heidelberg, pp. 281–310. Cited by: §I.
  • [13] Y. Gilad, R. Hemo, S. Micali, G. Vlachos, and N. Zeldovich (2017) Algorand: scaling Byzantine agreements for cryptocurrencies. In Proceedings of the 26th Symposium on Operating Systems Principles, SOSP ’17, New York, NY, USA, pp. 51–68. External Links: ISBN 978-1-4503-5085-3 Cited by: §IV-A2.
  • [14] R. Guerraoui, N. Knežević, V. Quéma, and M. Vukolić (2010) The next 700 BFT protocols. In Proceedings of the 5th European Conference on Computer Systems, EuroSys ’10, New York, NY, USA, pp. 363–376. External Links: ISBN 9781605585772 Cited by: §I.
  • [15] Y. Guo, Q. Yang, H. Zhou, W. Lu, and S. Zeng (2020-02) System and methods for selection and utilizing a committee of validator nodes in a distributed system. Note: Patent. Cypherium Blockchain External Links: Link Cited by: §I.
  • [16] E. Heilman, A. Kendler, A. Zohar, and S. Goldberg (2015) Eclipse attacks on Bitcoin’s peer-to-peer network. In Proceedings of the 24th USENIX Conference on Security Symposium, USA, pp. 129–144. Cited by: §IV-B.
  • [17] A. Hentschel, Y. Hassanzadeh-Nazarabadi, R. Seraj, D. Shirley, and L. Lafrance (2020) Flow: separating consensus and compute–block formation and execution. Cited by: §I.
  • [18] P. Hunt, M. Konar, F. P. Junqueira, and B. Reed (2010) ZooKeeper: wait-free coordination for internet-scale systems. In Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference, USENIXATC’10, USA, pp. 11. Cited by: footnote 4.
  • [19] R. Kotla, L. Alvisi, M. Dahlin, A. Clement, and E. Wong (2007) Zyzzyva: speculative Byzantine fault tolerance. ACM SIGOPS Operating Systems Review 41 (6), pp. 45–58. Cited by: §VIII.
  • [20] L. Lamport, R. E. Shostak, and M. C. Pease (1982) The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4 (3), pp. 382–401. Cited by: §VIII.
  • [21] J. Mickens (2014) The saddest moment. Login Usenix Mag 39 (3), pp. 52–54. Cited by: §I.
  • [22] M. M. Jalalzai, J. Niu, and C. Feng (2020) Fast-HotStuff: a fast and resilient HotStuff protocol. Cited by: §VIII.
  • [23] S. Nakamoto (2008) Bitcoin: a peer-to-peer electronic cash system. Working Paper. Cited by: §I, §II-B, §IV-A2, footnote 1.
  • [24] J. Niu and C. Feng (2019-07) Selfish mining in ethereum. In 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), Vol. , pp. 1306–1316. External Links: ISSN Cited by: §IV-A2.
  • [25] J. Niu, C. Feng, H. Dau, Y. Huang, and J. Zhu (2019) Analysis of Nakamoto consensus, revisited. Cited by: Lemma 3.
  • [26] J. Niu, F. Gai, M. M. Jalalzai, and C. Feng (2021) On the performance of pipelined hotstuff. Note: Preprint available at https://github.com/infocom2021HotStuffReport/Report Cited by: §V-A1, §V-A2, §V-A3, §V-B, §VI-A2.
  • [27] J. Niu, Z. Wang, F. Gai, and C. Feng (2020) Incentive analysis of Bitcoin-NG, revisited. In Performance Evaluation: An International Journal, Vol. 144, pp. 102144. Cited by: §IV-A2.
  • [28] D. Ongaro and J. Ousterhout (2014-06) In search of an understandable consensus algorithm. In 2014 USENIX Annual Technical Conference (USENIX ATC 14), Philadelphia, PA, pp. 305–319. External Links: ISBN 978-1-931971-10-2 Cited by: footnote 4.
  • [29] R. Pass, L. Seeman, and A. Shelat (2017) Analysis of the blockchain protocol in asynchronous networks. In Advances in Cryptology – EUROCRYPT 2017, Cham, pp. 643–673. Cited by: §I.
  • [30] R. Pass and E. Shi (2017) FruitChains: a fair blockchain. In Proceedings of the ACM Symposium on Principles of Distributed Computing, PODC ’17, New York, NY, USA, pp. 315–324. Cited by: §IV-A2.
  • [31] M. Pease, R. Shostak, and L. Lamport (1980-04) Reaching agreement in the presence of faults. J. ACM 27 (2), pp. 228–234. External Links: ISSN 0004-5411 Cited by: §I.
  • [32] Y. Sompolinsky and A. Zohar (2015) Secure high-rate transaction processing in Bitcoin. In Financial Cryptography and Data Security, Berlin, Heidelberg, pp. 507–527. Cited by: §I.
  • [33] J. Sousa and A. Bessani (2015) Separating the wheat from the chaff: an empirical design for geo-replicated state machines. In 2015 IEEE 34th Symposium on Reliable Distributed Systems (SRDS), Vol. , pp. 146–155. Cited by: §I.
  • [34] A. Spiegelman and A. Rinberg (2019) ACE: abstract consensus encapsulation for liveness boosting of state machine replication. Cited by: §II-C.
  • [35] M. Vukolić (2016) The quest for scalable blockchain Fabric: Proof-of-Work vs. BFT replication. In Open Problems in Network Security, Cham, pp. 112–125. External Links: ISBN 978-3-319-39028-4 Cited by: §I.
  • [36] M. Yin, D. Malkhi, M. K. Reiter, G. G. Gueta, and I. Abraham (2018) HotStuff: BFT consensus in the lens of blockchain. Cited by: footnote 10.
  • [37] M. Yin, D. Malkhi, M. K. Reiter, G. G. Gueta, and I. Abraham (2019) HotStuff: BFT consensus with linearity and responsiveness. pp. 347–356. Cited by: §I, §II-C, §VIII, §VIII, footnote 3, footnote 4, footnote 9.