I Introduction
Byzantine agreement is one of the central problems in the field of distributed algorithms and cryptography. It also plays an important role in multiparty computation and constructing cryptocurrencies.
In 1982, Lamport, Shostak, and Pease [1] introduced the Byzantine generals problem: several generals want to reach consensus on whether to attack, while some of them may be malicious.
In this paper, we consider the following setting. Suppose there are users, of which at most may be malicious. The malicious users may deviate from the protocol arbitrarily. Each user starts with an initial value . All the users want to decide on one of the initial values, satisfying the following three conditions:

Agreement. Two honest users never decide on different values.

Termination. All honest users terminate in a finite time.

Validity. The decision value must be the initial value of some node. (Footnote 1: The original Byzantine generals problem only considers the binary case, that is, the initial values can only be 0 or 1. There, validity is defined as follows: if all honest nodes start from the same initial value, then all honest nodes must decide on that value. Here we consider the multivalue case, and we follow the definition in [2].)
The protocol that solves such a problem is called Byzantine agreement (BA).
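As an illustration, the agreement and validity conditions can be phrased as simple checks over the decisions of the honest nodes. The following Python sketch is not part of any protocol in this paper; the function names and the dictionary-based representation of decisions are assumptions made for illustration. Termination is a liveness property and cannot be checked on a finite snapshot, so it is omitted.

```python
from typing import Dict, Hashable


def check_agreement(decisions: Dict[str, Hashable]) -> bool:
    """Agreement: no two honest nodes decide on different values."""
    return len(set(decisions.values())) <= 1


def check_validity(decisions: Dict[str, Hashable],
                   initial_values: Dict[str, Hashable]) -> bool:
    """Validity: every decided value is the initial value of some node."""
    proposed = set(initial_values.values())
    return all(value in proposed for value in decisions.values())


# Hypothetical run with three nodes. Node "c" is assumed malicious, so only
# the decisions of the honest nodes "a" and "b" are recorded.
initial_values = {"a": "block-1", "b": "block-2", "c": "block-3"}
decisions = {"a": "block-2", "b": "block-2"}
print(check_agreement(decisions), check_validity(decisions, initial_values))
```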
I-A Byzantine Agreement in Blockchain
Blockchain systems allow many mutually untrusted users to maintain a distributed ledger by consensus. However, long confirmation latency keeps existing blockchain systems out of many everyday applications. For example, the confirmation latency of Ethereum is about 5 to 10 minutes, which is an unrealistically long wait for micropayment systems.
Recently, several proposals have tried to overcome the long latency, but it is challenging to decide who has the right to issue blocks and to guarantee that every user shares the same ledger. Chen and Micali [3] proposed a novel blockchain system, Algorand, that solves the consensus problem by BA. Pass and Shi [4] also proposed a blockchain system, Hybrid consensus, that reduces the latency by BA. The performance and security of such blockchain systems depend heavily on the underlying Byzantine agreement, so it is imperative to design a secure and efficient Byzantine agreement protocol under assumptions that are reasonable for today’s Internet.
Fairness
The incentive model plays an essential role in most of the blockchains. It motivates the miners and validators to execute and follow the protocol. It also relates to the issue and the distribution of the currency. Consequently, if we use BA to decide whose block (initial value) is chosen, whether each participant’s value has an equal probability of being chosen becomes essential and directly influences the economics of the blockchain.
However, the notion of fairness is not captured by the traditional security definition (agreement, termination, validity). Therefore, to measure the performance of BA protocols, especially in the context of blockchains, we propose a new definition of validity, called strongly fair validity. Intuitively, if users join a BA, the BA protocol satisfies strongly fair validity if the probability that one’s value is accepted by some honest nodes is lower-bounded by except with negligible probability.
Synchronous and Asynchronous Network
An extensive literature has studied Byzantine agreement in different network models. In a synchronous network, there is an a priori known upper bound on the network delay, while an asynchronous network has no such bound. For convenience, we call BA protocols designed for the former model synchronous BA and BA protocols designed for the latter model asynchronous BA.
When applied to blockchains, asynchronous BAs usually outperform synchronous BAs in two respects. First, asynchronous BA is more resistant to network failures. Although today’s networks are highly reliable, failures still happen from time to time: undersea cables break, or network services shut down for updates. In these cases, the network delay may be much longer than usual and the security of a synchronous BA is not guaranteed. Second, the performance of synchronous protocols is limited by the a priori upper bound on the network delay. Asynchronous protocols, on the other hand, have no such upper bound, so they proceed as soon as enough messages are delivered, which depends only on the actual network delay.
However, because asynchronous protocols do not depend on any predetermined time bound, they cannot achieve strongly fair validity. (Footnote 2: In fact, even weakly fair validity cannot be achieved. We elucidate this in Section III.) In this paper, we show the following impossibility result.
Theorem 1.
(informal, restated in Theorem 6) In an asynchronous network, no Byzantine agreement tolerating some Byzantine nodes can achieve agreement, termination and strongly fair validity at the same time.
Thus, the question is whether we can have a secure synchronous BA that achieves fair validity while retaining as many of the advantages of asynchronous BAs as possible. The answer is positive. In the following, we introduce two desirable properties for designing synchronous BA.
Partition-Resilience
Algorand agreement, proposed by Chen et al. [5], is a synchronous protocol. In their work, they propose a new property called partition-resilience: a Byzantine agreement protocol is partition-resilient (PR) if agreement always holds even if the network is asynchronous, and termination holds once the network becomes synchronous and all the delayed messages are delivered. Notice that “a synchronous BA with PR” is different from “an asynchronous BA.” For the former, the protocol is still parameterized by a time bound, and some properties other than agreement may still rely on that bound. (Footnote 3: In this paper, fair validity and responsiveness in our protocols depend on the time bound.) An asynchronous BA, on the other hand, performs qualitatively the same regardless of the condition of the network.
Today’s networks are highly reliable, so a synchronous BA with PR enjoys all the desired properties that depend on the time bound most of the time, while agreement still holds even when an occasional failure happens. When applied to blockchains, agreement guarantees that the chain will not fork. Thus, PR is a reasonable requirement for a BA protocol used to build a blockchain.
Responsiveness
Recently, Pass and Shi [4] proposed a blockchain protocol, called Hybrid consensus, whose security depends on the a priori known upper bound on the network delay, while the protocol proceeds as fast as the actual network delay allows. In [4], they defined a performance metric called responsiveness: a protocol is called responsive if its termination time depends only on the actual network delay and not on the a priori known upper bound.
We borrow the same notion and apply it to Byzantine agreement. We say a BA protocol is responsive if all the honest nodes terminate on some values as fast as the actual network proceeds without depending on any predetermined time bound.
Weakly Fair Validity
In this work, however, we show that if a BA protocol is executed only once, it is impossible to achieve both responsiveness and strongly fair validity. Hence, we define a weaker notion of fairness, called weakly fair validity, which captures the decided values when the BA protocol is executed many times. When applied to blockchains, BA is usually executed once for each block. Thus, weakly fair validity is a reasonable metric if we examine the distribution of the proposers over a series of blocks. We formally introduce and justify the definition in Section III.
Our Contributions
To sum up, this paper has two main contributions. First, we formalize the notion of fairness and analyze the relevant properties, including:

we define strongly fair validity, which states that every honest node’s value has a reasonable probability of being decided if the protocol is only executed once;

we define weakly fair validity, which lower-bounds the expected number of times that honest nodes’ values are decided if the protocol is executed many times;

we show that no BA protocol can achieve agreement, termination and weakly fair validity at the same time in an asynchronous network;

we show that no BA protocol can achieve both responsiveness and strongly fair validity, even in a synchronous network.
Second, we propose two partition-resilient BA protocols tolerating up to corruptions that achieve different levels of fairness. The first protocol, called RBA, achieves strongly fair validity, while the second protocol, called HBA, achieves both responsiveness and weakly fair validity. The two protocols not only justify the definitions of fair validity but are also pragmatic and friendly to real-world implementation. If there is no partition, HBA terminates in in the worst case, in the average case and in the best case, where is the number of malicious users and is the actual network latency. In addition, only the predetermined proposer needs to propose a value, so the bandwidth complexity is low. Even if the predetermined proposer crashes, the other users can still reach agreement through the subsequent RBA. In this respect, our protocol avoids a single point of failure and resists DDoS attacks.
Let be the number of nodes joining the protocol and be the number of malicious nodes. Our work can be formally summarized as the following theorems.
Theorem 2.
Synchronous authenticated Byzantine agreement can achieve partition-resilience, strongly fair validity and optimal resilience with

in the best case, 5 rounds termination and communication,

expected 8 rounds termination and communication,

in the worst case, rounds termination and communication against an adaptive adversary.
Theorem 3.
Synchronous authenticated Byzantine agreement can achieve responsiveness, partition-resilience, weakly fair validity and optimal resilience with

in the best case, less than termination and communication,

expected less than rounds termination and communication,

in the worst case, rounds termination and communication against an adaptive adversary.
I-B Related Work
To the best of our knowledge, only Abraham et al. [6] discussed fairness in the context of BA. In that paper, they defined the quality of a BA: the probability of choosing a value that was proposed by an honest node is at least except with negligible probability.
Their definition [6] is not sufficient when the BA is applied to blockchains. Quality views all the honest nodes as a whole, so there may be an honest node whose value is never accepted by other nodes, which is undesirable in blockchains. In contrast, both strongly and weakly fair validity in this paper characterize the behavior of each honest node.
Algorand agreement [5] inspired us to design a synchronous BA that resists network failures. In [5], the authors proposed a partition-resilient BA with leader election based on verifiable random functions. The main contribution of our protocol is that HBA further achieves responsiveness while retaining partition-resilience. Moreover, in Algorand’s design a leader is elected for each iteration. In contrast, our leader election procedure is independent of the iteration index, so the nodes are not required to propose their values at each iteration. As a result, Algorand’s BA only achieves probabilistic finality, while RBA and HBA both terminate in iterations in the worst case, where is the number of malicious nodes.
Therefore, without sacrificing security, HBA outperforms Algorand’s BA. In the best case, HBA terminates as fast as the actual network latency allows; in the worst case, HBA achieves deterministic finality.
Another important related work is practical Byzantine fault tolerance (PBFT) by Castro and Liskov [7]. The notion of responsiveness already appears implicitly in their work [7], but it was formally defined in [4]. To achieve responsiveness, PBFT uses a specific node, the primary, that can be predicted for each view. We adopt the same method in HBA for responsiveness.
When the primary does not follow the protocol, PBFT relies on a view change to switch to the next predetermined primary. However, predictable primaries are easy targets for attacks such as DDoS. The attacker may compromise a series of primaries so that the protocol halts for a long time. In contrast, in HBA, when the primary is malicious and does not broadcast valid messages, the honest nodes initiate RBA, whose leader is selected by a verifiable random function. (Footnote 4: The predetermined node in HBA is called the pioneer. See Section V.) In this case, the attacker cannot predict who will be the leader, so the protocol terminates in constant time in expectation. Precisely, when RBA is initiated, all the honest nodes terminate on some values in .
Hybrid consensus [4], proposed by Pass and Shi, is a responsive blockchain protocol whose responsiveness relies on the underlying Byzantine fault tolerance (BFT) protocol. Briefly, the participants of the underlying BFT are selected by the permissionless Nakamoto consensus, since the consistency of the blockchain guarantees that every honest party agrees on the same set of participants. Hence, HBA can also be adopted as the underlying BFT.
I-C Technical Overview of RBA and HBA
In this paper, we propose two BA protocols. Both of them achieve partition-resilience and tolerate up to corruptions. The first protocol, robust Byzantine agreement (RBA), achieves strongly fair validity. The second protocol, hybrid Byzantine agreement (HBA), achieves responsiveness and weakly fair validity. In the following paragraphs, we highlight the insights into how these protocols achieve these properties. For convenience, we set the threshold of a supermajority to be out of total population , where is the number of Byzantine nodes.
Agreement
RBA is a leader-based protocol. The leader is elected by the pseudorandom value of a verifiable random function, which we call a credential. At the beginning of RBA, each node proposes its value and its credential for being a leader.
Then, each iteration consists of two phases of voting. In the first voting phase, nodes identify the leader by comparing credentials and vote on the leader’s value. If an honest node receives a supermajority of votes for the same value, the node locks the value. In the second voting phase, the nodes vote for their locked values. (Footnote 5: In Section IV, we call a first-phase vote a precommit message and a second-phase vote a commit message.)
If a node locks on some value, it always votes for the locked value in the following iterations unless the locked value is updated. A node locks only one value at a time, and it updates its locked value if it receives a supermajority of votes for the same value in the first phase of a later iteration.
A node terminates if it receives a supermajority of votes for the same value in the second phase. This means that a supermajority of nodes has locked on the value.
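The lock and termination rules above can be sketched as a small vote tally. This is a minimal illustration under stated assumptions, not the RBA algorithm itself: the class and function names are hypothetical, the supermajority threshold is passed in as a parameter rather than fixed, votes are modelled as (sender, value) pairs, and timing, iterations, and signatures are ignored.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional, Tuple

Vote = Tuple[str, Hashable]  # (sender, value); this representation is an assumption


def supermajority_value(votes: Iterable[Vote], threshold: int) -> Optional[Hashable]:
    """Return a value backed by at least `threshold` distinct senders, if any."""
    first_vote = {}
    for sender, value in votes:
        # Count each sender once; repeated votes from a sender are ignored.
        first_vote.setdefault(sender, value)
    for value, count in Counter(first_vote.values()).items():
        if count >= threshold:
            return value
    return None


class VotingNode:
    """Tracks the locked and decided value of one node (illustrative only)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.locked: Optional[Hashable] = None
        self.decided: Optional[Hashable] = None

    def on_first_phase(self, votes: Iterable[Vote]) -> None:
        # A supermajority of first-phase votes locks (or re-locks) a value.
        value = supermajority_value(votes, self.threshold)
        if value is not None:
            self.locked = value

    def on_second_phase(self, votes: Iterable[Vote]) -> None:
        # A supermajority of second-phase votes lets the node decide.
        value = supermajority_value(votes, self.threshold)
        if value is not None:
            self.decided = value
```

For example, with four nodes and an illustrative threshold of three, a node that sees first-phase votes for the same value from three distinct senders locks that value, and it decides only once three second-phase votes for one value arrive.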
Partition-Resilience
We design two mechanisms to achieve this. First, at any time, at most one value can be locked by a supermajority of nodes. Once a supermajority of nodes locks on a value, all honest nodes in the supermajority will only vote for that value in the first phase of the following iterations. It is then impossible for a new value to be locked. Hence, the honest nodes never decide on different values even if a partition exists.
Second, to ensure that nodes can proceed in the same iteration even when network partitions occasionally happen, a node jumps to the newest iteration if it receives a majority of votes in the first phase of that iteration. That is, to withstand network partitions, each node updates its locked value not only according to the timing bound of the synchronous network but also whenever enough valid votes are received asynchronously.
Responsiveness
For HBA, we adopt a predetermined leader (which we call the pioneer). Each node can determine who the pioneer is from some predetermined information before the protocol starts. In the first iteration of HBA, each node votes for the value proposed by the pioneer immediately. If the pioneer is honest and the network operates normally, all the nodes simply decide on the pioneer’s value. Otherwise, if no value is decided after the first iteration, every node starts RBA with the initial state inherited from the first iteration.
Since the pioneer is predetermined in the first iteration, each node decides on the pioneer’s value as soon as it has enough votes in the first and second phases. In other words, the nodes work asynchronously in the first iteration, and thus the latency depends only on the actual network delay instead of the upper bound. Note that there are still two voting phases in case of a network partition.
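The difference between waiting for enough votes and waiting for a predetermined bound can be made concrete with a toy timing model. The sketch below is only an illustration: the function names, the single-phase view, and the example delays are assumptions, and it does not model the two voting phases or any adversarial scheduling.

```python
from typing import Sequence


def responsive_decision_time(arrival_times: Sequence[float], quorum: int) -> float:
    """Time at which the quorum-th vote arrives.

    The result depends only on the actual message delays.
    """
    if not 1 <= quorum <= len(arrival_times):
        raise ValueError("quorum must be between 1 and the number of votes")
    return sorted(arrival_times)[quorum - 1]


def timeout_decision_time(upper_bound: float) -> float:
    """A step that always waits for the a priori upper bound on the delay."""
    return upper_bound


# Hypothetical delays in seconds: three fast votes and one straggler.
arrivals = [0.03, 0.05, 0.04, 2.0]
fast = responsive_decision_time(arrivals, quorum=3)
slow = timeout_decision_time(upper_bound=5.0)
print(fast, slow)
```

In this toy run the responsive step finishes when the third vote arrives, while the timeout-based step takes the full bound regardless of how fast the network actually is.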
Fair Validity
In RBA, every node follows the leader’s value, so to obtain strongly fair validity we have to make sure every node has a reasonable probability of being chosen as the leader. The leader is chosen based on the credentials from all nodes. Thus, nodes have to wait for the worst-case network latency to ensure that all the messages from honest nodes are received.
On the other hand, in HBA, every node follows the pioneer’s value to achieve responsiveness, so other nodes’ values will not be decided if the pioneer and the network work normally. Thus, HBA only achieves weakly fair validity. To do this, the pioneer election is done by permutation. That is, there is a deterministic list for the order of pioneers (e.g., ranking by public key). Suppose there are nodes joining the protocol. In this case, the expected number of times that honest nodes’ values are decided is roughly after executions of the protocol.
Optimal Resilience
The famous result by Dwork et al. [8] shows that no permissioned consensus protocol, even with a public key infrastructure, can tolerate 1/3 or more Byzantine corruptions in an asynchronous network. Conceptually, suppose nodes are divided into three distinct sets: , and , where the nodes in and are honest and the nodes in are malicious. Due to the asynchronous network, the messages between and are delayed arbitrarily long. Without loss of generality, we assume and . If the protocol only needs nodes to proceed, the nodes in can send inconsistent messages to and . Then, the nodes in and may agree on different values, so agreement breaks. If the protocol needs more than nodes to proceed, the nodes in can simply crash. Then, the protocol halts forever and termination breaks.
Thus, once the malicious users are more than , either agreement or termination breaks.
To achieve partition-resilience, both RBA and HBA tolerate corruptions, which is optimal according to the argument above.
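The counting behind this kind of argument can be checked numerically. The sketch below assumes a generic quorum-based protocol that proceeds after hearing from q out of n nodes, of which f are Byzantine; the symbols n, f, and q and the function names are chosen here for illustration. Termination requires q ≤ n − f, because Byzantine nodes may stay silent, and agreement requires any two quorums to overlap in more than f nodes, that is, 2q − n > f.

```python
from typing import List


def feasible_quorums(n: int, f: int) -> List[int]:
    """Quorum sizes that keep both termination and agreement.

    Termination: q <= n - f, since the f Byzantine nodes may never respond.
    Agreement:   2q - n > f, so two quorums share at least one honest node.
    """
    return [q for q in range(1, n + 1) if q <= n - f and 2 * q - n > f]


def tolerates(n: int, f: int) -> bool:
    """True if some quorum size satisfies both constraints."""
    return bool(feasible_quorums(n, f))


for n, f in [(3, 1), (4, 1), (6, 2), (7, 2)]:
    print(n, f, feasible_quorums(n, f))
```

Under these two constraints a feasible quorum size exists exactly when n > 3f, which matches the one-third bound discussed above.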
I-D Roadmap
In Section II, we formalize our network and adversary models. In Section III, we define strongly fair validity and weakly fair validity, and we also give two relevant impossibility results. Then, the protocol and the security analysis of RBA and HBA are given in Section IV and Section V, respectively. We analyze the communication complexity of RBA and HBA in Section D. We implemented our protocols in Go and deployed them on Google Cloud Platform services. The experimental results are presented in Section VI. We also compare the performance of HBA and three other BA protocols under different network conditions by simulation in Section VII. Finally, we conclude with the main contributions in Section VIII. The proofs for Section IV and Section V are given in Appendix B and Appendix C, respectively.
II Preliminaries
II-A System Model
In this paper, we consider the authenticated setting (i.e., digital signatures and a public key infrastructure (PKI) exist). We further assume that when the users register their public keys on the PKI, they cannot choose their keys depending on other users’ keys. In practice, this can be done by a commit-and-reveal scheme: the users first register the hash values of their public keys on the PKI, and after all the users have registered, they reveal the public keys.
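The commit-and-reveal registration can be sketched as follows. This is a minimal illustration, not the registration procedure of any particular PKI: the `Registry` class, its method names, and the choice of SHA-256 as the hash function are assumptions, and public keys are treated as opaque byte strings.

```python
import hashlib
from typing import Dict


def commit(public_key: bytes) -> str:
    """Commitment to a public key: its SHA-256 digest (hash choice is an assumption)."""
    return hashlib.sha256(public_key).hexdigest()


class Registry:
    """Toy two-phase registry: commit first, reveal after commitments close."""

    def __init__(self) -> None:
        self.commitments: Dict[str, str] = {}
        self.public_keys: Dict[str, bytes] = {}
        self.commit_phase_open = True

    def register_commitment(self, user: str, digest: str) -> None:
        if not self.commit_phase_open:
            raise RuntimeError("commit phase is closed")
        if user in self.commitments:
            raise ValueError("user already registered a commitment")
        self.commitments[user] = digest

    def close_commit_phase(self) -> None:
        self.commit_phase_open = False

    def reveal(self, user: str, public_key: bytes) -> bool:
        """Accept the key only if it matches the earlier commitment."""
        if self.commit_phase_open:
            raise RuntimeError("cannot reveal before the commit phase closes")
        if self.commitments.get(user) != commit(public_key):
            return False
        self.public_keys[user] = public_key
        return True
```

Because every commitment is fixed before any key is revealed, a user cannot pick a key after seeing the others, which is the property the assumption above requires.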
We say the adversary is static if the adversary has to choose which nodes are corrupted before the protocol starts. On the contrary, we say the adversary is adaptive if the adversary can choose which nodes are corrupted during the protocol. The corrupted nodes are called Byzantine and the nodes that are not corrupted are called honest. A Byzantine node can deviate from the protocol arbitrarily; it can engage in problematic malfunctions such as sending conflicting messages, violating algorithm criteria, delaying the messages between other nodes, and so on. We also assume the adversary has full control of the network. The adversary can learn all the messages delivered on the network and determine the delay and the order of the delivered messages.
We say a network is synchronous if there exists a known time bound . We say the network is partitioned if the messages between the honest nodes are delayed such that the delivery time exceeds . A network has recovered from a partition if all nodes have received all the previous messages that should have been delivered and the message delay is lower than . We say a network is asynchronous if no such time bound exists.
II-B Terminology
A function is negligible if for all polynomials , there exists an integer such that for all integers , it holds that .
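In symbols, the standard definition of a negligible function can be written as below. The names $\mathsf{negl}$, $p$, $N$, and $\lambda$ are notational assumptions made for this restatement.

```latex
\[
  \forall \text{ positive polynomials } p \;\;
  \exists N \in \mathbb{N} \;\;
  \forall \lambda > N : \quad
  \mathsf{negl}(\lambda) < \frac{1}{p(\lambda)}
\]
```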
Given a security parameter , a protocol and a protocol , we say is indistinguishable from , if for all polynomial-time distinguishers , there is a negligible function such that
II-C Verifiable Random Function
III Fairness
III-A Definition
Let be the security parameter and be the number of total nodes joining Byzantine agreements, where each node starts with the initial value . Let be the set of honest nodes at the end of the Byzantine agreement.
Definition 4 (strongly fair validity).
A Byzantine agreement achieves strongly fair validity for a set of adversaries if for all adversaries in , there exists a negligible function such that for all , it holds that
(1) 
In practice, if we apply the Byzantine agreement to blockchains, the Byzantine agreement may be executed many times, once for each block. Hence, we propose a weaker version of fairness, called weakly fair validity. Intuitively, it lower-bounds the expected number of times that honest nodes’ values are decided if the protocol is executed many times.
Definition 5 (weakly fair validity).
Suppose we execute a Byzantine agreement times. Let be a binary random variable such that if ’s initial value is decided by some honest node in th time; otherwise, . Let be the set of honest nodes when the Byzantine agreement has to be executed times. Then, we say the Byzantine agreement achieves weakly fair validity for a set of adversaries if, for all adversaries in , there exists a negligible function such that for all , it holds that
(2) 
where .
Obviously, a BA with strongly fair validity must be a BA with weakly fair validity.
III-B Impossibility of fair Byzantine agreements
In this section, we prove two impossibility results for fair Byzantine agreements.
Theorem 6.
In an asynchronous network, no Byzantine agreement tolerating some Byzantine nodes can achieve agreement, termination and weakly fair validity at the same time.
Proof.
Let be the number of total nodes joining Byzantine agreements. We divide nodes into two sets: nodes in the first set and one node in the second set . Due to the asynchronous network, the delay between two sets can be arbitrarily long while the messages delivered in the same set arrive immediately. In this case, the nodes in cannot distinguish whether the node in is Byzantine or the network is partitioned. If the nodes in wait for the messages from , the termination fails because the node in may be Byzantine. If the nodes in do not wait for the messages from , the weakly fair validity fails because the initial value of the node in will not be considered in the protocol. ∎
Theorem 6 rules out the possibility of constructing an asynchronous Byzantine agreement that achieves fairness. That is why we construct RBA, a synchronous protocol, in Section IV to achieve fairness.
Since the latency of typical synchronous Byzantine agreements is a multiple of the worst-case bound on network latency, the latency of fair Byzantine agreements is also bounded by the worst-case network latency. Can we construct a responsive Byzantine agreement that achieves fairness? The next theorem shows that this is impossible for strongly fair validity.
Theorem 7.
Responsive synchronous Byzantine agreements cannot achieve strongly fair validity.
Proof.
We prove this theorem by contradiction. Assume a responsive synchronous Byzantine agreement achieves strongly fair validity. Let be an honest node in the responsive synchronous Byzantine agreement, and let the latency of messages sent from to other nodes always be in , where is an arbitrary positive number less than . If the decision time is in , the message of the honest node has zero probability of being decided by the responsive Byzantine agreement, which violates the definition of strongly fair validity. Otherwise, the decision time is larger than , which violates the definition of responsiveness. ∎
Thus, in Section V we construct a responsive Byzantine agreement, HBA, as an example showing that a responsive Byzantine agreement can achieve weakly fair validity.
IV Robust Byzantine Agreement
In this section, we introduce our first Byzantine agreement protocol, which achieves partition-resilience and strongly fair validity while tolerating up to corruptions. We call the protocol robust Byzantine agreement (RBA).
IV-A Protocol
Let be the set of all nodes and be the size of . Let and denote the set of values that can be decided. We also define two special values and SKIP that are not in . For each node , has four internal variables: records the index of the iteration at which is working, records the candidate value that supports, records the index of the iteration from which comes and is ’s local clock. Let and denote the secret key and public key of , respectively.
Let be a verifiable random function (VRF). We write
to denote the output of on the message with the secret key , where is the pseudorandom value and is the proof for . We define status to be the predetermined public information, for example, the public key of all nodes in BA or the height of blocks in blockchain.
Message types
We define three kinds of messages:

the initial message of the node : , where

the precommit message of the value from the node at the iteration :

the commit message of the value from the node at the iteration :
We assume all messages are protected by digital signatures, so message authentication holds.
Leader election
With these notations, we introduce the leader election algorithm which will be a subroutine of RBA. Let denote the set of initial messages that the node receives from all the nodes (including itself). The node verifies the VRF value in and sets to be the set of nodes whose VRF values are valid. Then, computes
We say is the leader of .
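The leader election subroutine can be sketched as follows, assuming, as stated in Section V, that the leader is the node with the lowest valid VRF value. The sketch is illustrative only: `InitialMessage`, `toy_credential`, and `elect_leader` are hypothetical names, proof verification is delegated to a caller-supplied function, ties are broken by sender name as an added assumption, and the HMAC-based credential is a stand-in that is not a real VRF because it is not publicly verifiable.

```python
import hashlib
import hmac
from typing import Callable, Iterable, NamedTuple, Optional


class InitialMessage(NamedTuple):
    sender: str
    value: str
    credential: bytes  # pseudorandom output of the VRF
    proof: bytes       # VRF proof, treated as opaque in this sketch


def toy_credential(secret_key: bytes, status: bytes) -> bytes:
    """Stand-in for VRF evaluation on the public status (NOT a real VRF)."""
    return hmac.new(secret_key, status, hashlib.sha256).digest()


def elect_leader(messages: Iterable[InitialMessage],
                 verify: Callable[[InitialMessage], bool]
                 ) -> Optional[InitialMessage]:
    """Return the valid initial message with the lowest credential, if any."""
    valid = [m for m in messages if verify(m)]
    if not valid:
        return None
    # Lowest credential wins; sender name breaks ties (an assumption).
    return min(valid, key=lambda m: (m.credential, m.sender))
```

A node would then precommit on the `value` field of the returned message. In a real deployment `verify` would check the VRF proof against the sender's registered public key.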
Updating internal variables
Suppose a node is working at iteration . The node updates its internal variables as soon as one of the following conditions holds:

(lock condition) If node has seen precommit messages of the same value at the same iteration such that , sets and .

(forward condition) If node has seen precommit messages of the same value at the same iteration such that , sets and starts the iteration from Step 2.

(forward condition) If the node has seen commit messages of any value at the same iteration such that , sets and starts the iteration from Step 2.
We say that node achieves the lock condition if condition 1 holds. We say that node achieves the forward condition if condition 2 or condition 3 holds. Node goes into the next iteration immediately when it achieves the forward condition, even if it has not yet reached Step 4.
Protocol description
RBA (Algorithm 1) is an iteration-based protocol. Initially, for all honest nodes , initializes its internal variables by , , and and also chooses its initial value .
Our protocol has four steps in each iteration. At Step 1, all the nodes broadcast their own initial value in the format .
When , enters Step 2. If , verifies the initial messages it receives and computes the set of nodes whose VRF values are valid. If , identifies its leader and precommits ’s value; otherwise, precommits . If , node precommits . We say node precommits on a value if node broadcasts the message where is the iteration index that node is working at. Note that node updates its and immediately if the lock condition holds.
When , node enters Step 3. Node commits on its current . We say node commits on a value if node broadcasts the message where is the iteration index that is working at. After node broadcasts the commit message, enters Step 4, at which waits for the forward conditions.
Termination condition
Node decides on a value as soon as node has seen commit messages of the same value at the same iteration .
The protocol for a node is summarized as Algorithm 1.
IV-B Agreement
We first show that our protocol reaches agreement; that is, two honest nodes never decide on different values.
Lemma 8.
Assume . Suppose a node receives commit messages of and another node receives commit messages of . If all these commit messages come from the same iteration , then .
Theorem 9 (Agreement).
Assume and the adversary is adaptive. Regardless of partition, if an honest node decides on some value and another honest node decides on some value , then . That is, the honest nodes will never decide on different values.
IV-C Termination
We now analyze when the algorithm terminates if no partition exists or if the system recovers from a previous partition, under different adversary models.
Proposition 10 (Termination without partition under an adaptive adversary).
Assume and the adversary is adaptive. If all the honest nodes start at the th iteration within time and no partition exists, all the honest nodes will decide on some values in iterations.
Note that the honest nodes will broadcast commit messages when . If all the honest nodes start RBA simultaneously, they will all receive the commit messages when . However, we allow any two honest nodes to start the protocol with at most time difference. Thus, the node that starts the protocol earliest will receive the commit messages when its local time . That is, the best termination time of RBA is .
As shown in Proposition 10, all the honest nodes terminate in iterations with certainty. According to the forward conditions, nodes start the th iteration from for all . Hence, the first iteration costs to complete, but each of the following iterations costs only . Thus, the iterations cost in the worst case.
Proposition 11 (Expected termination under a static adversary).
Assume and the adversary is static. Suppose all honest nodes start at th iteration within time and no partition exists. Then, it is expected that all honest nodes will decide on some values in rounds.
Proposition 12 (Fast recovery from partition under an adaptive adversary).
Assume and the adversary is adaptive. If the partition is resolved, all the honest nodes will decide on some values in iteration. If the adversary is static, it is to be expected that all honest nodes will decide on some values in rounds.
IV-D Strongly Fair Validity
In this section, we prove that RBA achieves strongly fair validity. Intuitively, when the network operates synchronously, every honest node receives all the initial messages from the other honest nodes before . Then, as long as the underlying VRF is secure, the probability of being the leader is approximately uniform, so RBA achieves strongly fair validity. The result is formalized as the following theorem.
Theorem 13 (strongly fair validity).
Suppose the network is synchronous and is a secure VRF. Then, RBA achieves strongly fair validity under the assumption of a static adversary.
V Hybrid Byzantine Agreement
In the previous section, the leader in RBA is selected by the lowest VRF value. This limits the performance of RBA, since nodes have to wait for the worst-case network latency to ensure that all the messages from honest nodes are received. In this section, we improve the efficiency by combining RBA with the predetermined-leader method. Intuitively, our protocol consists of two phases. At the beginning of the protocol, all the nodes can uniquely identify a particular node, called the pioneer, in a deterministic way. In the first phase, the pioneer broadcasts its value, and the other nodes precommit on the pioneer’s value as soon as possible. If the pioneer does not propose a value before the timeout, the other nodes start RBA as the second phase.
Due to the hybrid structure, we call the improved BA in this section hybrid Byzantine agreement (HBA). Note that, since there is only one pioneer, each node can decide as soon as enough votes are received instead of waiting for the worst-case network latency. Thus, this technique achieves responsiveness.
Va Protocol
We now formally introduce the HBA() protocol, where is the parameter used by the pioneer election described below.
Message types
In addition to the initial message, the precommit message, and the commit message, HBA needs a fourth type of message:

the fast message of the pioneer node :
Pioneer election
Let be the nodes obtained by sorting all the users according to their public keys. The pioneer is node , where is the parameter of the HBA protocol.
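The pioneer election can be sketched in Go as follows. This is a minimal illustration, not the authors' implementation: the function name `electPioneer`, the lexicographic byte ordering of keys, and the choice of index `k mod n` are assumptions made here for concreteness.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// electPioneer sorts the public keys lexicographically and returns the key
// at position k mod n. Both the byte-wise ordering and the "k mod n"
// indexing are assumptions of this sketch. The key list must be non-empty.
func electPioneer(pubKeys [][]byte, k uint64) []byte {
	sorted := make([][]byte, len(pubKeys))
	copy(sorted, pubKeys) // leave the caller's slice order untouched
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i], sorted[j]) < 0
	})
	return sorted[k%uint64(len(sorted))]
}

func main() {
	keys := [][]byte{[]byte("pk-c"), []byte("pk-a"), []byte("pk-b")}
	for k := uint64(0); k < 4; k++ {
		fmt.Printf("k=%d pioneer=%s\n", k, electPioneer(keys, k))
	}
}
```

Because every node sorts the same key set, every node computes the same pioneer without exchanging any message.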
Leader election
The leader election of HBA is the same as RBA.
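As a rough illustration of leader selection by lowest VRF value, the following Go sketch picks the proposal with the smallest VRF output among those a node has seen. The `proposal` type and `pickLeader` are hypothetical names, and the sketch assumes that the VRF proofs have already been verified.

```go
package main

import (
	"bytes"
	"fmt"
)

// proposal is a hypothetical record of one initial message in a node's
// local view; VRFOut is assumed to have been verified already.
type proposal struct {
	NodeID string
	VRFOut []byte
	Value  string
}

// pickLeader returns the proposal with the lowest VRF output, comparing
// outputs as byte strings. It reports false when no proposal was seen.
func pickLeader(seen []proposal) (proposal, bool) {
	if len(seen) == 0 {
		return proposal{}, false
	}
	best := seen[0]
	for _, p := range seen[1:] {
		if bytes.Compare(p.VRFOut, best.VRFOut) < 0 {
			best = p
		}
	}
	return best, true
}

func main() {
	seen := []proposal{
		{"a", []byte{0x09}, "va"},
		{"b", []byte{0x02}, "vb"},
		{"c", []byte{0x07}, "vc"},
	}
	leader, _ := pickLeader(seen)
	fmt.Println("leader:", leader.NodeID, "value:", leader.Value)
}
```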
Updating internal variables
The conditions for updating internal variables in HBA are almost the same as in RBA, except that the forward conditions start the next iteration from Step 4 and . Suppose a node is working at iteration . The node updates its internal variables as soon as one of the following conditions holds:

(lock condition) If node has seen precommit messages of the same value at the same iteration such that , sets and .

(forward condition) If node has seen precommit messages of the same value at the same iteration such that , sets and starts the iteration from Step 4.

(forward condition) If the node has seen commit messages of any value at the same iteration such that , sets and starts the iteration from Step 4.
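The conditions above all rest on counting messages of the same value at the same iteration. The following Go sketch shows one way to do that counting. The quorum size of 2f+1, the rule that a lock is only replaced by one from an equal or later iteration, and all identifiers are assumptions of this sketch, since the exact thresholds are not reproduced in this section.

```go
package main

import "fmt"

// key identifies a (iteration, value) pair being voted on.
type key struct {
	Iter  int
	Value string
}

// tally counts distinct senders per (iteration, value).
type tally struct {
	quorum int
	seen   map[key]map[string]bool
}

// newTally assumes a quorum of 2f+1 (an assumption of this sketch).
func newTally(f int) *tally {
	return &tally{quorum: 2*f + 1, seen: make(map[key]map[string]bool)}
}

// Add records one message and reports whether (iter, value) has reached
// the quorum. Duplicate messages from one sender are counted once.
func (t *tally) Add(iter int, value, sender string) bool {
	k := key{iter, value}
	if t.seen[k] == nil {
		t.seen[k] = make(map[string]bool)
	}
	t.seen[k][sender] = true
	return len(t.seen[k]) >= t.quorum
}

func main() {
	precommits := newTally(1) // f = 1, so the assumed quorum is 3
	lockValue, lockIter := "", 0
	for _, sender := range []string{"n1", "n2", "n2", "n3"} {
		// Lock condition (sketch): a quorum of precommits on one value.
		if precommits.Add(1, "v", sender) && 1 >= lockIter {
			lockValue, lockIter = "v", 1
		}
	}
	fmt.Println("locked on", lockValue, "at iteration", lockIter)
}
```

The same tally, applied to commit messages, would serve the termination condition.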
Protocol description
Before Step 1, the pioneer can be uniquely determined by all the nodes according to the pioneer election.
At Step 1, the pioneer broadcasts its own initial value in the format . For every non-pioneer node , starts HBA from Step 2. At Step 2, the pioneer broadcasts the precommit message of its value. For every node , when receives the pioneer’s fast message, broadcasts the precommit message immediately if .
For every node , if receives precommit messages of the same value and , updates the variables and (according to the lock condition) and broadcasts the commit message immediately. Note that the honest nodes broadcast the precommit and commit messages as soon as the conditions are satisfied. This is the core idea behind responsiveness.
Steps 3 to 6 run the RBA protocol, except that the internal variables and may have been changed in Step 2.
Termination condition
The termination condition in HBA is the same as RBA: Node decides on a value as soon as node has seen commit messages of the same value at the same iteration .
The protocol for a node is summarized as Algorithm 2.
VB Agreement
Theorem 14 (Agreement of HBA).
Assume and the adversary is adaptive. Regardless of partition, if an honest node decides on some value and another honest node decides on some value , then . That is, the honest nodes will never decide on different values.
VC Termination
Proposition 15 (Termination without partition in static adversary).
Assume and the adversary is static. If all the honest nodes start HBA within time and no partition exists, all the honest nodes will decide on some values in , and iterations in the best case, the average case and the worst case, respectively.
Proposition 16 (Termination without partition in adaptive adversary).
Assume and the adversary is adaptive. If all the honest nodes start HBA within time and no partition exists, all the honest nodes will decide on some values in iterations.
Proposition 17 (Fast recovery from a partition in adaptive adversary).
Assume and the adversary is adaptive. If the partition is resolved, all the honest nodes will decide on some values in iterations. If the adversary is static, it is to be expected that all honest nodes will decide on some values in rounds.
VD Responsiveness
The following proposition directly implies that HBA is responsive.
Proposition 18.
Assume the actual network delay is , and all the nodes start HBA within time difference. If there is no partition and the pioneer is honest, all the honest nodes will decide on some values in .
VE Weakly Fair Validity
Theorem 19 (weakly fair validity).
Suppose the network is synchronous. Then, HBA achieves weakly fair validity under the assumption of static adversary.
Vi Experiment
We implemented our protocol in Go and deployed it on Google Cloud Platform (GCP). We ran RBA and HBA on 21 GCP instances (4 vCPUs and 8 GB RAM each) uniformly distributed across its 10 regions spanning 3 continents.
We set second based on our latency measurements on GCP. However, this bound would be optimistic for a general network. For RBA , the experiment was repeated 703 times, and the histogram is shown in Figure 1. The average latency of RBA is 3.20 seconds and the standard deviation is 87.60 ms. The results show that the expected latency of RBA is 6.4 rounds, which is close to the best-case round complexity.
For HBA , the experiment was repeated 715 times, and the histogram is shown in Figure 2. This result confirms the responsiveness of HBA . The average latency of HBA is 241.79 ms and the standard deviation is 83.06 ms.
Vii Simulation
To demonstrate the performance, responsiveness, and partition-resilience of HBA, we implement three other Byzantine agreement protocols and compare the simulation results. The three protocols are PBFT [7], the synchronous BA proposed by Abraham et al. (ADD+19) [10] (the version against a static adversary), and Algorand agreement [5].
Let be the number of nodes, be the maximum number of faulty nodes, and be the predefined maximum network delay for the protocols. We implement a network module to which every node is connected. The actual network delay is parametrized by , where the delay is sampled from a Gaussian distribution with mean and standard deviation .
The number of messages sent and the latency of a Byzantine agreement run are recorded from the time the first message is sent until the last node decides its value. Note that there are no faulty nodes in this experiment. We run the experiment on a MacBook Pro with a 2.6 GHz 6-core Intel Core i7, but the latency is measured by a simulation clock instead of a wall clock or CPU time, so the results should be reproducible on any machine. The means and standard deviations of each result are calculated from 100 simulation runs.
We conduct two experiments to show the behaviors of different protocols under different network conditions.
Responsiveness
In the first experiment, all the network delays are sampled from . We execute the four protocols under different (400ms, 1000ms, 2000ms) and the result is shown in Figure 3.
From Figure 3, we can see that the confirmation time of responsive BAs such as HBA and PBFT depends only on the actual network latency. Thus, the confirmation time does not change when varies. On the other hand, the confirmation time of synchronous BAs without responsiveness, such as ADD+19 and Algorand agreement, increases as increases. The ratio between the confirmation time and is the total number of rounds. ADD+19 and Algorand agreement take around 6.2 rounds and 2.2 rounds, respectively.
Partition-Resilience
In the second experiment, the network operates in two modes: the normal mode and the partition mode. In the normal mode, all the nodes are connected with the delay sampled from . In the partition mode, the network is divided into three distinct sets of size or . Within each set, the delay is sampled from . For messages between two sets, the delays are sampled from . All the protocols are executed with . Thus, when the network is in the partition mode, the delay between different sets exceeds . The protocols are executed in the partition mode for 60 seconds. Then, the network returns to the normal mode. The result is shown in Figure 4.
Notice that the partition is “benign” in this model. Apart from the delays being sampled from , no adversary reschedules or delays the messages to break the protocols maliciously. The benign partition captures the case where an Internet cable breaks and the alternative route becomes saturated.
In this experiment, agreement holds for all the protocols. From Figure 4, we can see that all four protocols terminate successfully. In particular, HBA and PBFT terminate before the network is recovered (at 60 seconds).
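A small Go sketch of the partition mode follows. It splits the nodes into three sets of nearly equal size and decides whether a message crosses sets, so that the larger delay distribution applies. The helper names and the index-based assignment of nodes to sets are assumptions of this sketch.

```go
package main

import "fmt"

// group assigns node i (0-based, 0 <= i < n) to one of three sets whose
// sizes differ by at most one. The index-based split is an assumption.
func group(i, n int) int {
	return i * 3 / n
}

// crossesPartition reports whether a message from node a to node b should
// use the inter-set delay distribution: only in the partition mode, and
// only when the two nodes are in different sets.
func crossesPartition(a, b, n int, partitioned bool) bool {
	return partitioned && group(a, n) != group(b, n)
}

func main() {
	n := 10
	sizes := [3]int{}
	for i := 0; i < n; i++ {
		sizes[group(i, n)]++
	}
	fmt.Println("set sizes:", sizes)
	fmt.Println("0 -> 9 crosses:", crossesPartition(0, 9, n, true))
}
```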
Concretely speaking, when running HBA, the honest node updates when it receives precommit messages at the current iteration. In other words, as long as the lock condition is triggered before the forward condition, the honest node will update and broadcast the commit message of in the next iteration. Then, honest nodes terminate when they receive commit messages.
To prevent honest nodes from terminating by delaying messages, the adversary needs to trigger the forward condition before the lock condition. However, such a situation rarely occurs in practice if the network is not manipulated maliciously.
As for PBFT, the timeout scales up when a view change happens, so once the timeout exceeds the delay, the protocol terminates. ADD+19 is designed for the synchronous network, and its paper does not claim partition-resilience, but the protocol terminates after the partition is resolved. (Footnote 6: When the messages are delayed maliciously, the agreement of ADD+19 may be broken. However, such a network condition is beyond their assumption.) As claimed for Algorand, the protocol terminates immediately after the network is recovered.
Bandwidth Usage
Finally, we remark briefly on bandwidth usage. The number of messages is closely related to bandwidth usage. From Figure 4, the numbers of messages are similar for HBA and PBFT for different numbers of participating nodes. The numbers of messages of ADD+19 and Algorand are more than 69% and 330% larger than that of HBA in every setting, respectively.
Viii Conclusion
In this paper, we identify what makes a BA suitable for blockchains and give concrete constructions that achieve the desired properties. We discuss three desired properties from the aspects of incentive model, security, and performance. The first property is fair validity, and we prove two impossibility results: no BA can achieve weakly fair validity in the asynchronous network, and no responsive BA can achieve strongly fair validity. The second property is partition-resilience, because the real-world Internet is sometimes unstable or attacked by adversaries. The third property is responsiveness, because the latency is usually limited by the time bounds of the synchronous BAs.
We also give two constructions, RBA and HBA, to demonstrate these properties. The first protocol, RBA, achieves strongly fair validity and partition-resilience. Based on RBA, the second protocol, HBA, achieves weakly fair validity, partition-resilience, and responsiveness. Moreover, compared with PBFT, HBA enjoys better resistance to DDoS and better latency under network partition. With these properties, HBA strikes a balance between fairness, security, and performance.
References
 [1] L. Lamport, R. Shostak, and M. Pease, “The byzantine generals problem,” ACM Transactions on Programming Languages and Systems, vol. 4, pp. 382–401, jul 1982.
 [2] M. J. Fischer, “The consensus problem in unreliable distributed systems (a brief survey),” in Foundations of Computation Theory, pp. 127–140, Springer Berlin Heidelberg, 1983.
 [3] J. Chen and S. Micali, “Algorand,” CoRR, vol. abs/1607.01341v9, 2016.
 [4] R. Pass and E. Shi, “Hybrid consensus: Efficient consensus in the permissionless model,” in 31st International Symposium on Distributed Computing, DISC 2017, October 16-20, 2017, Vienna, Austria (A. W. Richa, ed.), vol. 91 of LIPIcs, pp. 39:1–39:16, Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2017.
 [5] J. Chen, S. Gorbunov, S. Micali, and G. Vlachos, “Algorand agreement: Super fast and partition resilient byzantine agreement,” IACR Cryptology ePrint Archive, vol. 2018, p. 377, 2018.
 [6] I. Abraham, D. Malkhi, and A. Spiegelman, “Validated asynchronous byzantine agreement with optimal resilience and asymptotically optimal time and word communication,” 2018.
 [7] M. Castro and B. Liskov, “Practical byzantine fault tolerance,” in Proceedings of the Third Symposium on Operating Systems Design and Implementation, OSDI ’99, (Berkeley, CA, USA), pp. 173–186, USENIX Association, 1999.
 [8] C. Dwork, N. A. Lynch, and L. J. Stockmeyer, “Consensus in the presence of partial synchrony (preliminary version),” in Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing, Vancouver, B. C., Canada, August 27-29, 1984, pp. 103–118, 1984.
 [9] S. Micali, M. Rabin, and S. Vadhan, “Verifiable random functions,” in Proceedings of the 40th Annual Symposium on the Foundations of Computer Science, (New York, NY), pp. 120–130, IEEE, October 1999.
 [10] I. Abraham, S. Devadas, D. Dolev, K. Nayak, and L. Ren, “Synchronous byzantine agreement with expected O(1) rounds, expected O(n) communication, and optimal resilience,” in 23rd International Conference on Financial Cryptography and Data Security, FC’19, pp. 429–445, 2019.
Appendix A Definition of VRF
The definition is paraphrased from [9].
Definition 20 (verifiable random function).
Let be a 3-tuple of polynomial-time algorithms, where

KeyGen takes as input a security parameter and outputs a pair of keys .

Prove takes as input a seed and a secret key ; it outputs a value and a proof .

Veri takes as input ; it verifies whether by using the proof and key .
Let and be any functions such that and are computable in time . We say is a verifiable random function with input length and output length if the following properties hold:

Correctness. If , then

Uniqueness. For every such that , the following holds for either or :

Pseudorandomness. (Sketched) Any probabilistic polynomial time adversary cannot distinguish the output of a VRF from a uniform random variable.
Intuitively, pseudorandomness requires that the output of a VRF should be indistinguishable from a string sampled from a uniform distribution.
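To make the shape of the three algorithms concrete, the following Go sketch mimics the KeyGen/Prove/Veri interface with standard-library primitives: the proof is an Ed25519 signature on the seed and the value is the SHA-256 hash of that signature. This is an illustration of the interface only and is not a secure VRF. In particular, Ed25519 signatures are not guaranteed to be unique, so the uniqueness property of Definition 20 does not follow. The construction is an assumption of this sketch and is not taken from [9].

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// KeyGen outputs a public/secret key pair.
func KeyGen() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(rand.Reader)
}

// Prove takes a secret key and a seed and outputs a value and a proof.
// Illustrative only: proof = signature on the seed, value = hash(proof).
func Prove(sk ed25519.PrivateKey, seed []byte) (value [32]byte, proof []byte) {
	proof = ed25519.Sign(sk, seed)
	return sha256.Sum256(proof), proof
}

// Veri checks that the proof is a valid signature on the seed under pk
// and that the value is the hash of the proof.
func Veri(pk ed25519.PublicKey, seed []byte, value [32]byte, proof []byte) bool {
	return ed25519.Verify(pk, seed, proof) && sha256.Sum256(proof) == value
}

func main() {
	pk, sk, err := KeyGen()
	if err != nil {
		panic(err)
	}
	seed := []byte("iteration-1")
	value, proof := Prove(sk, seed)
	fmt.Println("verified:", Veri(pk, seed, value, proof))
}
```

A production system would use a dedicated VRF construction instead of this stand-in.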
Appendix B Proofs for RBA
Ba Agreement
Lemma.
Assume . Suppose a node receives commit messages of and another node receives commit messages of . If all these commit messages come from the same iteration , then .
Proof of lemma 8.
We prove this lemma by contradiction. Suppose . Because at most Byzantine nodes exist, by the pigeonhole principle there exists at least one honest node that commits on both and . However, an honest node can commit on only one value in one iteration, which leads to a contradiction. ∎
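The pigeonhole step can be checked numerically. The Go sketch below computes a lower bound on the number of honest nodes shared by two quorums. The concrete parameters n = 3f+1 and quorum size 2f+1 are assumptions of this sketch, since the thresholds are not reproduced in this section.

```go
package main

import "fmt"

// minHonestOverlap returns a lower bound on the number of honest nodes
// that appear in both of two quorums of size q drawn from n nodes, of
// which at most f are Byzantine. Two quorums share at least 2q-n nodes,
// and at most f of those can be Byzantine.
func minHonestOverlap(n, f, q int) int {
	return 2*q - n - f
}

func main() {
	for f := 1; f <= 3; f++ {
		n, q := 3*f+1, 2*f+1 // assumed parameters
		fmt.Printf("f=%d n=%d q=%d honest overlap >= %d\n",
			f, n, q, minHonestOverlap(n, f, q))
	}
}
```

Under these assumed parameters the bound is always at least one, which is the honest node the proof relies on.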
Theorem (Agreement).
Assume and the adversary is adaptive. Regardless of partition, if an honest node decides on some value and another honest node decides on some value , then . That is, the honest nodes will never decide on different values.
Proof of theorem 9.
Because decides on and decides on , and must see commit messages of and commit messages of , respectively. Suppose both these commit messages come from the same iteration . By Lemma 8, we have .
Suppose the commit messages that receives come from the iteration and the commit messages that receives come from the iteration . Without loss of generality, we assume . Because there are up to Byzantine nodes, at least honest nodes must have committed on so that can receive commit messages of . For all iterations , these honest nodes will always precommit on until they see precommit messages of . However, only nodes remain, so these honest nodes will never precommit on any for all . Thus, for all , if some value has precommit messages, then .
Because receives commit messages of , there must exist at least honest nodes that commit on at the iteration . These honest nodes commit on only if they have seen precommit messages of at iteration . Therefore, . ∎
BB Termination
Proposition (Termination without partition in adaptive adversary).
Assume and the adversary is adaptive. If all the honest nodes start at the th iteration within time and no partition exists, all the honest nodes will decide on some values in iterations.
Proof of proposition 10.
In this proof, we divide all the possibilities into three cases. First, we suppose that some honest node has decided on some value. Second, we suppose that no honest node has decided, but some honest node has seen precommit messages of the same value. The third case covers all remaining possibilities.
Case 1: Some honest node has decided. If an honest node has decided on the value , must have seen commit messages of . Because propagates these commit messages, all the honest nodes will hold this information after time and decide on in one iteration.
Case 2: Some honest node has seen precommit messages on the same value. Suppose no node has decided but there exists an honest node that has seen precommit messages of a value . Because propagates these precommit messages, all the honest nodes will hold this information after time . With these precommit messages, all the honest nodes update their internal variables according to the condition . Consequently, all the honest nodes will precommit on at the next iteration and thus commit on as well. At the end of the next iteration, they will all decide on .
Case 3: All other possibilities. Because no honest node has ever seen precommit messages, for all honest nodes . Thus, they identify their leader by their local view. Because all honest nodes start at the th iteration within time , they can receive all the initial values from other honest nodes before identifying the leaders. Thus, honest nodes precommit on different values only if a Byzantine node proposes different initial values to different nodes. (Footnote 7: Not proposing any initial value is considered equivalent to proposing .) However, the honest nodes propagate the initial values, so all honest nodes have the same set of initial values after time . Thus, to prevent the honest nodes from agreeing on the same leader, Byzantine nodes must propose different initial values to different nodes at every iteration. However, a node can propose an initial value only once, or it will be caught. Thus, the best strategy for the Byzantine nodes is to have different Byzantine nodes propose their initial values at different iterations, so Byzantine nodes can interfere only during iterations. Therefore, all the honest nodes will decide on some values in iterations with certainty. ∎
Proposition (Expected termination in static adversary).
Assume and the adversary is static. Suppose all honest nodes start at th iteration within time and no partition exists. Then, it is expected that all honest nodes will decide on some values in rounds.
Proof of proposition 11.
From the proof of Proposition 10, we know that if some honest node has decided on value or has seen precommit messages of a value , then all the honest nodes will decide on in one iteration.
In a network without partition, the best strategy for the Byzantine nodes has been described in Case 3 in the proof of Proposition 10. However, to interfere with iterations successfully, the Byzantine nodes must win the leadership in the following iterations. The probability of such an event is
Thus, in expectation, the number of rounds can be computed by
Because , the expected number of rounds is 8. ∎
Proposition (Fast recovery from partition in adaptive adversary).
Assume and the adversary is adaptive. If the partition is resolved, all the honest nodes will decide on some values in iteration. If the adversary is static, it is to be expected that all honest nodes will decide on some values in rounds.
Proof of proposition 12.
If there exists a node that has decided on a value , must have seen commit messages of . All the honest nodes will receive these commit messages of within time after the partition is resolved and decide on .
Suppose no node has decided and is the node working on the latest iteration . To enter the iteration , must achieve the forward condition at iteration . Because the partition is resolved, all honest nodes will also achieve the forward condition within time after the partition is resolved and also enter the iteration . Later on, if some node achieves the forward condition and enters the iteration , other honest nodes will also achieve the forward condition within time . Thus, all honest nodes start at the iteration with time difference and Proposition 10 guarantees that they will decide on some values within the following iterations.
Because each iteration costs , similarly, if the adversary is static, it is to be expected that all honest nodes will decide on some values in rounds according to Proposition 11. ∎
BC Strongly Fair Validity
We first show that, in the ideal world, the probability is lower-bounded exactly by that of the uniform distribution. Then, we show that RBA works the same as the ideal world except with negligible probability.
We define the VRF oracle, consisting of two algorithms: and . is defined as:

Take as input a seed and a secret key .

Return .
is defined as:

Take as input a public key , a seed , a value and a proof .

Return .
We also define the ideal functionality of VRF, consisting of two algorithms: and . is defined as:

Takes as input a seed and a secret key .

Check whether is defined. If not, choose and uniformly at random. Then, set . If is defined, return .
is defined as:

Takes as input a public key , a seed , a value and a proof .

Check whether is defined. If not, return false; otherwise, return true.
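The ideal functionality behaves like a lazily filled random table. A minimal Go sketch follows, under two simplifying assumptions made here: a single string identifier stands in for a node's key pair (so Prove and Veri index the same entry), and Veri also compares the queried value and proof with the stored ones. All type and method names are hypothetical.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// query indexes the table by node identifier and seed.
type query struct{ ID, Seed string }

// entry holds a sampled value and its accompanying proof.
type entry struct{ Value, Proof [32]byte }

// idealVRF is a lazily sampled random table.
type idealVRF struct{ table map[query]entry }

func newIdealVRF() *idealVRF {
	return &idealVRF{table: make(map[query]entry)}
}

// Prove samples a value and a proof uniformly at random on the first
// query for (id, seed) and replays the stored entry afterwards.
func (o *idealVRF) Prove(id, seed string) (entry, error) {
	q := query{id, seed}
	if e, ok := o.table[q]; ok {
		return e, nil
	}
	var e entry
	if _, err := rand.Read(e.Value[:]); err != nil {
		return entry{}, err
	}
	if _, err := rand.Read(e.Proof[:]); err != nil {
		return entry{}, err
	}
	o.table[q] = e
	return e, nil
}

// Veri returns true only if the table entry is defined and matches.
func (o *idealVRF) Veri(id, seed string, e entry) bool {
	stored, ok := o.table[query{id, seed}]
	return ok && stored == e
}

func main() {
	o := newIdealVRF()
	e, err := o.Prove("node-1", "seed")
	if err != nil {
		panic(err)
	}
	fmt.Println(o.Veri("node-1", "seed", e), o.Veri("node-2", "seed", e))
}
```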
In the ideal world, the nodes do not compute and verify VRF values locally. Instead, they query the oracles and . All other operations are the same as in RBA.
Lemma 21 (fairness in the ideal world).
Suppose the network is synchronous. Then, in the ideal world, for all adversaries and for all , conditioned on all the honest nodes having decided on some values, it holds that
(3) 
Proof of lemma 21.
Because the VRF values are chosen uniformly at random for all nodes in the ideal world, the probability that the node obtains the minimum value among (that is, wins the leadership) is exactly .
Once the node wins the leadership, all the honest nodes broadcast the precommit messages on at and broadcast the commit messages on at because the network is synchronous. In this case, ’s value is decided by all honest nodes. ∎
Theorem (strongly fair validity).
Suppose the network is synchronous and is a secure VRF. Then, RBA achieves strongly fair validity under the assumption of static adversary.
Proof of theorem 13.
We prove it by a hybrid argument. Let be the protocol that is the same as the ideal world except that the node queries and instead of and , respectively. Then, for all , let be the protocol that is the same as except that the node queries and instead of and , respectively.
Because is a secure VRF, the behavior of is indistinguishable from . Thus, the ideal world is indistinguishable from . Similarly, for all , is indistinguishable from . Because is bounded by , the ideal world is indistinguishable from .
Then, the honest nodes in RBA always compute the VRF correctly. So, it makes no difference to the honest nodes whether the VRF is computed locally or queried through . Therefore, is indistinguishable from RBA.
Combining the arguments above, we have that the ideal world is indistinguishable from RBA. That is, there exists a negligible function such that RBA works the same as the ideal world except with negligible probability . Let be the event that ’s value is decided by some honest node, conditioned on RBA working the same as the ideal world for the node . Let be the event that ’s value is decided by some honest node, conditioned on RBA not working the same as the ideal world for the node . Combining the result with Lemma 21, we have that in RBA, for all ,
(4)  
(5)  
(6) 
Thus, RBA achieves strongly fair validity. ∎
Appendix C Proofs for HBA
Ca Agreement
Theorem (Agreement of HBA).
Assume and the adversary is adaptive. Regardless of partition, if an honest node decides on some value and another honest node decides on some value , then . That is, the honest nodes will never decide on different values.
Proof of theorem 14.
The proof is almost the same as the proof of Theorem 9. For completeness, we state the formal proof here. We say that a commit message with the timestamp (sent in Step 2) comes from the iteration . Hence, for each iteration, an honest node can only commit on one value.
Because decides on and decides on , and must see commit messages of and commit messages of , respectively. Suppose both these commit messages come from the same iteration . According to the proof of Lemma 8, we have .
Suppose the commit messages that receives come from the iteration and the commit messages that receives come from the iteration . Without loss of generality, we assume . Because there are up to Byzantine nodes, at least honest nodes must have committed on so that can receive commit messages of . For all iterations , these honest nodes will always precommit on until they see precommit messages of . However, only nodes remain, so these honest nodes will never precommit on any for all . Thus, for all , if some value has precommit messages, then .
Because receives commit messages of , there must exist at least honest nodes that commit on at the iteration . These honest nodes commit on only if they have seen precommit messages of at iteration . Therefore, .
∎
CB Termination
Proposition (Termination without partition in static adversary).
Assume and the adversary is static. If all the honest nodes start HBA within time and no partition exists, all the honest nodes will decide on some values in , and iterations in the best case, the average case and the worst case, respectively.
Proof of proposition 15.
We distinguish two cases.
Case 1: The pioneer is an honest node.
Because the pioneer is honest, it broadcasts the message at the beginning of HBA.
All the honest nodes receive the pioneer’s value and reply in .
Then, all the honest nodes receive precommit messages in .
Meanwhile, they broadcast the commit messages on .
Thus, all the honest nodes receive commit messages on and terminate within , which is the best case.
Case 2: The pioneer is a Byzantine node. By Theorem 14, all honest nodes decide either in Step 2 or in Steps 4–6 of some iteration. In the former case, all honest nodes reach the termination condition within . In the latter case, all honest nodes terminate in iterations in the worst case according to Proposition 10.
In expectation, they terminate in rounds according to Proposition 11. Since the probability that the leader in the fast phase is an honest node is , the expected time of termination is
∎
Proposition (Termination without partition in adaptive adversary).
Assume and the adversary is adaptive. If all the honest nodes start HBA within time and no partition exists, all the honest nodes will decide on some values in iterations.
Proof of proposition 16.
Proposition (Fast recovery from a partition in adaptive adversary).
Assume and the adversary is adaptive. If the partition is resolved, all the honest nodes will decide on some values in iterations. If the adversary is static, it is to be expected that all honest nodes will decide on some values in rounds.
Proof of proposition 17.
Suppose some honest nodes have decided in the fast mode. Then, after the partition is resolved, they would broadcast the proof, and all honest nodes will terminate and agree on the value proposed in the fast mode in . If no honest node has decided in the fast mode, then all the honest nodes proceed to the normal mode. In this case, the termination property is exactly the same as RBA and we have proved it in Proposition 12. ∎
CC Responsiveness
Proposition.
Assume the actual network delay is , and all the nodes start HBA within time difference. If there is no partition and the pioneer is honest, all the honest nodes will decide on some values in .
Proof of proposition 18.
Suppose all the honest nodes start HBA simultaneously. The honest pioneer broadcasts its value at . All the honest nodes will receive pioneer’s value and reply the precommit messages in . All the honest nodes will receive precommit messages before . Because the pioneer is honest, these messages all precommit on the same value. Thus, all the honest nodes broadcast the commit messages and will decide in .
If the pioneer broadcasts its value at for some node due to the time difference, will decide at . ∎
CD Weakly Fair Validity
Theorem (weakly fair validity).
Suppose the network is synchronous. Then, HBA achieves weakly fair validity under the assumption of static adversary.
Proof of theorem 19.
When the node is elected as the pioneer, because the network is synchronous, all the honest nodes will receive ’s fast message and broadcast the precommit messages on before (we allow honest nodes to start the protocol within time drift). Then, all the honest nodes will receive precommit messages on before , so they all set their on . In this case, they will all terminate on . Thus, as long as the network is synchronous, honest nodes will always terminate on an honest pioneer’s value.
Because the pioneer is elected by the permutation of nodes’ public keys, every node will be the pioneer once if HBA is executed times. Unless the adversary can forge a signature (which happens only with negligible probability), every honest node can propose a value that is decided by all honest nodes at least