Consensus is a fundamental problem in distributed systems. Historically, consensus protocols have been critical for ensuring the consistency of replicated data [CGR07, CWO11, BGS11], but they were typically deployed with only a few dozen replicas and tolerated only crash failures. More recently, consensus protocols have been studied in the context of cryptocurrencies to maintain a distributed public ledger. These applications introduce new demands. First, cryptocurrency networks operate with thousands or millions of participants (large $n$), meaning quadratic communication complexity is unacceptable. Second, these ledgers support billions of dollars of economic activity, so they need to cope with a much stronger potential attacker.
Recent work addresses this goal of consensus with subquadratic communication complexity while tolerating adaptive adversaries, but these works require strong additional assumptions. Nakamoto’s elegant longest chain protocol [N08] relies on idealized proof-of-work, which has led to energy-intensive mining. Algorand [GHM17] and Ouroboros Praos [DGKR18] require honest users to erase their private keys from memory before sending a message, known as the memory-erasure model, which can be difficult to ensure in practice. [CPS19] uses a primitive called batch agreement, which puts semantic requirements on agreement values, making it impractical to use in the cryptocurrency context. In light of these restrictions, we seek to answer the following question:
What communication-efficient consensus protocols secure against adaptive adversaries can we obtain without strong cryptographic assumptions, and what are the limitations to obtaining these protocols?
Addressing this in even a synchronous network is challenging because most known communication-efficient protocols use committee election; proposals and voting are done by a leader and small committee which are elected uniformly at random. Typically the size is much smaller than the tolerated number of faults, so an adaptive adversary can simply corrupt the leader and entire committee, and vote for two values: we call this key reuse. Memory-erasure is one technique to eliminate key reuse; another is vote-specific eligibility where election is dependent probabilistically on the proposed value, so the adversary cannot force a compromised leader and committee to vote for another value (with high probability). Unfortunately, in these protocols the adversary can use computational power to bias the elections: we call this vote grinding. In the case of public ledgers, honest replicas will only propose transactions sent to them by clients, which means that honest replicas do not have a disproportionate chance to become part of the committee. The adversary, on the other hand, can create and try many different arbitrary transactions, for example spending coins back to itself, to increase the chances that Byzantine replicas are elected to committees.
Our solution is to make key reuse expensive by using Verifiable Delay Functions (VDFs) [BBBF18] to make it temporally expensive to send multiple votes. Leaders and committees are elected in such a way that there is no opportunity for vote grinding. Note that VDFs are proofs of sequential computation, meaning participants do not benefit from having parallel computational resources; this is quite different from proofs-of-work.
Extending these protocols to operate in a partially synchronous network (where a network is asynchronous until an unknown, but finite global stabilization time is reached) introduces new challenges. During a long enough asynchronous period the adversary can drop messages and force repeated elections until eventually a long sequence of leaders and committees are elected where the adversary has an advantage. We call this attack fast-forwarding. At this point the adversary can propose and vote for multiple proposals, violating safety. This is the reason why, even for our linear multicast protocol in Appendix B, we require a bound on the number of rounds the asynchronous period can last. This, among other reasons, is the motivation for developing a new partially synchronous model and a protocol in the modified partially synchronous network in Section 7.
1.1 Summary of Contributions
In this section, we describe the main results we present in this paper.
Main Result 1: Limitations of Communication-Efficient Protocols using Multicast (Section 5).
Most currently known and implemented communication-efficient protocols (e.g. [ACD19, CPS19, GHM17]) have all honest replicas communicate with other replicas in the network via multicasts (or broadcasts). In other words, every message that an honest replica sends is broadcast to all other replicas in the network. First, we show that in such protocols where honest replicas only multicast messages (and do not perform point-to-point communication), it is impossible to achieve a communication-efficient protocol even under static adversaries in the synchronous model where safety is always guaranteed. We prove our result for binary Byzantine Agreement (BBA), which means it also holds for consensus.
Theorem (informal). It is impossible to formulate a communication-efficient protocol for binary Byzantine agreement that always guarantees safety (safety is guaranteed with probability 1) while tolerating even a static adversary in the synchronous network model, when honest replicas multicast messages.
Ideally we could maintain communication-efficiency with high probability (or liveness in polylogarithmic rounds with high probability) but always maintain safety with probability 1. We hope our impossibility result might motivate researchers to investigate communication-efficient protocols which do not require all honest nodes to multicast all messages. An important open question is whether using point-to-point messages can lead to communication-efficient protocols where safety is always maintained or whether this impossibility result also extends to protocols which use point-to-point communication.
Second, we extend an impossibility result given in [ACD19] to show that it is impossible to formulate a communication-efficient binary Byzantine agreement protocol that achieves agreement with high probability in the partially synchronous model (with global stabilization times) as defined in [DLS88], even with synchronous processors (only message delays are asynchronous).
Theorem (informal). It is impossible to formulate a communication-efficient protocol for binary Byzantine agreement in the partially synchronous model (even when processors are synchronous) that achieves agreement with high probability against an adaptive adversary when honest replicas multicast messages.
Thus, it seems fruitful to look for alternative models of partial synchrony modeled after the GST model provided in [DLS88] to achieve communication-efficiency. We do so in our third main result.
Main Result 2: Consensus using VDFs (Section 6).
We introduce a new randomized communication-efficient consensus protocol based on Verifiable Delay Functions (VDFs) that is safe even against (weakly) adaptive adversaries in the synchronous model. This protocol does not require proof-of-work or the memory-erasure model and can withstand the case when adversaries can arbitrarily choose the inputs of Byzantine nodes as well as the transactions and proposals of each such node.
Theorem (informal). Suppose honest replicas can compute a VDF with difficulty in time and Byzantine replicas can compute the same VDF in time. There exists a communication-efficient consensus protocol for any positive constants and that reaches consensus in rounds even in the presence of adaptive adversaries in the synchronous model with overwhelming probability (in the security parameter ) and high probability in assuming the total number of replicas is where is the number of Byzantine replicas.
Intuitively, VDFs guarantee that obtaining the output of the function given an input requires some number of sequential steps (for a chosen when the function is initialized) even when parallel processors are available. Verifying the output of such a function only requires steps. Although VDFs require sequential computation, this amount of computation is vastly less than the computation necessary to perform proofs-of-work, since the ability for adversaries to parallelize the work has been eliminated (so more hardware, up to reasonable sizes, does not imply a bigger advantage). We use VDFs instead of the memory-erasure model assumptions to protect against adaptive adversarial corruptions of important proposers and committees. The adversary must compute a VDF in order to send more messages. However, we must solve several challenges, including the case where adversaries have fast VDF solvers that take only a constant fraction of the time required by the VDF solvers held by honest replicas. A description of these challenges and their solutions is presented in Section 6.
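To make the evaluate/verify asymmetry described above concrete, the following is a minimal toy sketch of a delay function based on repeated squaring with a Wesolowski-style proof. It is not the construction used in this paper, which treats VDFs abstractly following [BBBF18]. The names `vdf_eval`, `vdf_verify`, `hash_to_prime`, and `TOY_MODULUS` are hypothetical, and the modulus is a small composite with known factors, so the sketch offers no security.

```python
import hashlib

# Toy modulus (assumption): a real VDF needs a group of unknown order,
# e.g. an RSA modulus whose factorization nobody knows.
TOY_MODULUS = 1000003 * 1000033


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small toy challenge primes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def hash_to_prime(*parts: int, bits: int = 20) -> int:
    """Derive a small challenge prime from the transcript (Fiat-Shamir style)."""
    data = b"|".join(str(p).encode() for p in parts)
    candidate = int.from_bytes(hashlib.sha256(data).digest(), "big") % (1 << bits)
    candidate |= 1  # start from an odd number
    while not is_prime(candidate):
        candidate += 2
    return candidate


def vdf_eval(x: int, t: int, modulus: int):
    """Compute y = x^(2^t) mod modulus using t sequential squarings, plus a proof."""
    y = x % modulus
    for _ in range(t):  # each squaring depends on the previous one
        y = y * y % modulus
    challenge = hash_to_prime(x, y, t)
    # Toy shortcut: the proof is x^floor(2^t / challenge); real provers
    # compute this value without materializing the full exponent.
    proof = pow(x, (1 << t) // challenge, modulus)
    return y, proof


def vdf_verify(x: int, y: int, proof: int, t: int, modulus: int) -> bool:
    """Check proof^challenge * x^(2^t mod challenge) == y with two short exponentiations."""
    challenge = hash_to_prime(x, y, t)
    remainder = pow(2, t, challenge)
    lhs = pow(proof, challenge, modulus) * pow(x, remainder, modulus) % modulus
    return lhs == y % modulus
```

Evaluation performs `t` dependent squarings, while verification needs only two modular exponentiations with small exponents, which mirrors the gap between sequential evaluation cost and cheap verification that the protocol relies on.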
Main Result 3: Communication-Efficiency under Adaptive Adversaries in the Partially Synchronous with Randomly Dropped Messages Model (Section 7).
Due to our impossibility results, it seems necessary to relax the assumptions of the partially synchronous model slightly in order to obtain meaningful communication-efficient protocols for binary Byzantine agreement. Thus, we formulate the partially synchronous with randomly dropped messages network model where during the asynchronous period, each message has probability of being dropped. Thus, the adversary no longer is able to selectively drop messages during the asynchronous period. We show that in this model, we can have a communication-efficient protocol using honest multicast that reaches agreement with high probability.
Theorem (informal). There exists a communication-efficient protocol which reaches binary Byzantine agreement in rounds after GST with high probability in the partially synchronous with randomly dropped messages network model under (weakly) adaptive adversaries.
2 Related Work
2.1 Consensus Protocols and Adaptive Adversaries
| Consensus Protocol | Network Model | Multicast Complexity | Assumptions |
|---|---|---|---|
| Algorand [GHM17] | Synchronous | | Memory-erasure, PKI |
| Herding [CPS19] | Synchronous | | Filtering transactions by age, PKI |
| Ouroboros [DGKR18] | Semi-Synchronous | | Memory-erasure, PKI |
| Nakamoto [N08] | Synchronous | | Proof of Work |
| [ACD20] | Partially Synchronous (fixed but unknown delay bound) | | BBA, PKI |
| This work | Synchronous | | VDFs, PKI |
| This work | Partially Synchronous Randomly Dropped Messages | | BBA, PKI |
Traditional consensus protocols [DS83, DLS88] require all replicas to send messages to all other replicas, resulting in communication complexity in a network with replicas. Because they have such large communication complexity, most of these protocols can be modified to account for adaptive adversaries. Some [ADD19, DS83, KK09] can even be shown to be secure for strongly adaptive adversaries that can perform after-the-fact removal. However, for our intended application to large-scale distributed systems such as decentralized cryptocurrencies, we would like protocols with lower communication complexities.
Leader election-based consensus protocols [CL99, YMR19] reduce communication complexity by electing a single leader per round who aggregates votes. These protocols do not easily tolerate an adaptive adversary. HotStuff [YMR19], using a 3-round pipelined protocol, uses signature aggregation techniques to reduce authenticator complexity (number of digital signatures or message authentication codes sent in messages) to . HotStuff also has the nice property of responsiveness; it proceeds at actual network delay instead of worst case network delay. We use HotStuff’s clever 3-round protocol in both our synchronous consensus protocol (Section 6) and in our partially synchronous clock synchronization protocols (Appendix B, Section 7). However, a straightforward application of HotStuff is not sufficient to achieve subquadratic message complexity while tolerating adaptive adversaries: an adversary could continually corrupt the leader for at least rounds, forcing a quadratic number of messages before finding an honest leader and reaching agreement. A primary contribution of our work is showing how to prevent these types of attacks.
Other recent works have been able to lower the communication complexity by using additional techniques. The breakthrough work of King and Saia [KS11] presented a binary Byzantine Agreement protocol in the adaptive adversaries setting with communication complexity with the assumption of authenticated channels. As in Algorand and Micali-Vaikuntanathan [GHM17, MV17], King and Saia [KS11] also assume that replicas can securely erase secrets from memory. Other works like the sleepy model of consensus [PS17] and Ouroboros [DGKR18] also use the memory-erasure model. As discussed in Canetti et al. [CEGL08], erasures are hard to perform in real software.
The famous Nakamoto consensus protocol [GKL15, N08, PSS17, Ren19] achieves communication complexity assuming perfect proof-of-work in the synchronous model even under adaptive adversaries. This work proposed what is known as the longest-chain strategy, which results in eventual consensus. More recent protocols [DPS19, DGKR18, KR18, KRDO17, PS17, Shi19] also follow Nakamoto’s longest-chain strategy but unlike Nakamoto consensus, they remove the proof-of-work assumptions by using a permissioned setting with a public-key infrastructure. In these protocols, a replica has some chance of being elected as leader in each round. When a replica is elected as leader, it signs the block extending the current longest chain. For such protocols to exhibit both safety and liveness, some additional constraints have to be imposed on the validity of the timestamps contained in the blockchain. However, these works do not guarantee small turnover time for adaptive adversaries if the memory-erasure model is not used regardless of whether the leader election is randomized [DPS19, KRDO17, PS17] or deterministic [KR18, Shi19]. In fact, the number of rounds to consensus could be near-linear since the adaptive adversary can continuously corrupt the small number of players who talk.
A key way to achieve communication-efficiency is electing a small (-sized) committee to run a step of the protocol [GHM17, ACD19, CPS19, DGKR18, DPS19, HMW18]. This committee is much smaller than the typical or ideal number of corruptions to tolerate, and as such, the adversary can compromise safety by corrupting the entire committee and voting for two different values at the same time. Algorand gets around this using memory-erasure; keys are ephemeral and thus not available to vote for another value [GHM17, MV17]. [ACD19] tolerates an adaptive adversary for binary Byzantine Agreement by leveraging the innovative idea of vote-specific eligibility: by tying voting eligibility to the proposal, the adversary cannot simply compromise the leader and elected committee after they send a message and force them to vote for two values at the same time. This is because most likely, the proposers and/or committees for the two values will be different (or have very small overlap); thus compromising one committee for one proposal does not ensure committee membership for a different proposal. Though this works for binary Byzantine Agreement, it does not extend to consensus for general values because it introduces what we call vote grinding: the adversary can try many different input values to influence committee selection and create a biased committee, as noted in [CPS19]. In an updated version of their work, they provide a BBA protocol that works in a partially synchronous network, however, they use a different model where is fixed but unknown, while our lower bound is in the model where only holds after a Global Stabilization Time (GST) [ACD20].
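The vote-grinding concern described above can be illustrated with a small simulation. This is a hedged sketch rather than any protocol from the cited works: eligibility is modeled as a public hash of a replica identifier and the proposed value (a real protocol would use a keyed primitive such as a VRF), and the names `eligible`, `byzantine_members`, and `grind` are hypothetical. The adversary evaluates eligibility only for replicas it controls and keeps the candidate value that places the most of them on the committee.

```python
import hashlib


def eligible(replica_id: int, value: str, probability: float) -> bool:
    """Vote-specific eligibility: depends on both the replica and the proposed value."""
    digest = hashlib.sha256(f"{replica_id}|{value}".encode()).digest()
    return int.from_bytes(digest, "big") < int(probability * 2**256)


def byzantine_members(byzantine: set, value: str, probability: float) -> int:
    """Number of adversary-controlled replicas that are eligible for this value."""
    return sum(1 for r in byzantine if eligible(r, value, probability))


def grind(byzantine: set, candidates: list, probability: float) -> str:
    """Try every candidate value and keep the one that favors the adversary most."""
    return max(candidates, key=lambda v: byzantine_members(byzantine, v, probability))


# Illustrative parameters (assumptions): 100 corrupted replicas, 5% election rate.
byzantine = set(range(100))
candidates = [f"self-spend-{i}" for i in range(200)]
best = grind(byzantine, candidates, 0.05)
```

An honest replica proposes only the transactions it was given, so it effectively gets a single draw, whereas the adversary takes the maximum over many draws. For binary agreement there are only two candidate values, which is why grinding is not a concern there.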
Chan, Pass, and Shi [CPS19] nicely build on ideas from both of these works and achieve communication-efficient consensus with an adaptive adversary using vote-specific eligibility and the novel idea of batch agreement: transactions, batched together in a block proposal, are scored according to when the replica first saw the transaction; older transactions score higher than new. The adversary cannot try many different values to influence the committee because honest participants will only vote for the highest-scoring block. Unfortunately, it is unclear how this might work in practice; many blockchains sort transactions by fees instead of first-seen in order to rate limit and deter spam [N08, W14]. Straightforwardly sorting by transaction fee instead of age in [CPS19] would mean that an attacker could continuously create many self-spending high-fee transactions and send them to different honest replicas, making the honest replicas disagree on the highest scoring block. Unlike what occurs with old transactions (at some point, everyone agrees on the set of oldest transactions), adversaries can keep on generating different higher-fee transactions, leading to indefinite disagreements. Table 1 summarizes the differences between our work and these other communication-efficient consensus protocols that tolerate adaptive adversaries.
Other works [CCGZ19, GKKZ11, HZ10] have looked at adversaries whose corrupting powers are delayed by a round, but for Byzantine Broadcast, which is a different problem from the one considered in this paper. They have focused on a simulation-based notion of adaptive security for Byzantine Broadcast, where the concern is that the adversary should not be able to observe what the sender wants to broadcast, and then adaptively corrupt the sender to flip the bit. They use what is called the atomic message model, where after adaptively corrupting a replica the adversary cannot erase the message already sent this round and must also wait for at least one maximum network delay before the corrupted replica can start sending corrupt messages.
2.2 Lower Bounds for Binary Byzantine Agreement Protocols
| Work | Type | Network Model | Adversary | Lower Bound | Even Assuming |
|---|---|---|---|---|---|
| [DR85] | Deterministic | any | Static or stronger | | Authenticated Channels |
| This work | any | any (safety guaranteed with probability 1) | Static or stronger | | PKI |
| This work | any | Partially Synchronous (GST) | Adaptive | | PKI |
Previously, Abraham et al. [ACD19] have shown that (possibly randomized) protocols that achieve subquadratic message complexity cannot tolerate a strongly-adaptive adversary. The proof of their lower bound is inspired by Dolev and Reischuk [DR85] who showed that any deterministic consensus protocol must incur communication complexity when assuming authenticated channels. Abraham et al. [ACD19] also show that without a PKI, no protocol with multicast complexity can achieve consensus under adaptive corruptions even in the synchronous model, when assuming the existence of a random oracle or a common reference string, and even in the memory-erasure model. Table 2 compares these lower bounds to ours. Some other works have achieved expected quadratic communication complexity under various settings that are similar to adaptive adversarial settings [AMN19, AMS19] in modified synchronous and asynchronous models.
Other Lower Bound Results
Previously, [CMS89, KY84] showed that any randomized -round protocol must fail with probability at least for some constant ; in particular, randomized agreement with sub-constant failure probability cannot be achieved in strictly constant rounds. Attiya and Censor-Hillel [AC08] extended the results of [CMS89, KY84] on guaranteed termination of randomized BA protocols to the asynchronous setting, and provided a tight lower bound. Much more recently, following a series of works looking at lower bounds on the expected number of rounds necessary to achieve Byzantine agreement of randomized protocols, Cohen et al. [CHM19] show that BA protocols resilient against adaptive corruptions terminate at the end of the first round with probability among other results.
2.3 Consensus with Verifiable Delay Functions
Verifiable Delay Functions (VDFs) were first introduced in [BBBF18], with a related precursor in [LW15]. Newer blockchain protocols use VDFs in consensus protocols (with various other assumptions) as an unbiasable source of randomness or as a source of timing to progress rounds [AMM18, Dra, CP19]. To the best of our knowledge, we are the first to use Verifiable Delay Functions not as a source of randomness (leader and committee election are independent of VDF output) but to bound the number of messages an adversary can send, specifically with the purpose of deterring adaptive corruptions.
There are participants in the network and the public keys of all participants are common knowledge. We only consider systems consisting of replicas where is the maximum number of Byzantine replicas present in the system for the duration of the protocol.
In this paper, we only consider protocols (in both our impossibility results and our protocol formulations) where the honest replicas multicast their messages. Consistent with the terminology given in [ACD19] and [CPS19], we use the term multicast to indicate when a replica sends a message to all replicas in the network. Henceforth, we talk about the communication complexity in terms of the multicast complexity (i.e. the number of multicasts) as opposed to the point-to-point communication complexity as conventionally stated in the literature. [Footnote 2: Consistent with the terminology used in [ACD19], we refer to communication complexity as the total number of messages sent in the network by honest replicas. Unlike other commonly used notions of communication complexity, we are not referring to the total number of bits sent in the network.] [Footnote 3: Note here that we explicitly count only the number of multicasts as opposed to the total number of bits sent in all messages. This is due to the fact that all messages sent by networks using a PKI require signatures of size under standard cryptographic assumptions. Furthermore, it is difficult to standardize such a measure as the number of bits of a message also depends on the size of the proposal/transaction/function/etc.] Honest replicas multicast all messages, but Byzantine nodes may send point-to-point messages to anyone in the network. This means our goal is to achieve sublinear multicast complexity, or subquadratic communication complexity. Replicas communicate with each other in a network via authenticated channels. In Section 6, we are operating in the synchronous network model; the protocol proceeds in rounds and channels may exhibit communication delay which we model as . Messages reach their intended recipient after up to delay. In Section 5 and Appendix B, we consider a partially synchronous network where communication delay is unbounded until some Global Stabilization Time (GST) after which delay is bounded by . [Footnote 4: There are also several other partially synchronous models of consensus, which we do not consider in this paper.]
We assume as in [ACD19, CPS19] that honest replicas interact with some environment (where is the security parameter) that sends them inputs at the beginning of every round, and honest replicas may send outputs to the environment at the end of every round. We assume that honest replicas attempt to reach consensus on one of the inputs they received from at the beginning of the protocol. Honest replicas follow the protocol when determining their outputs/messages.
We assume that Byzantine replicas are controlled by some adversary which reads each of their inputs, received messages, and has access to their internal states. Then, decides the Byzantine replicas’ outputs/messages. Crucially, the outputs/messages sent by Byzantine replicas could have no relation to the inputs received by these replicas. Such replicas can output/send any number of arbitrary messages independent of what they receive from .
Throughout this paper, we only consider adaptive adversaries, although one of our impossibility results holds even for static adversaries. While static adversaries can only corrupt up to replicas before the start of the protocol, adaptive adversaries are defined as adversaries which can corrupt up to replicas adaptively, at any point during the execution. When an adaptive adversary corrupts a replica that was previously honest, it gains access to the replica’s internal state (including its private key), and, henceforth, controls the corrupted replica. A corrupted replica remains Byzantine for the remainder of the execution of the protocol. does not have access to the internal states of the honest replicas. We assume that also has polynomially bounded parallel processing power and cannot guess the secret keys of honest replicas with high probability. [Footnote 5: With high probability (whp) is defined in our paper to be probability at least for all constants .] can coordinate the Byzantine replicas, and can read all messages sent through the network, but cannot erase or alter messages sent by honest replicas. [Footnote 6: In some previous literature (e.g. [ACD19]), this type of adaptive adversary is referred to as a weakly adaptive adversary.]
As in [ACD19], we define replicas which are honest at the current time to be so-far honest, and replicas which remain honest until the end of the protocol to be forever honest. We also assume that in the synchronous model, can reorder the messages received by any replica and can delay any message an arbitrary amount of time . In the partially synchronous model, we assume that can selectively choose arbitrarily large delays for messages during the asynchronous phase and can drop or reorder any number of messages during that phase. After GST, we assume follows the behaviours of a synchronous adversary.
In the adaptive adversary model, all forever-honest replicas must agree on exactly one input given to a forever-honest replica by at the beginning of the protocol, with high probability with respect to the number of nodes in the protocol and the security parameter . [Footnote 7: We generally assume that is at least polynomial in : .] More specifically, a correct protocol in our paper maintains the following two safety and liveness guarantees:
Safety: No two honest replicas commit to two different values with high probability with respect to and .
Liveness: The protocol terminates in rounds w.h.p. with respect to and .
Additional background on the network and adversarial models, as well as a more detailed explanation of the challenges facing protocol designers can be found in Appendix E.
The protocols and impossibility results discussed in this paper rely on two main cryptographic primitives: verifiable random functions (VRFs) and verifiable delay functions (VDFs). We assume standard cryptographic assumptions. We first define the cryptographic primitives we need in this paper and then define the various other notation we use throughout the paper.
4.1 Cryptographic Primitives
For all of our protocols, we assume that a trusted setup phase is first used to generate a public-key infrastructure (PKI) where each replica obtains a cryptographic sortition public key/private key pair: (such a key pair could be a verifiable random function (VRF) [MVR99] public key/private key pair).
For clarity we provide a simplified, informal definition of cryptographic sortition (which can be implemented via VRFs) here; to see the full formal definition of VRFs [MVR99], please refer to Section C.1.
Cryptographic sortition ensures the following three properties:
Replica using its secret key (and some public, common input) can determine whether they are part of the voting committee and produce some output.
All other replicas can verify (but not produce with all but negligible probability in the security parameter ) replica ’s output using .
Lastly, the output is unique and is indistinguishable from random with high probability.
As in [ACD19], we use the notation for replicas to use as an oracle for determining whether they are eligible to vote in a committee. satisfies the properties of cryptographic sortition as stated above. More specifically, is parameterized by replica ’s secret key , takes some input , , and returns some output that is generated uniformly at random via some coin flip with appropriate probability. Furthermore, can provide some verification to other replicas that use only and some additional information that is given as output from the function. We let the output value and proof be and , respectively. One possible instantiation of is via verifiable random functions. Please refer to Section C.1 for the full formal definition of VRFs.
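The eligibility oracle described above can be sketched as follows. This is only an illustration under loud assumptions: HMAC-SHA256 stands in for a VRF, so the output is pseudorandom and unique per key and input, but it is not publicly verifiable from a public key as a real VRF output would be. The function name `sortition` and the message encoding are hypothetical.

```python
import hashlib
import hmac


def sortition(secret_key: bytes, message: bytes, probability: float):
    """Return (eligible, output) for the given key and public input.

    Stand-in for a VRF (assumption): HMAC gives a deterministic pseudorandom
    output per (key, message), but checking it requires the secret key, which
    a real VRF proof avoids.
    """
    output = hmac.new(secret_key, message, hashlib.sha256).digest()
    threshold = int(probability * 2**256)
    return int.from_bytes(output, "big") < threshold, output


# Vote-specific eligibility: the proposed bit is part of the input, so being
# eligible to vote for bit 0 says nothing about eligibility for bit 1.
key = b"replica-7-secret"
eligible_for_0, _ = sortition(key, b"round=3|bit=0", 0.05)
eligible_for_1, _ = sortition(key, b"round=3|bit=1", 0.05)
```

Because the input includes the value being voted on, corrupting a replica after it votes for one bit gives the adversary an independent, low-probability draw for the other bit rather than a reusable committee seat.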
In our paper, we also make use of an additional cryptographic primitive called verifiable delay functions (VDFs) [BBBF18]. A VDF is a function that guarantees with all but negligible probability in that computing the function takes some sequential steps by some measure of difficulty of the function. number of sequential steps is required even given polynomial number of parallel processors. We present the full formal definition of VDFs in Appendix D. In this paper, we let be a VDF with difficulty . In our exposition, we assume that the evaluation and verification keys are implied and passed into the function, so we do not explicitly pass these in as parameters. takes as input some and outputs some output , , where includes both the value of the output as well as the proof.
4.2 Other Notations and Definitions
We make abundant use of the Chernoff bound in our paper.
Definition 4.1 (Chernoff Bound).
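Let $X_1, \dots, X_n$ be independent random variables that take on values in $\{0, 1\}$, where $X = \sum_{i=1}^{n} X_i$ and $\mu = \mathbb{E}[X]$. For any $0 < \delta < 1$, the multiplicative Chernoff bound gives
$$\Pr[X \geq (1+\delta)\mu] \leq e^{-\delta^2 \mu / 3} \quad \text{and} \quad \Pr[X \leq (1-\delta)\mu] \leq e^{-\delta^2 \mu / 2}.$$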
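As a numerical sanity check on how such a bound is typically applied to committee sizes, the sketch below compares exact binomial tails against the standard multiplicative Chernoff bounds $e^{-\delta^2 \mu / 3}$ (upper tail) and $e^{-\delta^2 \mu / 2}$ (lower tail) for $0 < \delta < 1$. The parameters (1000 replicas, each elected independently with probability 0.05) are illustrative assumptions, not values taken from this paper.

```python
import math


def binomial_pmf(n: int, p: float, k: int) -> float:
    """Probability that exactly k of n independent trials succeed."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(n: int, p: float, threshold: int) -> float:
    """Exact Pr[X >= threshold] for X ~ Binomial(n, p)."""
    return sum(binomial_pmf(n, p, k) for k in range(threshold, n + 1))


def lower_tail(n: int, p: float, threshold: int) -> float:
    """Exact Pr[X <= threshold] for X ~ Binomial(n, p)."""
    return sum(binomial_pmf(n, p, k) for k in range(0, threshold + 1))


def chernoff_upper(mu: float, delta: float) -> float:
    """Bound on Pr[X >= (1 + delta) * mu], valid for 0 < delta < 1."""
    return math.exp(-(delta**2) * mu / 3)


def chernoff_lower(mu: float, delta: float) -> float:
    """Bound on Pr[X <= (1 - delta) * mu], valid for 0 < delta < 1."""
    return math.exp(-(delta**2) * mu / 2)


n, p, delta = 1000, 0.05, 0.5
mu = n * p  # expected committee size
exact_high = upper_tail(n, p, math.ceil((1 + delta) * mu))
exact_low = lower_tail(n, p, math.floor((1 - delta) * mu))
bound_high = chernoff_upper(mu, delta)
bound_low = chernoff_lower(mu, delta)
```

The exact tails fall below the corresponding bounds, and both bounds shrink exponentially as the expected committee size grows, which is the behavior that committee-based arguments depend on.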
We use the phrase “with high probability” many times throughout this paper. When we say “with high probability”, we mean with high probability with respect to and with overwhelming probability with respect to ; in other words, with probability at least for all constants . Throughout the paper, we assume .
5 Impossibility Results for BBA Using Sublinear Multicasts
In this section, we present two impossibility results regarding BBA protocols with adaptive adversaries: First, we show that it is impossible to always achieve BBA in even the synchronous network using a sublinear number of multicasts (this implies it is also impossible in the partially synchronous model). Then, we show that it is impossible to achieve BBA with high probability in a partially synchronous network (in the GST model) in multicasts. Both of these results are under our definition of BBA in a network where honest replicas are only allowed to multicast messages. We consider the specific binary Byzantine agreement problem that is defined in [ACD19]. [Footnote 8: This was also referred to in later works as multi-value agreement [CPS19].] We redefine the problem here for convenience:
Definition 5.1 (Binary Byzantine Agreement Problem (BBA)).
Given a network with replicas, each replica receives an input bit . The problem asks whether all replicas can reach an agreement that satisfies the following properties with high probability. [Footnote 9: High probability is generally defined to be probability for all constants .]
Termination: Every forever-honest replica outputs a bit .
Consistency: If two forever-honest replicas output and , respectively, then .
Validity: If all forever-honest replicas receive the same input bit , then all forever-honest replicas output .
The proofs we present only apply to protocols where all honest replicas multicast messages, meaning they, by their protocols, do not selectively choose to send messages to a specific replica but instead multicast all messages to all replicas. The Byzantine replicas are not constrained in this way and can send any number of point-to-point messages. Our impossibility results apply to protocols with this assumption. We define this property as the honest total multicast property:
Definition 5.2 (Honest Total Multicast Protocols).
Protocols where honest replicas multicast all messages to all other replicas. Thus, the multicast complexity for such protocols equals the number of times honest replicas multicast messages.
Lemma 5.3. Any correct honest total multicast protocol in the synchronous model with multicast complexity has at most honest replicas which multicast before consensus is reached.
The proof of the aforementioned lemma immediately follows from the definition of honest total multicast protocols.
An honest total multicast protocol that uses sublinear multicasts with high probability and always reaches BBA in the synchronous model cannot exist, even against a static adversary.
Proof. Suppose, for the sake of contradiction, that we have a correct honest total multicast BBA protocol that achieves sublinear multicast complexity with high probability and always reaches agreement on a bit. Then, suppose that during one iteration of the protocol on a set of replicas, the protocol reaches agreement, without loss of generality, on the bit . Such an iteration must exist since the protocol must reach agreement on if, e.g., all inputs to all replicas are . Let this iteration of the protocol be . Since the protocol guarantees agreement in sublinear multicast complexity with high probability, we can also assume uses a sublinear number of multicasts (as such an iteration must exist). Thus, there exists some fraction of replicas which never multicast any messages in . Let this set of replicas be . Let the set of replicas that multicast at least one message be . We know that by Lemma 5.3. Let reach agreement in synchronous rounds.
Suppose we have another iteration of the protocol on the same set of replicas, but where agreement is reached on . Again, such an iteration must exist since all honest replicas must output if e.g. all inputs to honest replicas are . Let this iteration of the protocol be . Let the set of replicas which never multicast any messages be and the set of replicas that multicast at least one message be . As before, we know that by Lemma 5.3. Let reach agreement in synchronous rounds.
Suppose the adversary picks Byzantine replicas initially before the start of the protocol uniformly at random. Let be a simulation of the protocol on the set of replicas where all replicas in are initially corrupted by the adversary. This is possible for large enough since . Furthermore, let half of the replicas in have input and have the same internal state as the same replicas in iteration . Let this half be . Let the other half of the replicas in have input and have the same internal state as the same replicas in iteration . Let this half be . Such a simulation is a potential iteration of the protocol since before any messages are sent the internal states of all replicas are determined solely by their inputs and their private random coin flips.
The adversary in simulation then sends two sets of messages by controlling the replicas in . They send the same messages as in iteration to all replicas in and the same messages as in iteration to all replicas in . In this simulation, we assume all private coin flips for replicas in correspond with the same replicas in and all private coin flips for replicas in correspond with the same replicas in . Then, the replicas in have no way to distinguish from and will output . Similarly, the replicas in have no way to distinguish from and will output .
Thus, we reach a contradiction as honest replicas agreed on and honest replicas agreed on . Hence, there does not exist an honest total multicast protocol that always reaches BBA in the synchronous model, even against a static adversary, as there exists a potential simulation of the protocol that reaches agreement on two different bits. ∎
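The indistinguishability step of this proof can be illustrated with a toy model. The sketch below is not the protocol under study: the `decide` rule, the bit values 0 and 1, and the transcript contents are hypothetical choices, made only to show that a replica which never multicasts computes its output from its own view (input, coins, received messages). An adversary that replays two different transcripts to two groups of silent replicas therefore splits their outputs.

```python
from typing import NamedTuple, Tuple


class View(NamedTuple):
    """Everything a silent replica's output can depend on."""
    input_bit: int
    coins: int
    received: Tuple[int, ...]


def decide(view: View) -> int:
    """Hypothetical decision rule: majority of received bits, ties go to the input."""
    ones = sum(view.received)
    zeros = len(view.received) - ones
    if ones == zeros:
        return view.input_bit
    return 1 if ones > zeros else 0


# Messages multicast by the few speaking replicas in two honest executions
# (assumed values for illustration).
transcript_run0 = (0, 0, 0)  # execution that agrees on 0
transcript_run1 = (1, 1, 1)  # execution that agrees on 1

# In the simulated execution the corrupted speakers replay run 0 to one half
# of the silent replicas and run 1 to the other half.
half_a = [View(input_bit=0, coins=c, received=transcript_run0) for c in range(3)]
half_b = [View(input_bit=1, coins=c, received=transcript_run1) for c in range(3)]

outputs_a = {decide(v) for v in half_a}
outputs_b = {decide(v) for v in half_b}
```

Each half sees exactly the view it would have seen in one of the two honest executions, so the two halves output different bits and consistency fails.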
Our next impossibility result shows that there does not exist a partially synchronous BBA protocol (in the GST model) with an adaptive adversary that achieves agreement in multicasts. We need to be somewhat careful in our definition of multicast complexity in the partially synchronous model so that we obtain a definition that is meaningful. What makes the partially synchronous model with adaptive adversaries appealing is that it accurately simulates the real world: dropped messages can be simulated by an adversary which doesn’t send messages (or selectively sends messages) to different replicas. We define the multicast complexity to be the total number of multicasts necessary after the global stabilization time (GST) before Byzantine agreement is reached. In contrast to the synchronous model, the asynchronous period starts at the beginning of the protocol and continues for an unknown but bounded time. However, for the partially synchronous model, we assume the synchronous period after one GST must be long enough for the protocol to reach consensus. (In a system model where there can be multiple synchronous periods separated by asynchronous periods and thus multiple GSTs, the synchronous period after a GST only needs to last long enough for one round of the protocol to complete.)
Our proof uses the lower bound proof given in Theorem 4 of [ACD19].
There does not exist a partially synchronous BBA protocol resilient against adaptive adversaries where all honest replicas reach agreement with high probability in multicasts given Byzantine replicas and for all number of replicas in the network.
We show in Section 7 a BBA protocol that achieves agreement with high probability in a new, weaker adversarial model than the partially synchronous (GST) model.
6 Consensus with Adaptive Adversaries using Sublinear Multicasts
We use the concepts expanded upon in the previous sections to formulate a communication-efficient consensus protocol without the use of the memory-erasure model and which can be adapted to a variety of transaction ordering schemes (e.g. for use in cryptocurrency applications). Namely, we make use of several important concepts in formulating our protocol: verifiable delay functions (VDFs) [BBBF18], random leader/committee elections, and the three-step commit rule of HotStuff [YMR19]. The consensus protocol we describe in this section operates in the synchronous model and can tolerate up to adaptive Byzantine corruptions.
First, we provide a brief description and a simplified version of our protocol in Section 6.1. Then, we describe the full detailed version of our protocol in Section 6.2. In our protocol, safety and liveness hold with high probability with respect to and using number of messages or multicasts. The exact multicast complexity, round complexity, and success probability are stated in Theorem 6.1, which we prove in our analysis in Section 6.3. As we showed in our lower bound result presented in Section 5, we cannot guarantee that safety always holds given a protocol that uses sublinear multicasts even in the synchronous model and even given a static adversary. Thus, our protocol ensures the best possible guarantees under the constraints we are operating under: both safety and liveness with high probability with respect to and .
Theorem 6.1. Assuming a valid VDF construction that satisfies Definition D.1, there exists a consensus protocol that terminates in rounds and reaches consensus using multicasts with high probability with respect to and , even when assuming the adversary can perform VDF computations faster by any constant factor .
6.1 Protocol Overview
In our protocol, we divide the communication rounds into epochs, where each epoch goes through a leader election as well as several rounds of communication to confirm a leader’s proposal. A leader is elected after each honest replica queries with its secret key and epoch number as input. Recall from Section 4 that each replica has oracle access to an oracle which will produce some output and potentially a proof. The leader then computes a VDF output of the value wants to propose. (With high probability in rounds , there will be one round where there is only one leader.) After computing this VDF output, sends the VDF output, the proposal, the output of and proofs to all other replicas via a multicast.
After a proposal (with an attached VDF output and proof) is made by , some number of replicas are elected into committees to vote on the proposal. We use a total of three committees chosen uniformly at random, similar to the three-step commit rule of HotStuff [YMR19], to determine when a proposed value is committed. However, unlike HotStuff, our committees are polylogarithmic in size with respect to the number of participants in our consensus protocol. As in previous works which use player-replaceability (e.g. [GHM17]), each committee is chosen independently, likely with an entirely new set of participants.
To determine membership in a committee, each replica passes into as input the epoch number and a label for the committee it is attempting to participate in. Each committee only votes for proposals proposed in the current epoch; they will never vote for a proposal that was proposed in the previous epoch or a future epoch. After a committee member has been chosen to participate in a committee, they must compute a VDF on their intended vote; otherwise, honest replicas will not accept the vote without a corresponding VDF output. When a replica multicasts its vote, it multicasts its vote along with its output, the VDF output, and all associated proofs.
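The eligibility check described above can be sketched as follows. This is only an illustration under stated assumptions: the names `ticket` and `is_eligible`, the use of HMAC-SHA256 as a stand-in for the oracle, and the `probability` parameter are all hypothetical. In particular, HMAC yields no publicly verifiable proof, whereas the oracle in the text may return one.

```python
import hashlib
import hmac


def ticket(secret_key: bytes, epoch: int, label: str) -> bytes:
    """Stand-in for the eligibility oracle: a keyed hash of (epoch, label)."""
    message = f"{epoch}|{label}".encode()
    return hmac.new(secret_key, message, hashlib.sha256).digest()


def is_eligible(secret_key: bytes, epoch: int, label: str, probability: float) -> bool:
    """Elect the replica if the first 64 bits of its ticket fall below the threshold."""
    value = int.from_bytes(ticket(secret_key, epoch, label)[:8], "big")
    threshold = int(probability * 2**64)
    return value < threshold


# Hypothetical usage: one replica checks whether it sits on the first
# voting committee of epoch 3, with an assumed selection probability.
on_committee = is_eligible(b"replica-17-secret", 3, "vote-1", 0.05)
```

Because the ticket depends only on the secret key, the epoch number, and the committee label, a replica can evaluate its own eligibility locally, and a different label or epoch gives an unrelated ticket.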
To instantiate the VDFs we use in our protocol, we can use a number of recent VDF constructions by [Wes19, Pie19, DGMV19] (some of which do not need trusted setup). They show constructions for VDFs that, given a difficulty level , can be computed in time and verified in time given a small number of processors. But such constructions also guarantee that even given polynomially many parallel processors (for an arbitrary polynomial), computing the output must take at least parallel time for small . The formal definitions of such functions are given in the Preliminaries (Section 4).
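To make the evaluate/verify interface concrete, the sketch below uses iterated SHA-256 as a sequential delay function. This is a deliberately simplified stand-in and not one of the cited constructions: the function names are hypothetical, and verification here simply recomputes the chain, so it is as slow as evaluation. A real VDF differs precisely in offering fast verification.

```python
import hashlib


def delay_eval(message: bytes, difficulty: int) -> bytes:
    """Apply SHA-256 `difficulty` times in sequence; each step needs the previous one."""
    output = message
    for _ in range(difficulty):
        output = hashlib.sha256(output).digest()
    return output


def delay_verify(message: bytes, difficulty: int, output: bytes) -> bool:
    """Toy verification by recomputation (a real VDF would check a short proof instead)."""
    return delay_eval(message, difficulty) == output


# Hypothetical usage: bind the delay to the exact vote being cast, so a
# different vote needs a fresh sequential computation.
vote = b"epoch=3|label=vote-1|value=block-A"
proof_of_delay = delay_eval(vote, 10_000)
```

The point relevant to the protocol is the binding between the delayed computation and the specific proposal or vote: an output computed for one value is useless for another.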
Although, theoretically, most VDF constructions with the same difficulty must be computed within some additive factor of one another, our protocol can in fact handle any VDF instantiations (in practice) where the speed of computation of the VDFs differ by any constant multiplicative factor. This means that our protocol is secure (w.h.p.) even when considering adversaries which may have faster VDF computing potential up to any positive constant multiplicative factor.
We now formally describe our protocol below.
6.2 Detailed Protocol
Our detailed consensus protocol shown in Fig. 2 is run by every honest replica . maintains the private state which is the current epoch that is on. Recall that we defined an epoch to be a period of time consisting of many communication rounds in which voting for a particular proposal is done. In our protocol detailed below, each epoch consists of communication rounds; while the adversary can determine the order of messages that arrive to replicas in our protocol, they cannot delay any message by more than delay.
Note that in contrast to other works which use a VDF to compute an unpredictable source of randomness, we simply use the VDF to enforce that the creation of a proposal or vote takes some fixed amount of time. In our protocol, leaders and committees are privately predictable: a replica can predict for which values of it will be leader or on a committee. As in [ACD19, CPS19], since we are operating in a permissioned system (with replicas), this does not affect the correctness of our protocol.
Figure 1 shows a simplified visual representation of our protocol.
As before, we define the following terms: a round of a replica consists of sending and/or receiving a set of messages (in other words, one round of communication), and an epoch is defined to be one iteration of the while loop defined in the protocol given in Fig. 2. Assuming message delay, we first prove that if there is exactly one leader, which is honest, there are honest replicas in each committee, and there are Byzantine replicas in each committee, then we can reach consensus on the leader’s proposal given appropriate initial settings of the parameters.
Let , and be constants . We assume that the slowest honest replica takes time to compute a VDF of difficulty , the fastest honest replica takes time to compute the VDF, and any Byzantine replica takes time to compute the VDF. We show that our protocol accounts for the most interesting settings of the parameters: ; in the case when the adversary computes the VDF slower than honest replicas, security can be proven trivially. Let be the total time (in terms of ) that each epoch consists of and is the corresponding number of communication rounds (in the case when is not divisible by , we can increase the duration of such that it becomes divisible by ). We give the exact bounds for these variables, and , in our proofs (in terms of ). Throughout our proofs, we let be the difficulty level of .
Lemma 6.2. Let be the message delay. For epoch , suppose that there is exactly one leader, which is honest, there are honest replicas in each committee, and there are Byzantine replicas in each committee. When and , there exist values in terms of , , , and that allow for the leader’s proposal to be committed by all honest replicas with high probability with respect to and .
Proof. In the case where there is exactly one leader, who is honest, and all committees have honest replicas and Byzantine replicas, each leader and honest committee member will send out exactly one proposal/vote. However, the adversary can potentially choose to adaptively corrupt the leader and/or committee members and send out multiple proposals if the difficulty levels of our VDFs are not set appropriately. The only way that an adversary can send multiple proposals or votes is if they compute the VDFs associated with the proposals or votes. Since we assume that the adversary cannot guess the private keys of the honest replicas with all but negligible probability in , they cannot compute the VDFs of the extra proposals and votes until after they corrupt the replicas with all but negligible probability in by Definition D.1. By the assumptions given in the lemma statement, initially both the leader and majority of committee members are so-far honest. Thus, we need only concern ourselves with the cases when the honest replicas are corrupted after they announce their leadership/committee status.
In order to prevent the leader from sending multiple proposals, the leader must not have enough time to compute a new VDF output on a new proposal after computing the current VDF output on the proposal they have already multicast. Recall that by our definitions of and , the fastest that an honest replica can compute is and any Byzantine replica must take at least time to compute .
We must ensure that each time a replica computes a VDF and sends the result, the adversary does not have enough time to compute another value for the VDF before we proceed with the next epoch. Thus, the difficulty levels of the VDFs must be set accordingly. Let be the time that an epoch lasts (in terms of ) before we proceed to the next epoch. Then, for example, for the leader proposal round, the amount of time it takes for the fastest honest replica to compute the corresponding proposal VDF plus the time it takes for the adversary to take control of the honest proposer and compute another VDF must be longer than the length of the epoch. The constraint on the difficulty level must then follow: . Following this pattern, the remaining difficulty terms must follow similar constraints. Intuitively, this also means that . Finally, must be long enough so that honest replicas can compute, receive, and verify all VDF outputs so they can commit a proposal if the conditions of the lemma are followed.
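The intuition for the proposal step can be checked numerically with a small sketch. Everything named here is an assumption for illustration: the linear time model (time = speed factor × difficulty), the parameter names, and the example numbers are hypothetical, and the sketch collapses the epoch to a single step and ignores verification time. It encodes only the two requirements stated in words above: an honest computation followed by an adversarial recomputation must overrun the epoch, while the slowest honest replica must still finish and deliver in time.

```python
def step_is_safe(fast_honest_time: float, adversary_time: float, epoch_length: float) -> bool:
    """Safe if corrupting the sender and recomputing a VDF cannot finish within the epoch."""
    return fast_honest_time + adversary_time > epoch_length


def step_is_live(slow_honest_time: float, message_delay: float, epoch_length: float) -> bool:
    """Live if the slowest honest replica can compute and deliver before the epoch ends."""
    return slow_honest_time + message_delay <= epoch_length


# Assumed toy parameters: difficulty 10, epoch length 14, message delay 1.
difficulty = 10
epoch_length = 14.0
message_delay = 1.0
fast_honest = 1.0 * difficulty   # fastest honest replica
slow_honest = 1.2 * difficulty   # slowest honest replica
adversary = 0.8 * difficulty     # adversary is faster by a constant factor

safe = step_is_safe(fast_honest, adversary, epoch_length)
live = step_is_live(slow_honest, message_delay, epoch_length)
```

With these numbers both checks pass. Lowering the adversary's factor to 0.2 makes the safety check fail for the same epoch length, which is why the difficulty and epoch length have to be chosen jointly against the assumed speed factors.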
From the intuition above, the difficulty levels that are set must specifically follow the following constraints:
We solve this set of equations to obtain the following set of expressions for , , , and in terms of :
Substituting the above into Eq. 5 gives us a lower bound for from which we can also derive the other values. First we replace with for some small constant for all .
Substituting the expression for will lead to values of , , and in terms of the values of , , and .
This expression is valid iff
Eq. 15 is always true for all . Hence, we need only concern ourselves with the constraint defined by Eq. 14. Assuming that is negligible (given that we pick such that , if is not negligible, then we can increase the delay in Eq. 5 to account for the time necessary to verify the VDF computations), we can simplify to obtain:
For all values of , , we obtain a bound for where there exist values we can set such that . We have thus proven that there exist values of given and that we can set to prevent violation of safety by the corruption of so-far honest replicas.
In such cases, when the conditions given in the statement of the lemma are followed, given exactly one honest proposer and committees dominated by honest replicas, the adversary is not able to produce additional proposals or votes with all but negligible probability in . Furthermore, the adversary does not have enough time to corrupt an honest replica and compute the associated message or vote VDF before the epoch has progressed to the next epoch.
Since a single honest leader will always propose exactly one proposal, all honest replicas will vote for the same proposal, reaching the necessary number of votes. Hence, the leader’s proposal will be committed by all honest replicas. ∎
Now, we remove the constraint of by assuming that each honest replica with a faster VDF implementation than can choose to delay sending their proposal or vote until after the time that it would have taken the replicas that take time to compute and verify the VDFs. This immediately allows us to conclude that our protocol can handle any constant values of (since the constraint in Lemma 6.2 is trivially satisfied). For Corollary 6.3, we assume that all honest replicas compute the VDFs with speed .
Corollary 6.3. Let be the message delay. For epoch , suppose that there is exactly one leader, which is honest, there are honest replicas in each committee, and there are Byzantine replicas in each committee. When , there exist values in terms of , and that allow for the leader’s proposal to be committed by all honest replicas.
In the rest of this section, we prove the safety and liveness of our consensus protocol which directly leads to the proof of Theorem 6.1.
We first show that each epoch consists of a constant number of rounds.
Each epoch consists of communication rounds.
We now show that, with high probability, the conditions stated in Lemma 6.2 and Corollary 6.3 can be satisfied. To do this, we first show that with high probability, after epochs, there will exist an epoch which has exactly one so-far honest leader.
After epochs, there will be at least one epoch in which there exists exactly one leader and that leader is honest.
Proof. At the beginning of epoch , at most replicas are Byzantine when the leader is chosen. Therefore, the probability that an already-Byzantine node is chosen is . Thus, the probability that a Byzantine node is chosen to be a leader for every epoch after epochs is . Thus, with high probability, after epochs, there will exist at least one epoch where no Byzantine replicas are elected as leaders. By the Chernoff bound, the probability that more than one leader is elected in every epoch after epochs is . The probability that no leaders are elected after epochs is . By the union bound, the probability that any of the above three bad cases occurs after epochs is bounded by for all . Thus, with high probability, there exists at least one epoch in which there exists exactly one leader and that leader is honest. ∎
Lemma 6.6. Suppose that the number of Byzantine replicas, is given by for some constant provided (in Fig. 2). Then, there exists an arbitrarily small constant such that after epochs, there will be at least one epoch where all committees have honest replicas in each committee, and there are Byzantine replicas in each committee with probability for some constants , , and .
Proof. The expected number of honest replicas that will be chosen for any committee is given by since . By the Chernoff bound, the probability that fewer than honest replicas are chosen into the committee is . In order for the number of honest replicas to be , we must have . Thus, we obtain . Since , there always exist values of and such that the condition is satisfied. The probability that after epochs there exists an epoch with honest replicas in each committee is then given by .
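The shape of this argument can be checked with a direct calculation. The sketch below rests on assumptions that are not taken from the text: each replica is elected leader independently with probability `p` per epoch, the set of already-corrupted replicas has size `f` when the election happens, and epochs are independent. The function names and the example values are hypothetical.

```python
def single_honest_leader_prob(n: int, f: int, p: float) -> float:
    """P[exactly one replica is elected in an epoch and it is honest]."""
    return (n - f) * p * (1.0 - p) ** (n - 1)


def no_good_epoch_prob(n: int, f: int, p: float, epochs: int) -> float:
    """P[no epoch among `epochs` independent ones has a single honest leader]."""
    return (1.0 - single_honest_leader_prob(n, f, p)) ** epochs


# Assumed example: 1000 replicas, 300 corrupted, one leader expected per epoch.
per_epoch = single_honest_leader_prob(1000, 300, 1 / 1000)
failure_after_50 = no_good_epoch_prob(1000, 300, 1 / 1000, 50)
```

Under these assumptions a single epoch succeeds with constant probability, so the failure probability decays geometrically in the number of epochs, which matches the structure of the union-bound argument above.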
The expected number of Byzantine replicas that will be chosen for any committee is given by . By the Chernoff bound, the probability that replicas in the committee are Byzantine replicas is given by . In order for the number of Byzantine replicas to be , we must have . Solving, we obtain . Since and , there always exists a value that satisfies this inequality. The probability that after epochs there exists an epoch where Byzantine replicas are in each committee is then given by .
The probability that both conditions are satisfied is
Thus, there exist constants , , and where the probability that both conditions are satisfied is . ∎
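The two tail events bounded in this proof can also be evaluated exactly for concrete parameters. The sketch below is an illustration under assumptions not taken from the text: each replica joins a committee independently with the same probability, and the values of `n`, `f`, the expected committee size, and the vote threshold of 100 are hypothetical. It computes exact binomial tails rather than Chernoff bounds.

```python
import math


def binom_pmf(n: int, p: float, k: int) -> float:
    """P[X = k] for X ~ Binomial(n, p) with 0 < p < 1, computed in log space."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + k * math.log(p) + (n - k) * math.log1p(-p))


def binom_cdf(n: int, p: float, k: int) -> float:
    """P[X <= k] for X ~ Binomial(n, p) with 0 < p < 1."""
    return sum(binom_pmf(n, p, i) for i in range(0, min(k, n) + 1))


# Assumed example: 10000 replicas, 3000 Byzantine, expected committee size 200.
n, f = 10_000, 3_000
p = 200 / n
threshold = 100  # hypothetical number of votes needed

too_few_honest = binom_cdf(n - f, p, threshold - 1)        # honest members < threshold
too_many_byzantine = 1.0 - binom_cdf(f, p, threshold - 1)  # Byzantine members >= threshold
```

For these example values both tails are small, and they shrink further as the expected committee size grows, which is the behavior the Chernoff bounds capture asymptotically.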
Corollary 6.7. Suppose that the number of Byzantine replicas, , is given by for some constant provided (in Fig. 2). With high probability, after epochs, there will be at least one epoch where all committees have honest replicas in each committee, and there are Byzantine replicas in each committee.
Proof. By Lemma 6.6, the probability that the conditions of this corollary are satisfied given constants , , and is . Since , , and are constants, the probability that the conditions of this corollary are satisfied is for any constant . ∎
After epochs, for any constant , the probability that the result of the selection of replicas for committees gives votes for all committees of an epoch is