State machine replication (SMR) is a fundamental problem in distributed computing [17, 18, 30] that can be viewed as a generalization of Byzantine agreement (BA) [29, 19]. Roughly speaking, a BA protocol allows a set of parties to agree on a value once, whereas SMR allows those parties to agree on an infinitely long sequence of values with the additional guarantee that values input to honest parties are eventually included in the sequence. (See Section 3 for formal definitions. Note that SMR is not obtained by simply repeating a BA protocol multiple times; see further discussion in Section 1.1.) Moreover, these properties should hold even in the presence of some fraction of corrupted parties who may behave arbitrarily. SMR protocols are deployed in real-world distributed data centers, and the problem has received renewed attention in the context of blockchain protocols used for cryptocurrencies and other applications.
Existing SMR protocols assume either a synchronous network, where all messages are delivered within some publicly known time bound $\Delta$, or an asynchronous network, where messages can be delayed arbitrarily. Although it may appear that protocols designed for the latter setting are strictly more secure, this is not the case because they also (inherently) tolerate a lower fraction of corrupted parties. Specifically, assuming a public-key infrastructure is available to the parties, SMR protocols tolerating up to $t < n/2$ adversarial corruptions are possible in a synchronous network, but in an asynchronous network SMR is achievable only for $t < n/3$ faults.
We study here so-called network-agnostic SMR protocols that offer meaningful guarantees regardless of the network in which they are run. That is, fix thresholds $t_a, t_s$ with $t_a \le t_s$. We seek to answer the following question: is it possible to construct an SMR protocol that (1) tolerates $t_s$ (adaptive) corruptions if the network is synchronous, and moreover (2) tolerates $t_a$ (adaptive) corruptions even if the network is asynchronous? We show that the answer is positive iff $t_a + 2 t_s < n$.
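To make the feasibility condition concrete, the following minimal Python sketch checks whether a pair of thresholds is achievable and, for a given synchronous threshold, computes the largest compatible asynchronous threshold. It assumes the condition has the form $t_a \le t_s$ and $t_a + 2 t_s < n$, as in the network-agnostic BA setting of Blum et al.; the function names are illustrative and do not come from the paper.

```python
from typing import Optional


def thresholds_feasible(n: int, t_a: int, t_s: int) -> bool:
    """Return True iff 0 <= t_a <= t_s and t_a + 2*t_s < n (assumed condition)."""
    return 0 <= t_a <= t_s and t_a + 2 * t_s < n


def max_async_threshold(n: int, t_s: int) -> Optional[int]:
    """Largest t_a compatible with t_s, or None if no t_a >= 0 is feasible."""
    if t_s < 0:
        return None
    # t_a must satisfy both t_a <= t_s and t_a <= n - 2*t_s - 1.
    t_a = min(t_s, n - 2 * t_s - 1)
    return t_a if t_a >= 0 else None
```

For example, with n = 10 parties, choosing t_s = 4 leaves room for only t_a = 1, whereas t_s = 3 allows t_a = 3. This illustrates the tradeoff between the two thresholds.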
Our work is directly inspired by recent results of Blum et al., who study the same problem but for the simpler case of Byzantine agreement. We match their bounds on $t_a, t_s$; since SMR implies BA, even in the network-agnostic setting we consider (cf. Section 6.1), their impossibility result implies that the thresholds we obtain are optimal for our setting as well. While the high-level structure of our SMR protocol resembles that of their BA protocol, in constructing our protocol we need to address several technical challenges (mainly due to the stronger liveness property required for SMR; see the next section) that do not arise in their work. Of additional interest, we also extend their impossibility result to show that it holds in a proof-of-work setting.
1.1 Related Work
There is extensive prior work on designing both Byzantine agreement and SMR/blockchain protocols; we do not provide an exhaustive survey, but instead focus only on the most relevant prior work.
As argued by Miller et al., the well-known SMR protocols that tolerate malicious faults (e.g., [6, 15]) require at least partial synchrony in order to achieve liveness. Their HoneyBadger protocol was designed specifically for fully asynchronous networks, but can only handle $t < n/3$ faults even if run in a synchronous network. Blockchain protocols are typically analyzed assuming synchrony [11, 25]; Nakamoto consensus, in particular, assumes that messages will be delivered much faster than the time required to solve proof-of-work puzzles.
We emphasize that SMR is not realized by simply repeating a (multi-valued) BA protocol multiple times. In particular, the validity property of BA only guarantees that if a value is input by all honest parties then that value will be output by all honest parties. In the context of SMR the parties each hold multiple inputs in a local buffer (where those inputs may arrive at arbitrary times), and there is no way to ensure that all honest parties will select the same value as input to some execution of an underlying BA protocol. Although generic techniques for compiling a BA protocol into an SMR protocol are known , those compilers are not network-agnostic and so do not suffice to solve our problem.
Our work focuses on protocols being run in a network that may be either synchronous or fully asynchronous. Other work looking at similar problems includes that of Malkhi et al., who consider networks that may be either synchronous or partially synchronous; Liu et al., who design a protocol that tolerates a minority of malicious faults in a synchronous network, and a minority of fail-stop faults in an asynchronous network; and Guo et al. and Abraham et al., who consider temporary disconnections between two synchronous network components.
A slightly different line of work [26, 27, 22, 21] looks at designing protocols with good responsiveness. Roughly speaking, this means that the protocol still requires the network to be synchronous, but terminates more quickly if the actual message-delivery time is lower than the known upper bound $\Delta$. Kursawe designed a protocol for an asynchronous network that terminates more quickly if the network is synchronous, but does not tolerate more faults in the latter case.
1.2 Paper Organization
We define our model in Section 2, before giving definitions for the various tasks we consider in Section 3. In Section 4 we describe a network-agnostic protocol for the asynchronous common subset (ACS) problem. The ACS protocol is used as a sub-protocol of our main result, the network-agnostic SMR protocol, which is described and analyzed in Section 5.
In Section 6 we prove a lower bound showing that the thresholds we achieve are tight for network-agnostic SMR protocols, even when the protocol may rely on proofs of work. (This improves on the analogous result by Blum et al., who do not consider proofs of work.) Toward this result, we show that a (network-agnostic) SMR protocol can be used to construct a (network-agnostic) BA protocol with the same thresholds, a result that may be of independent interest.
2 Model
Setup assumptions and notation. We consider a network of $n$ parties $P_1, \ldots, P_n$ who communicate over point-to-point authenticated channels. We assume that the parties have established a public-key infrastructure prior to the protocol execution. That is, we assume that all parties hold the same vector $(pk_1, \ldots, pk_n)$ of public keys for a digital-signature scheme, and each honest party $P_i$ holds the honestly generated secret key $sk_i$ associated with $pk_i$. A valid signature $\sigma$ on $m$ from $P_i$ is one for which $\mathsf{Vrfy}_{pk_i}(m, \sigma) = 1$. We treat signatures as ideal (i.e., perfectly unforgeable) for simplicity.
We also implicitly assume that parties use some form of domain separation when signing (e.g., use unique session IDs) to ensure that signatures are valid only for the context in which they are generated, and cannot be used in any other context.
Where applicable, we use to denote the security parameter for a protocol.
Adversarial model. We consider the security of our protocols in the presence of an adversary who can adaptively corrupt some number of parties. The adversary may coordinate the behavior of corrupted parties, and cause them to deviate arbitrarily from the protocol. Note, however, that our claims about adaptive security are only with respect to the property-based definitions found in Section 3, not with respect to a simulation-based definition (cf. [13, 10]). Finally, we assume that the adversary is able to choose corrupted parties’ keys arbitrarily.
Network model. We consider two possible settings for the network. In a synchronous network, all messages are delivered within some known time $\Delta$ after they are sent, but the adversary can reorder and delay messages subject to this bound. (As a consequence, the adversary can potentially be rushing, i.e., it can wait to receive all incoming messages in a round before sending its own messages.) In this setting, we also assume all parties begin the protocol at the same time, and that parties' clocks progress at the same rate. When we say the network is asynchronous, we mean that the adversary can delay messages for an arbitrarily long period of time, though messages must eventually be delivered. We do not make any assumptions on parties' local clocks in the asynchronous case.
The network is either synchronous or asynchronous for the duration of the protocol (although we stress that the honest parties do not know which is the case).
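The difference between the two network settings can be sketched as a single delivery rule. In the Python fragment below, the adversary proposes a delay for each message; in the synchronous setting that delay is capped at the known bound, while in the asynchronous setting any finite delay is allowed. The function and parameter names are hypothetical and serve only to illustrate the model.

```python
def delivery_time(send_time: float, adversarial_delay: float,
                  delta: float, synchronous: bool) -> float:
    """Time at which a message sent at send_time is delivered.

    adversarial_delay is the (finite, non-negative) delay the adversary
    would like to impose; delta is the known synchronous bound.
    """
    if adversarial_delay < 0:
        raise ValueError("delay must be non-negative")
    if synchronous:
        # The adversary may delay and reorder, but only within delta.
        return send_time + min(adversarial_delay, delta)
    # Asynchronous: arbitrary delay, but delivery eventually happens.
    return send_time + adversarial_delay
```

Honest parties cannot observe the `synchronous` flag, which is why a network-agnostic protocol must behave safely under both branches.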
3 Definitions
Although we are ultimately interested in state machine replication, our main protocol relies on various sub-protocols for different tasks. We therefore provide relevant definitions here.
Throughout, when we say that a protocol achieves some property, we mean that it achieves that property with overwhelming probability.
3.1 Useful Sub-Protocols
Throughout this section we consider protocols where, in some cases, parties may not terminate (even upon generating output); for this reason, we mention termination explicitly in the definitions. Honest parties are those who are not corrupted by the end of the execution.
Reliable broadcast. A reliable broadcast protocol allows parties to agree on a value chosen by a designated sender. In contrast to the stronger notion of broadcast, here honest parties might not terminate (but, if so, then none of them terminate).
Definition 1 (Reliable broadcast)
Let $\Pi$ be a protocol executed by parties $P_1, \ldots, P_n$, where a designated party $P^*$ begins holding input $v^*$ and parties terminate upon generating output.
Validity: $\Pi$ is $t$-valid if the following holds whenever at most $t$ parties are corrupted: if $P^*$ is honest, then every honest party outputs $v^*$.
Consistency: $\Pi$ is $t$-consistent if the following holds whenever at most $t$ parties are corrupted: either no honest party terminates, or else all honest parties output the same value $v$.
If $\Pi$ is $t$-valid and $t$-consistent, then we say it is $t$-secure.
Byzantine agreement. A Byzantine agreement protocol allows parties who each hold some initial value to agree on an output value. We define a notion of Byzantine agreement that is weaker than usual, in that we do not require parties to terminate upon generating output.
Definition 2 (Byzantine agreement)
Let $\Pi$ be a protocol executed by parties $P_1, \ldots, P_n$, where each party $P_i$ begins holding input $v_i$.
Validity: $\Pi$ is $t$-valid if the following holds whenever at most $t$ of the parties are corrupted: if every honest party's input is equal to the same value $v$, then every honest party outputs $v$.
Consistency: $\Pi$ is $t$-consistent if the following holds whenever at most $t$ of the parties are corrupted: every honest party outputs the same value $v$.
If $\Pi$ is $t$-valid and $t$-consistent, then we say it is $t$-secure.
As an additional property (external to the definition of security), we say that an $n$-party BA protocol is $t$-terminating if it is guaranteed to terminate whenever at most $t$ parties are corrupted.
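The validity and consistency conditions for Byzantine agreement can be phrased as predicates over the honest parties' inputs and outputs in a finished execution. The following Python sketch is only an illustration of the definition; the function names are hypothetical, and the checks assume that every honest party has already produced an output.

```python
def ba_valid(honest_inputs, honest_outputs):
    """Validity: if all honest inputs equal v, every honest output must be v."""
    if len(set(honest_inputs)) != 1:
        return True  # validity imposes no constraint when honest inputs differ
    v = honest_inputs[0]
    return all(out == v for out in honest_outputs)


def ba_consistent(honest_outputs):
    """Consistency: all honest parties output the same value."""
    return len(set(honest_outputs)) <= 1
```

Note that validity is vacuous whenever honest inputs disagree, which is one reason why repeating BA does not by itself yield SMR.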
Asynchronous common subset (ACS). Informally, a protocol for the asynchronous common subset (ACS) problem allows parties, each with some input, to agree on a subset of those inputs. (The term “asynchronous” in the name is historical, and one can also consider protocols for this task in the synchronous setting.)
Definition 3 (ACS)
Let be a protocol executed by parties , where each begins holding input , and parties output sets of size at most .
Validity: is -valid if the following holds whenever at most parties are corrupted: if every honest party’s input is equal to the same value , then every honest party outputs .
Liveness: is -live if whenever at most of the parties are corrupted, every honest party produces output.
Consistency: is -consistent if whenever at most parties are corrupted, all honest parties output the same set .
Set quality: has -set quality if the following holds whenever at most parties are corrupted: if an honest party outputs a set , then contains the inputs of at least honest parties.
3.2 State Machine Replication
Protocols for state machine replication (SMR) allow parties to maintain agreement on an ever-growing, ordered sequence of blocks, where a block is a set of values called transactions. An SMR protocol does not terminate but instead continues indefinitely. We model the sequence of blocks output by a party $P_i$ via a write-once array $\mathsf{Blocks}_i$ maintained by $P_i$, each entry (or slot) of which is initially equal to $\bot$. We say that $P_i$ outputs a block in slot $j$ when $P_i$ writes a block to $\mathsf{Blocks}_i[j]$; if $\mathsf{Blocks}_i[j] \neq \bot$ then we refer to $\mathsf{Blocks}_i[j]$ as the block output by $P_i$ in slot $j$. We do not require that honest parties output a block in slot $j-1$ before outputting a block in slot $j$.
It is useful to define a notion of epochs for each party. (We stress that these are not global epochs; instead, each party maintains a local view of its current epoch.) Formally, we assume that each party $P_i$ maintains a write-once array $\mathsf{Epochs}_i$, each entry of which is initialized to 0. We say $P_i$ enters epoch $j$ when it sets $\mathsf{Epochs}_i[j] := 1$, and require:
For $j < j'$, $P_i$ enters epoch $j$ before entering epoch $j'$.
$P_i$ enters epoch $j$ before outputting a block in slot $j$.
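The write-once arrays and the two ordering requirements on epochs can be sketched as a small local data structure. The Python class below is a hypothetical illustration, not part of the protocol: it assumes slots and epochs are numbered from 1 and that epochs are entered consecutively, and it represents an unwritten slot by a missing dictionary key.

```python
class PartyLog:
    """Local view of one party: write-once block slots plus a local epoch counter."""

    def __init__(self):
        self.blocks = {}        # slot -> block; a missing key means "not yet output"
        self.current_epoch = 0  # highest epoch entered so far

    def enter_epoch(self, j):
        # Epochs are entered in increasing order (assumed consecutive here).
        if j != self.current_epoch + 1:
            raise ValueError("epochs must be entered in order")
        self.current_epoch = j

    def output_block(self, j, block):
        if j < 1 or j > self.current_epoch:
            raise ValueError("must enter epoch j before outputting in slot j")
        if j in self.blocks:
            raise ValueError("slot is write-once")
        self.blocks[j] = frozenset(block)
```

Slots may be filled out of order: after entering epochs 1 and 2, a party may output a block in slot 2 before slot 1, matching the remark that no ordering is imposed on slot outputs.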
An SMR protocol is run in a setting where parties asynchronously receive inputs (i.e., transactions) as the protocol is being executed; each party stores transactions it receives in a local buffer . We imagine these transactions as being provided to parties by some mechanism external to the protocol (which could involve a gossip protocol run among the parties themselves), and make no assumptions about the arrival times of these transactions at any of the parties.
Definition 4 (State machine replication)
Let $\Pi$ be a protocol executed by parties $P_1, \ldots, P_n$ who are provided with transactions as input and locally maintain arrays $\mathsf{Blocks}$ and $\mathsf{Epochs}$ as described above.
Consistency: $\Pi$ is $t$-consistent if the following holds whenever at most $t$ parties are corrupted: for all $j$, if an honest party outputs a block $B$ in slot $j$ then all parties that remain honest output $B$ in slot $j$.
Strong liveness: $\Pi$ is $t$-live if the following holds whenever at most $t$ parties are corrupted: for any transaction $\mathsf{tx}$ for which every honest party received $\mathsf{tx}$ before entering epoch $j$, every party that remains honest outputs a block that contains $\mathsf{tx}$ in some slot $j' \le j$.
Completeness: $\Pi$ is $t$-complete if the following holds whenever at most $t$ parties are corrupted: for all $j$, every party that remains honest outputs some block in slot $j$.
If $\Pi$ is $t$-consistent, $t$-live, and $t$-complete, then we say it is $t$-secure.
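As an illustration of the SMR properties, the following Python sketch checks a finite snapshot of honest parties' logs. It is hypothetical helper code: each log is a dictionary from slot numbers to blocks (sets of transactions), and strong liveness is read as requiring the transaction to appear in some slot no later than the given epoch. A snapshot can only reveal a conflict between logs; it cannot confirm the "eventually outputs" part of the definitions.

```python
def slots_consistent(logs):
    """True iff no two logs hold different blocks in the same slot."""
    seen = {}
    for log in logs:
        for slot, block in log.items():
            # setdefault records the first block seen for this slot.
            if seen.setdefault(slot, block) != block:
                return False
    return True


def strongly_live(tx, epoch, logs):
    """True iff every log contains tx in some slot numbered at most epoch."""
    return all(
        any(slot <= epoch and tx in block for slot, block in log.items())
        for log in logs
    )
```

For instance, two logs that agree on slot 1 but where only one has filled slot 2 are consistent in this snapshot sense, while two logs with different blocks in slot 2 are not.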
Our liveness definition is stronger than usual, in that we require a transaction $\mathsf{tx}$ that appears in all honest parties' buffers by epoch $j$ to be included in a block output by each honest party in some slot $j' \le j$. (Typically, liveness only requires that each honest party eventually outputs a block containing $\mathsf{tx}$.) This stronger notion of liveness is useful for showing that SMR implies Byzantine agreement (see Section 6.1), and is achieved by our protocol.
In our definition, a transaction $\mathsf{tx}$ is only guaranteed to be contained in a block output by an honest party if all honest parties receive $\mathsf{tx}$ as input. A stronger definition would be to require this to hold even if only a single honest party receives $\mathsf{tx}$ as input. It is easy to achieve the latter from the former, however, by simply having honest parties gossip all transactions they receive to the rest of the network.
4 An ACS Protocol with Higher Validity Threshold
Throughout this section, we assume an asynchronous network.
Fix $t_a \le t_s$ with $t_a + 2 t_s < n$. We show here an ACS protocol that is $t_a$-secure, and achieves validity even for $t_s$ corruptions. Our construction follows the high-level approach taken by Miller et al., who devise an ACS protocol based on sub-protocols for reliable broadcast and Byzantine agreement. In our case we need a reliable broadcast protocol that achieves validity for $t_s$ faults, and in Section 4.1 we show such a protocol. We then describe and analyze our ACS protocol in Section 4.2.
4.1 Reliable Broadcast with Higher Validity
In Figure 1, we present a variant of Bracha's (asynchronous) reliable broadcast protocol that allows for a more general tradeoff between consistency and validity. Specifically, the protocol is parameterized by a threshold $t_s$; for any $t_a \le t_s$ with $t_a + 2 t_s < n$, the protocol achieves $t_a$-consistency and $t_s$-validity.
If then is -valid.
Assume there are at most corrupted parties, and the sender is honest. All honest parties receive the same value from the sender, and consequently send to all other parties. Since there are at least honest parties, all honest parties receive from at least different parties, and as a result send to all other parties. By the same argument, all honest parties receive from at least parties, and so can output (and terminate).
To complete the proof, we also argue that honest parties cannot output . Note first that no honest party will send for any . Thus, any honest party will receive for some from at most other parties. Since , no honest party will ever send for any . By the same argument, this shows that honest parties will receive for some from at most other parties, and hence cannot output . ∎
Let be such that and . Then is -consistent.
Suppose at most parties are corrupted, and that an honest party outputs . Then must have received messages from at least distinct parties, at least of whom are honest. Thus, all honest parties receive messages from at least distinct parties, and so all honest parties send messages to everyone. It follows that all honest parties receive messages from at least parties, and so can output as well.
To complete the proof, we argue that honest parties cannot output . We argued above that all honest parties send to everyone. Let be the first honest party to do so. Since , that party must have sent in response to receiving messages from at least distinct parties. If some honest outputs then, arguing similarly, some honest party must have received messages from at least distinct parties. But this is a contradiction, since honest parties send only a single message but . ∎
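The message flow analyzed in the two proofs above can be sketched as a small in-process simulation. Since Figure 1 is not reproduced here, the thresholds in this Python sketch are assumptions modeled on the structure of Bracha's protocol: a party sends a ready message after $n - t$ matching echoes or $t + 1$ matching readies, and outputs after $n - t$ matching readies. Corrupted parties are modeled only as silent (they send nothing), the sender is assumed honest, and messages are delivered in FIFO order; equivocation and adversarial scheduling are not covered.

```python
from collections import defaultdict, deque


class Party:
    """One honest party in a Bracha-style broadcast with threshold t (assumed thresholds)."""

    def __init__(self, n, t, sender):
        self.n, self.t, self.sender = n, t, sender
        self.echoes = defaultdict(set)   # value -> parties that echoed it
        self.readies = defaultdict(set)  # value -> parties that sent ready for it
        self.sent_echo = self.sent_ready = False
        self.output = None

    def handle(self, kind, src, value):
        """Process one message and return the messages to multicast in response."""
        out = []
        if kind == "propose":
            if src == self.sender and not self.sent_echo:
                self.sent_echo = True
                out.append(("echo", value))
        elif kind == "echo":
            self.echoes[value].add(src)
        elif kind == "ready":
            self.readies[value].add(src)
        enough_echoes = len(self.echoes[value]) >= self.n - self.t
        enough_readies = len(self.readies[value]) >= self.t + 1
        if not self.sent_ready and (enough_echoes or enough_readies):
            self.sent_ready = True  # each honest party sends at most one ready
            out.append(("ready", value))
        if self.output is None and len(self.readies[value]) >= self.n - self.t:
            self.output = value
        return out


def run_broadcast(n, t, sender, value, silent=frozenset()):
    """Run with an honest sender; parties in `silent` are corrupted and send nothing."""
    parties = {i: Party(n, t, sender) for i in range(n) if i not in silent}
    queue = deque((i, "propose", sender, value) for i in parties)
    while queue:
        dest, kind, src, val = queue.popleft()
        for k, v in parties[dest].handle(kind, src, val):
            queue.extend((j, k, dest, v) for j in parties)
    return {i: p.output for i, p in parties.items()}
```

With n = 7 and t = 2, two silent parties leave five honest parties, which is exactly enough for every honest party to collect n - t echoes and readies and output the sender's value. With three silent parties the echo threshold is never reached and no party outputs, which illustrates why validity is only claimed up to the threshold.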
4.2 An ACS Protocol
In Figure 2 we describe an ACS protocol that is parameterized by thresholds , where and . Our protocol relies on two sub-protocols: a reliable broadcast protocol that is -valid and -consistent (such as the protocol from the previous section), and a Byzantine agreement protocol that is -secure. (Since , any asynchronous BA protocol secure for that threshold can be used.) Our ACS protocol runs several executions of these protocols as sub-routines, and so to distinguish between them we denote the th execution by , resp., . We say that the executions correspond to party .
As we will see in the analysis below, our ACS protocol is only $t_a$-live, not terminating (and in fact runs forever). Given that the SMR protocol itself runs indefinitely, it is reasonable to settle for an ACS protocol that runs forever but has bounded communication complexity (we prove that our ACS protocol has bounded communication complexity in Lemma 8 below). Likewise, the state for the ACS protocol may not be bounded, but since the state for the SMR protocol is also unbounded, we consider this acceptable.
Say and at most parties are corrupted. Then if an honest uses input in an execution of , all honest parties receive output from .
The lemma follows from -validity of . ∎
Before proceeding with the analysis, we note that because is -secure even when the network is asynchronous, it remains -consistent and -valid once honest parties cease to participate after seeing become true. In the following lemmata, we will rely on this observation implicitly.
If , then is -valid.
Note that . Say at most parties are dishonest, and all honest parties have the same input . Using Lemma 3, we see that at least executions of (namely, those for which is honest) will result in as output, and so all honest parties can take Exit 1 and output . It is not possible for an honest party to take Exit 1 and output something other than , since . Thus, it only remains to show that if an honest party takes some other exit then it must also output . Consider the two possibilities:
Exit 2: Suppose some honest party takes Exit 2 and outputs . Then, for that party, is true, and so must have seen at least of the terminate with output . Moreover, must have and so . Together, these imply that has seen at least
executions of terminate with output . At least one of those executions must correspond to an honest party. But then Lemma 3 implies that .
Exit 3: Assume an honest party takes Exit 3. Then must have (and so ), must have seen all executions terminate, and must also have seen all executions terminate. Because
a majority of the executions that has seen terminate must correspond to honest parties. Lemma 3 implies that all those executions must have resulted in output . But then must be true for , and it would not have taken Exit 3. ∎
The following two lemmas prove that is -consistent and -live. First, we show that if two honest parties each output a set, then those sets are equal. Then we show that all honest parties do indeed output a set.
Fix with and , and assume at most parties are corrupted. Then if honest parties and output sets and , respectively, in an execution of , it holds that .
We consider different cases based on the possible exits taken by the two honest parties, and show that in all cases their outputs agree.
Case 1: Either or takes Exit 1. Say takes Exit 1 and outputs . (The case where takes Exit 1 is symmetric.) We consider different sub-cases:
takes Exit 1: Say outputs . Then and must have each seen at least executions of output and , respectively. Since , at least one of those executions must be the same. But then -consistency of implies that .
takes Exit 2: Say outputs . For to be satisfied, must have , and must have seen at least
executions of output . As above, must have seen at least executions of output . But since
at least one of those executions must be the same and so -consistency of implies that .
takes Exit 3: We claim this cannot occur. Indeed, if takes Exit 3 then must have (and so ), and must have seen all executions terminate and all executions terminate. Because took Exit 1, must have seen at least executions output , and therefore (by -consistency of ) there are at most executions that has seen terminate with a value other than . The number of executions of that has seen terminate with output is therefore at least , which is strictly greater than the number of executions that has seen terminate with a value other than . But then is true for , and it would not take Exit 3.
Case 2: Neither nor takes Exit 1. We consider two sub-cases:
and both take Exit 2. Say outputs and outputs . Both and must have seen all terminate; by -consistency of they must therefore hold the same . Since holds for , it must have seen a majority of the executions output ; similarly, must have seen a majority of the executions output . Then -consistency of implies .
Either or takes Exit 3. Say takes Exit 3. (The case where takes Exit 3 is symmetric.) As above, and agree on (this holds regardless of whether takes Exit 2 or Exit 3). Since holds for but does not, must have seen all executions terminate but without any value being output by a majority of those executions. But then -consistency of implies that also does not see any value being output by a majority of those executions, and so will not take Exit 2. Since instead must take Exit 3, it must have seen all executions terminate; -consistency of then implies that outputs the same set as . ∎
Fix with and . Then has -set quality.
Consider some honest party . We again consider the various possibilities. Say takes Exit 1 and outputs . Then has seen at least executions terminate with output . Of these, at least must correspond to honest parties. By Lemma 3, those honest parties all had input . This means that contains the inputs of at least honest parties.
Alternatively, say takes Exit 2 or Exit 3 and outputs a set . Then holds , and so . At least
of the indices in correspond to honest parties, and by Lemma 3 for each of those parties the corresponding output value that holds is equal to that party’s input. Thus, regardless of whether takes Exit 2 (and contains the majority value output by ) or Exit 3 (and contains every value output by ), the set output by contains the inputs of at least honest parties. ∎
Fix with and . Then is -live.
Assume at most parties are corrupted during an execution of . We consider two cases: either some honest party takes Exit 1 during this execution, or no honest party ever takes Exit 1 during this execution. In the first case, the first honest party to take Exit 1 must have seen at least executions output with the same value . Hence, -consistency of implies that all other honest parties will eventually see at least those executions output , and will output (if they have not already output via another exit).
In the second case, no honest party ever takes Exit 1. We argue that eventually all honest parties will set and output. If no honest party ever takes Exit 1, then all honest parties continue to participate in any still-running executions indefinitely. At all times such that no honest party has yet set , for all , each honest party has either input 1 or not yet provided input. Such executions are indistinguishable from an execution in which all honest parties have input 1, but some messages have been delayed, and therefore -validity of implies that (so long as no honest parties input 0) these executions eventually output 1. There are now two possibilities: either it continues to be true that no honest parties input 0 to any execution, or some honest party inputs 0 to some execution. In the first case, -validity of implies that at least executions eventually output 1 for all honest parties, and therefore all honest parties set . In the latter case, some honest party must have already set as a result of seeing at least executions output 1. Therefore, by -consistency of , all honest parties will eventually see at least executions output 1, and therefore all honest parties set . Once , each honest party will output as soon as all output. Each such that is guaranteed to eventually output for the following reason: either the sender is honest, and terminates by Lemma 3, or the sender is dishonest, but by -validity of , at least one honest party must have input 1 to as a result of seeing terminate, and therefore eventually everyone sees terminate due to -consistency of .
Fix with and . If is -terminating, then has bounded communication complexity under both of the following conditions:
At most parties are corrupted.
At most parties are corrupted and all honest parties input the same value .
Because has bounded communication complexity, it remains to show that all honest parties eventually stop participating in all executions, either because they terminate or because they set and stop participating in any still-running executions. We consider each condition separately.
Case 1: At most parties are corrupted. We must show that either all executions will eventually terminate, or all honest parties eventually set and stop participating in any still-running executions.
Assume no honest parties take Exit 1 during an execution of . Then all honest parties continue to participate in all executions, and so bounded complexity follows from -termination of .
Now assume some party takes Exit 1 during an execution of . That party must have seen at least executions output with the same value. By -consistency of , all honest parties eventually see those executions output with the same value, and thus can set and stop participating in any still-running executions.
Case 2: At most parties are corrupted and all honest parties input the same value . Because all honest parties input the same value , -validity of implies that all honest parties will eventually set and thus stop participating in any still-running executions.
Fix with and . Then is -secure and -valid.
5 A Network-Agnostic SMR Protocol
In this section, we show our main result: an SMR protocol that is -secure in a synchronous network and -secure in an asynchronous network. We begin in Section 5.1 by constructing a useful sub-protocol for what we call block agreement. We use this to construct an SMR protocol in Section 5.2.
5.1 Block Agreement
Throughout this section, we assume a synchronous network. We use as a shorthand for , where is a valid signature on message signed using ’s secret key.
We define here a notion we call block agreement, and show a block-agreement protocol secure against any corrupted parties. The structure of our protocol is inspired by the synod protocol of Abraham et al. . Block agreement is a form of agreement where (1) in addition to an input, parties provide signatures (in a particular format) on those inputs, and (2) a stronger notion of validity is required. Specifically, consider pairs consisting of a block along with a set of signed buffers . We say a pair is valid if:
contains signed buffers from strictly more than distinct parties.
For each . (Note each buffer can be represented as a bit-vector of length .)
Definition 5 (Block agreement)
Let be a protocol executed by parties , where each party begins holding input and parties terminate upon generating output.
Validity: is -valid if whenever at most of the parties are corrupted, then every honest party that outputs, outputs a -valid pair.
Termination: is -terminating if the following holds when at most of the parties are corrupted: every honest party outputs and terminates with probability .
Consistency: is -consistent if the following holds when at most of the parties are corrupted: if every honest party inputs a -valid pair, there is a such that every honest party outputs .
If is -consistent, and -terminating, then we say it is -secure.
We construct a block-agreement protocol in a modular fashion. We begin by defining a subprotocol (see Figure 3) in which a designated party serves as a proposer. A tuple is called a -vote on if is valid and either:
and is a set of valid signatures from a majority of the parties on messages of the form with (where possibly different can be used in different messages).
When the exact value of is unimportant, we simply refer to the tuple as a vote. A message of the form is a correctly formed message (from party ) if is a vote. A message is a correctly formed message if it contains correctly formed messages from a majority of the parties.
We first show that any two honest parties who generate output in this protocol agree on their output.
If honest parties and output , respectively, in an execution of , then .
If outputs , then must have received a correctly formed message by time that would cause it to output . That message is forwarded by to , and hence either outputs (if it detects an inconsistency) or the same value .
Assume less than half the parties are corrupted. We show that if there is some such that the input of each honest party is a vote of the form , and no honest party ever receives a vote with and , then the only value an honest party can output is .
Assume fewer than parties are corrupted, and that the input of each honest party to is a -vote on . If no honest party ever receives a -vote on with , then every honest party outputs either or .
Consider an honest party who does not output . That party must have received a correctly formed message from , which in turn must contain a correctly formed message from at least one honest party . That message contains a vote and, under the assumptions of the lemma, any other vote contained in with has . It follows that outputs .
Finally, we show that when is honest then all honest parties do indeed generate output.
Assume fewer than parties are corrupted. If every honest party’s input to is a vote and is honest, then every honest party outputs the same valid .
Since every honest party’s input is a vote, the honest will receive at least correctly formed messages, and so sends a correctly formed message to all honest parties. Since is honest, this is the only correctly formed message the honest parties will receive, and so all honest parties will output the same valid .
We now present a protocol that uses to achieve a form of graded consensus on a valid pair . (See Figure 4.) As in the protocol of Abraham et al. , we rely on an atomic leader-election mechanism with the following properties: On input from a majority of parties, chooses a uniform leader and sends to all parties. This ensures that if less than half of all parties are corrupted, then at least one honest party must call with input before the adversary can learn the identity of . A leader-election mechanism tolerating any faults can be realized (in the synchronous model with a PKI) based on general assumptions ; it can also be realized more efficiently using a threshold unique signature scheme.
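The interface of the leader-election mechanism described above can be sketched as an idealized trusted object. The Python class below is a hypothetical illustration only: it is not the realization based on general assumptions or threshold unique signatures mentioned in the text, and "sending the leader to all parties" is modeled simply by the return value and the `leader` attribute.

```python
import secrets


class LeaderElection:
    """Idealized leader election: reveals a uniform leader once a majority has called."""

    def __init__(self, n):
        self.n = n
        self.callers = set()  # distinct parties that have provided input
        self.leader = None    # fixed once chosen

    def call(self, party_id):
        """Register an input from party_id; return the leader, or None if not yet chosen."""
        if not 0 <= party_id < self.n:
            raise ValueError("unknown party")
        self.callers.add(party_id)
        # The leader is sampled only after a strict majority of parties has called.
        if self.leader is None and 2 * len(self.callers) > self.n:
            self.leader = secrets.randbelow(self.n)
        return self.leader
```

Because the leader is sampled only after a strict majority has called, fewer than half of the parties acting alone cannot trigger the choice, which mirrors the property that at least one honest party must call the mechanism before the adversary learns the leader.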
Below, we refer to a message as a correctly formed message (from on ) if is valid. We refer to a message as a correctly formed message on if is valid and is a set of valid signatures on from more than parties; in that case, is called a -certificate for .
For an output , we refer to