No Need for Recovery: A Simple Two-Step Byzantine Consensus

11/23/2019
by Tung-Wei Kuo, et al.
National Chengchi University

In this paper, we give a deterministic two-step Byzantine consensus protocol that achieves safety and liveness. A two-step Byzantine consensus protocol only needs two communication steps to commit in the absence of faults. Most two-step Byzantine consensus protocols exploit optimism and require a recovery protocol in the presence of faults. In this paper, we give a simple two-step Byzantine consensus protocol that does not need a recovery protocol.

1 Introduction

We consider the Byzantine agreement problem. Let $n$ and $f$ be the number of nodes (e.g., processors or replicas) and the number of faulty nodes, respectively. In this problem, each node has an initial value, and nodes exchange messages to reach an agreement. Specifically, we need to design a message exchange protocol (or consensus algorithm) so that after the protocol terminates, all the non-faulty nodes output (or commit) the same value, and this value is the initial value of some node. In other words, the consensus algorithm must guarantee safety. Moreover, the protocol must terminate eventually, i.e., guarantee liveness. In this paper, we consider the partially synchronous model. Specifically, let $\Delta(t)$ be the transmission delay of a message sent at time $t$. In the partially synchronous model, $\Delta(t)$ does not grow faster than $t$ indefinitely.

Our goal is to design a two-step consensus algorithm. In one step, a node can 1) send messages, 2) receive messages, and 3) do local computation, in that order [7]. A consensus algorithm is two-step if all non-faulty nodes can commit after two steps in the absence of faults. It has been shown that to solve the Byzantine agreement problem by a two-step consensus algorithm, $n \geq 5f + 1$ must hold [6]. Thus, in this paper, we assume $n = 5f + 1$.

Several two-step consensus algorithms have been proposed to solve the Byzantine agreement problem [6, 5, 4]. These solutions proceed in rounds, and a round consists of two steps in normal operation. However, it has been pointed out that FaB [6] and Zyzzyva [5] cannot guarantee both safety and liveness [1]. Moreover, these consensus algorithms exploit optimism in their design and invoke additional recovery protocols when normal operation fails [6, 5, 4]. Thus, these solutions may need more than two steps in a round due to faulty behavior or long communication delay.

In this paper, we give a simple two-step consensus algorithm without the use of any recovery protocol. In our solution, a round always consists of only two steps: 1) a leader, which is chosen in a round-robin fashion, broadcasts a proposal, and 2) all nodes vote and collect votes. Our solution and analysis are inspired by MSig-BFT, which is a three-step protocol [3]. Like MSig-BFT, the leader may not be allowed to broadcast a proposal if the network is in bad condition. An interesting property of our solution is that nodes may reach consensus in a round, even if the leader chosen in that round is faulty or suffers from long transmission delay. Such a property can thus mitigate the harm caused by faulty nodes and transmission delay.

2 The Two-Step Consensus Algorithm

The first step: propose. At the beginning of round $r$, the leader sends a Proposal message, which contains a candidate value and the current round $r$, to all nodes. We will describe this step in detail after the next step is introduced. For a Proposal message $P$, $\mathrm{round}(P)$ and $\mathrm{val}(P)$ denote the round in which $P$ is generated and the candidate value contained in $P$. The pseudocode of the first step is given in Algorithm 1.

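Input: The initial value of $p$ and the current round $r$
$\ell \leftarrow$ the leader of round $r$
if $p = \ell$ then
      Construct a Proposal message $P$
      $\mathrm{round}(P) \leftarrow r$
      /* Determine $\mathrm{val}(P)$ and send $P$ to all nodes */
      if $r = 1$ then
            $\mathrm{val}(P) \leftarrow$ the initial value of $p$, and send $P$ to all nodes
      else if $L_p^{r-1}$ is valid then
            if at least $2f+1$ votes in $L_p^{r-1}$ contain the same value $v \neq \emptyset$ then
                  Let $v$ be any value satisfying the above constraint
                  $\mathrm{val}(P) \leftarrow v$, and send $P$ to all nodes
            else
                  $\mathrm{val}(P) \leftarrow$ the initial value of $p$, and send $P$ to all nodes
Algorithm 1 Propose: from the viewpoint of node $p$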
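
The leader's choice in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes $n = 5f + 1$, a lockset threshold of $4f + 1$ votes, and a lock threshold of $2f + 1$ matching votes (the values consistent with the proofs in Section 3). The names `leader_of`, `choose_proposal_value`, and `EMPTY`, and the 0-based node ids, are hypothetical.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence

EMPTY = object()  # hypothetical sentinel for the empty candidate value


def leader_of(round_no: int, n: int) -> int:
    """Round-robin leader; rounds are 1-based, node ids are 0..n-1 (assumed convention)."""
    return (round_no - 1) % n


def choose_proposal_value(
    round_no: int,
    initial_value: Hashable,
    prev_lockset: Sequence[Hashable],
    f: int,
) -> Optional[Hashable]:
    """Return the value the leader proposes, or None if it may not propose.

    prev_lockset lists the candidate values of the votes of round
    (round_no - 1) that the leader received, one entry per distinct sender.
    """
    if round_no == 1:
        return initial_value
    if len(prev_lockset) < 4 * f + 1:
        return None  # no valid lockset, hence no Proposal in this round
    counts = Counter(v for v in prev_lockset if v is not EMPTY)
    locked = [v for v, c in counts.items() if c >= 2 * f + 1]
    if locked:
        return locked[0]  # any value meeting the constraint may be chosen
    return initial_value  # no value is locked: propose the own initial value
```

For example, with `f = 1` a lockset `["x", "x", "x", "y", "y"]` forces the leader to propose `"x"`, whereas a lockset with only four votes yields no proposal at all.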

The second step: vote. Once a node $p$ receives a valid Proposal message $P$, $p$ broadcasts a Vote message containing the candidate value $\mathrm{val}(P)$. Note that a node broadcasts at most one Vote message in a round. If $p$ receives $4f+1$ Vote messages before a predetermined Vote timeout $T_{\mathrm{vote}}$ expires, and these Vote messages contain the same non-empty candidate value $v$, then $p$ commits $v$. On the other hand, if $p$ cannot commit before $T_{\mathrm{vote}}$ expires, then $p$ goes to the next round and needs to store the candidate value for which it voted. Specifically, let $v_p^{r}$ be the candidate value contained in the Vote message broadcast by $p$ in round $r$. In round $r+1$, if $p$ cannot receive a valid Proposal message before another predetermined timeout, the Proposal timeout $T_{\mathrm{prop}}$, expires, then $p$ broadcasts a Vote message containing $v_p^{r}$. Note that in the first round (i.e., $r = 1$), if $p$ cannot receive a valid Proposal message before $T_{\mathrm{prop}}$ expires, $p$ broadcasts a Vote message containing an empty candidate value $\emptyset$. To achieve liveness in the partially synchronous model, whenever a node goes to the next round, the lengths of the two timeouts are doubled. For a Vote message $V$, we use $\mathrm{round}(V)$ and $\mathrm{val}(V)$ to denote the round in which $V$ is generated and the candidate value contained in $V$. We summarize this step from the viewpoint of node $p$ in Algorithm 2.

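Construct a Vote message $V$, and set $\mathrm{round}(V) \leftarrow r$
/* Determine $\mathrm{val}(V)$ and broadcast $V$ */
if $p$ receives a valid Proposal message $P$ of round $r$ before $T_{\mathrm{prop}}$ expires then
      $\mathrm{val}(V) \leftarrow \mathrm{val}(P)$ and broadcast $V$
else
      if $r = 1$ then
            $\mathrm{val}(V) \leftarrow \emptyset$
      else
            $V' \leftarrow$ the Vote message that $p$ broadcast in round $r-1$
            $\mathrm{val}(V) \leftarrow \mathrm{val}(V')$
      Broadcast $V$
if $p$ receives $4f+1$ Vote messages of round $r$ containing the same value $v \neq \emptyset$ before $T_{\mathrm{vote}}$ expires then
      Commit $v$
else
      Go to round $r+1$
Algorithm 2 Vote: from the viewpoint of node $p$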
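
The voting and commit rules of Algorithm 2 can be sketched as pure functions. This is an illustrative sketch under assumptions: the commit threshold is taken to be $4f + 1$ matching votes (read here as "at least $4f + 1$"), timeouts and networking are left out, and the names `vote_value`, `committed_value`, `next_timeouts`, and `EMPTY` are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional, Tuple

EMPTY = object()  # hypothetical sentinel for the empty candidate value


def vote_value(
    proposal_value: Optional[Hashable],
    round_no: int,
    previous_vote: Optional[Hashable],
) -> Hashable:
    """Value a node puts into its Vote message for this round.

    proposal_value is None when no valid Proposal arrived before the
    Proposal timeout expired.
    """
    if proposal_value is not None:
        return proposal_value
    if round_no == 1:
        return EMPTY  # nothing to repeat in the first round
    return previous_vote  # repeat the value voted for in the previous round


def committed_value(votes: Iterable[Hashable], f: int) -> Optional[Hashable]:
    """Return v if a non-empty value v appears in at least 4f+1 votes, else None."""
    counts = Counter(v for v in votes if v is not EMPTY)
    for value, count in counts.items():
        if count >= 4 * f + 1:
            return value
    return None


def next_timeouts(t_prop: float, t_vote: float) -> Tuple[float, float]:
    """Both timeouts are doubled whenever a node moves to the next round."""
    return 2 * t_prop, 2 * t_vote
```

A node would call `committed_value` on the votes collected before the Vote timeout expires; a `None` result means it moves to the next round, remembers its own vote, and applies `next_timeouts`.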

The complete description of the first step: Let $L_p^r$ be the set of Vote messages of round $r$ received by node $p$; we call $L_p^r$ the lockset of $p$ for round $r$. $L_p^r$ is valid if it contains $4f+1$ Vote messages of round $r$. We now describe the first step from the viewpoint of $p$ in detail. Let $r$ be the current round. If $p$ is not the leader of round $r$, then $p$ goes to the second step, i.e., voting. Otherwise, if $p$ is the leader, $p$'s action depends on whether $r = 1$ or $r > 1$.

Case 1 ($r = 1$): $p$ constructs a Proposal message $P$, where $\mathrm{val}(P)$ is the initial value of $p$ and $\mathrm{round}(P) = 1$.

Case 2 ($r > 1$): In this case, if $p$ wants to send a Proposal message $P$, $p$ must have a valid lockset of the previous round, i.e., $L_p^{r-1}$ must be valid. Otherwise, if $L_p^{r-1}$ is not valid, then there is no Proposal message in round $r$. For other nodes to verify this condition, the Proposal message must contain $L_p^{r-1}$. We further impose a constraint on $\mathrm{val}(P)$. If at least $2f+1$ votes in $L_p^{r-1}$ contain the same candidate value $v$ and $v \neq \emptyset$, then $\mathrm{val}(P) = v$. Note that if multiple candidate values satisfy the constraint, then the leader can choose any one of them. Otherwise, if no candidate value satisfies the constraint, $p$ can propose its own initial value.
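
Because the Proposal message carries the lockset, a receiving node can re-check the leader's choice on its own. The sketch below shows one way such a check might look, assuming a lockset threshold of $4f + 1$ votes and a lock threshold of $2f + 1$ matching votes. The `Proposal` class and `is_valid_proposal` are hypothetical names, and message authentication and the check that the sender is the round's leader are deliberately left out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Hashable, Optional

EMPTY = object()  # hypothetical sentinel for the empty candidate value


@dataclass(frozen=True)
class Proposal:
    round_no: int
    value: Hashable
    # sender id -> value voted for in round (round_no - 1); unused in round 1
    lockset: Optional[Dict[int, Hashable]] = None


def is_valid_proposal(p: Proposal, current_round: int, f: int) -> bool:
    """Check a Proposal against the lockset constraint of Case 2."""
    if p.round_no != current_round:
        return False
    if current_round == 1:
        return True  # Case 1: the leader proposes its own initial value
    if p.lockset is None or len(p.lockset) < 4 * f + 1:
        return False  # the attached lockset is missing or not valid
    counts = Counter(v for v in p.lockset.values() if v is not EMPTY)
    locked = {v for v, c in counts.items() if c >= 2 * f + 1}
    # If some value is locked, the proposal must carry one such value;
    # otherwise any value (the leader's initial value) is acceptable.
    return not locked or p.value in locked
```

Keying the lockset by sender id enforces the "one vote per node" assumption that the counting argument in Section 3 relies on.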

3 Analysis

3.1 Proof of Safety

To prove that our solution guarantees safety, it suffices to prove the following two claims. We say a node votes for a value $v$ in round $r$ if the node sends a Vote message containing $v$ in round $r$.

Claim 1.

If two non-faulty nodes $p$ and $q$ commit values $v$ and $v'$ in the same round $r$, respectively, then $v = v'$.

Proof.

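For the sake of contradiction, assume that $v \neq v'$. Node $p$ (respectively, $q$) receives $4f+1$ Vote messages containing $v$ (respectively, $v'$). Since at most $f$ of the senders are faulty and a non-faulty node broadcasts at most one Vote message in a round, at least $3f+1$ non-faulty nodes vote for $v$ in round $r$ and a different set of at least $3f+1$ non-faulty nodes vote for $v'$. Thus, there are at least $6f+2 > n$ nodes, which is a contradiction. ∎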
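
Written out, the counting step of this proof is the following inequality. It assumes $n = 5f + 1$ and a commit threshold of $4f + 1$ votes, the values consistent with the lower bound cited from [6]:

```latex
\begin{align*}
\underbrace{(4f+1) - f}_{\text{non-faulty voters for } v}
  + \underbrace{(4f+1) - f}_{\text{non-faulty voters for } v'}
  &= 6f + 2 \\
  &> 5f + 1 = n .
\end{align*}
```

The two groups are disjoint because a non-faulty node casts at most one vote per round, so their sizes add up.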

Claim 2.

Once a non-faulty node commits value $v$ in round $r$, for any future round $r' > r$, only $v$ can be committed in round $r'$.

Proof.

Let $S$ be the set of nodes that vote for $v$ in round $r$. Because $v$ is committed in round $r$, $|S| \geq 4f+1$. Let $S'$ be the subset of $S$ that contains non-faulty nodes only. Thus, $|S'| \geq 3f+1$. Observe that if the leader of round $r+1$ has a valid lockset $L$ of round $r$, then $L$ must contain at least $2f+1$ Vote messages sent from $S'$, since $L$ misses the votes of at most $f$ of the $n = 5f+1$ nodes. In addition, for each candidate value $v' \neq v$, at most $2f$ Vote messages in $L$ contain $v'$. Thus, if the leader of round $r+1$ can send a Proposal message $P$, $\mathrm{val}(P) = v$ must hold. Otherwise, if there is no Proposal message of round $r+1$, all nodes in $S'$ still vote for the value that they voted for in round $r$, i.e., $v$. In both cases, all nodes in $S'$ still vote for $v$ in round $r+1$. The claim then follows by induction. ∎
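
The overlap bound used in this proof can be checked by brute force for small $f$. The sketch below assumes $n = 5f + 1$, a lockset of exactly $4f + 1$ votes from distinct nodes, and the smallest allowed set of non-faulty voters for the committed value, $|S'| = 3f + 1$. The function name `worst_case_overlap` is hypothetical.

```python
from itertools import combinations


def worst_case_overlap(f: int) -> int:
    """Smallest number of votes from S' that any lockset of size 4f+1 can contain."""
    n = 5 * f + 1
    s_prime = set(range(3 * f + 1))  # non-faulty nodes that voted for v
    return min(
        len(s_prime & set(lockset))
        for lockset in combinations(range(n), 4 * f + 1)
    )


for f in (1, 2):
    overlap = worst_case_overlap(f)
    # Votes left over for any single competing value in the same lockset.
    others = (4 * f + 1) - overlap
    print(f"f={f}: overlap={overlap}, votes available to another value={others}")
```

For $f = 1$ and $f = 2$ the worst-case overlap is $2f + 1$, which leaves at most $2f$ votes for any other value, one short of the $2f + 1$ needed to satisfy the leader's constraint.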

3.2 Proof of Liveness Under the Partially Synchronous Model

A standard technique to guarantee liveness under the partially synchronous model is to double the lengths of the timeouts (here, the Proposal timeout $T_{\mathrm{prop}}$ and the Vote timeout $T_{\mathrm{vote}}$) whenever entering a new round [2]. It can be shown that there is some round $r^*$ such that for any round $r \geq r^*$, all non-faulty nodes can receive messages from each other before the timeouts expire [2]. Thus, in some round $r' > r^*$, the leader is non-faulty and has a valid lockset of round $r'-1$. (Recall that the leader is chosen in a round-robin fashion; hence, in some such round $r'$, the leader is non-faulty.) Thus, there must be a valid Proposal message $P$ in round $r'$. All non-faulty nodes then vote for $\mathrm{val}(P)$ in round $r'$. Since all these Vote messages can be received in time, all non-faulty nodes can commit $\mathrm{val}(P)$ in round $r'$.
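
The effect of timeout doubling can be illustrated with a simplified model. The sketch assumes a fixed but unknown delay bound, which is a simplification of the partially synchronous model used here, and a round-robin leader over 0-based node ids. The function names are hypothetical.

```python
from typing import Set


def first_round_in_time(initial_timeout: float, delay_bound: float) -> int:
    """First round (1-based) whose timeout, doubled each round, covers delay_bound."""
    if initial_timeout <= 0:
        raise ValueError("initial_timeout must be positive")
    round_no, timeout = 1, initial_timeout
    while timeout < delay_bound:
        timeout *= 2
        round_no += 1
    return round_no


def first_nonfaulty_leader_round(start_round: int, n: int, faulty: Set[int]) -> int:
    """First round >= start_round whose round-robin leader is non-faulty.

    Assumes at least one node id in 0..n-1 is not in `faulty`.
    """
    round_no = start_round
    while (round_no - 1) % n in faulty:
        round_no += 1
    return round_no


# Example: n = 6 nodes (f = 1), node 5 faulty, initial timeout 1, delay bound 10.
r_star = first_round_in_time(1.0, 10.0)
good_round = first_nonfaulty_leader_round(r_star + 1, 6, {5})
```

Starting the leader search at `r_star + 1` mirrors the argument above: the leader needs a valid lockset of the previous round, so that previous round must already be one in which messages arrive in time.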

References