How Does Blockchain Security Dictate Blockchain Implementation?

09/10/2021
by   Andrew Lewis-Pye, et al.
LSE

Blockchain protocols come with a variety of security guarantees. For example, BFT-inspired protocols such as Algorand tend to be secure in the partially synchronous setting, while longest chain protocols like Bitcoin will normally require stronger synchronicity to be secure. Another fundamental distinction, directly relevant to scalability solutions such as sharding, is whether or not a single untrusted user is able to point to *certificates*, which provide incontrovertible proof of block confirmation. Algorand produces such certificates, while Bitcoin does not. Are these properties accidental? Or are they inherent consequences of the paradigm of protocol design? Our aim in this paper is to understand what, fundamentally, governs the nature of security for permissionless blockchain protocols. Using the framework developed in (Lewis-Pye and Roughgarden, 2021), we prove general results showing that these questions relate directly to properties of the user selection process, i.e., the method (such as proof-of-work or proof-of-stake) which is used to select users with the task of updating state. Our results suffice to establish, for example, that the production of certificates is impossible for proof-of-work protocols, but is automatic for standard forms of proof-of-stake protocols. As a byproduct of our work, we also define a number of security notions and identify the equivalences and inequivalences among them.


1. Introduction

Paradigms for blockchain protocol design.

In the wake of Bitcoin (Nakamoto et al., 2008), thousands of cryptocurrencies have flooded the market. While many of these currencies use only slight modifications of the Bitcoin protocol, there are also a range of cryptocurrencies taking radically different design approaches. Two informal distinctions are between:

  1. Proof-of-stake (PoS)/proof-of-work (PoW). In a PoW protocol, users are selected and given the task of updating state, with the probability any particular user is chosen being proportional to their (relevant) computational power. In PoS protocols, users are selected with probability depending on their stake (owned currency).

  2. BFT/longest-chain (the acronym BFT stands for ‘Byzantine-Fault-Tolerant’). As well as being a PoW protocol, Bitcoin is the best known example of a longest chain protocol. This means that forks may occur in the blockchain, but that honest miners will build on the longest chain. In a BFT protocol, on the other hand, users are selected and asked to carry out a consensus protocol designed for the permissioned setting. So, roughly, longest chain protocols are those which are derived from Bitcoin, while BFT protocols are derived from protocols designed in the permissioned setting. Algorand (Chen and Micali, 2016) is a well known example of a BFT protocol.

A formal framework for comparing design paradigms (Lewis-Pye and Roughgarden, 2021).

While informal, these distinctions are more than aesthetic. For example, BFT protocols like Algorand will tend to give security guarantees that hold under significantly weaker network connectivity assumptions than are required to give security for protocols like Bitcoin. By developing an appropriate formal framework, it can then be shown (Lewis-Pye and Roughgarden, 2021) that these differences in security are a necessary consequence of the paradigm of protocol design: The fact that Bitcoin is a PoW protocol means that it cannot offer the same flavour of security guarantees as Algorand. A framework of this kind was developed in (Lewis-Pye and Roughgarden, 2021), according to which permissionless protocols run relative to a resource pool. (In the distributed computing literature, consensus protocols have traditionally been studied in a setting where all participants are known to each other from the start of the protocol execution. In the parlance of the blockchain literature, this is referred to as the permissioned setting. What differentiates Bitcoin (Nakamoto et al., 2008) from these previously studied protocols is that it operates in a permissionless setting, i.e. it is a protocol for establishing consensus over an unknown network of participants that anybody can join, with as many identities as they like in any role.) This resource pool specifies a balance for each user over the duration of the protocol execution (such as hashrate or stake), which may be used in determining which users are permitted to update state. Within this framework, the idea that protocols like Bitcoin require stronger connectivity assumptions for security can be formalised as a theorem asserting that adaptive protocols cannot be partition secure – these terms apply to permissionless blockchain protocols and will be defined formally later on, but, roughly, they can be summed up as follows:

  • Liveness and security are defined in terms of a notion of confirmation for blocks. A protocol is live if the number of confirmed blocks can be relied on to increase during extended intervals of time during which message delivery is reliable. A protocol is secure if rollback on confirmed blocks is unlikely.

  • Bitcoin being adaptive means that it remains live in the face of an unpredictable size of resource pool (unpredictable levels of mining).

  • A protocol is partition secure if it is secure in the partially synchronous setting, i.e. if the rollback of confirmed blocks remains unlikely even in the face of potentially unbounded network partitions. The partially synchronous setting will be further explained and formally defined in Section 2.

This paper: certificates.

The way in which Algorand and other BFT protocols achieve partition security is also worthy of note. For all such protocols, protection against unbounded network partitions is provided through the production of certificates: these are sets of broadcast messages whose very existence suffices to establish block confirmation and which cannot be produced by a (suitably bounded) adversary given the entire duration of the execution of the protocol. Bitcoin does not produce certificates, because the existence of a certain chain does not generally prove that it is the longest chain – a user will only believe that a certain chain is the longest chain until presented with a longer (possibly incompatible) chain. Algorand does produce certificates, on the other hand, because the very existence of a valid chain, together with appropriate committee signatures for all the blocks in the chain, suffices to guarantee (beyond a reasonable doubt) that the blocks in that chain are confirmed. We will formally define what it means for a protocol to produce certificates in Section 3.

The production of certificates is also functionally useful, beyond providing security against network partitions. The production of certificates means, for example, that a single untrusted user is able to convince another user of block confirmation (by relaying an appropriate certificate), and this is potentially very useful in the context of sharding. If a user wishes to learn the state of a blockchain they were not previously monitoring, then it is no longer necessary to perform an onboarding process in which one samples the opinions of users until such a point that it is likely that at least one of them was ‘honest’ – one simply requests a certificate proving confirmation for a recently timestamped block. (Such techniques can avoid the need to store block hashes in a sharding ‘main chain’, and the information withholding attacks that come with those approaches.)

1.1. Overview of results.

The goal of this paper is to rigorously investigate to what extent today’s protocols “have to look the way they are” given the security guarantees they achieve. Such formal analyses are relevant to the broader research community for several reasons, including: (i) accurate intuitions of the community (e.g., that there’s fundamentally only one way to achieve certain properties) can be formally validated, with the necessary assumptions clearly spelled out; (ii) inaccurate intuitions can be exposed as such; (iii) unexplored areas of the protocol design space can naturally rise to the surface (e.g., Section 5.2); and (iv) new definitions (e.g., certificates) can enhance our language for crisply describing and comparing competing solutions (both present and future). In this paper, we prove three main results, which each address this issue in a different setting.

The partially synchronous setting. The first key question is:

  Q1. Are certificates fundamental to partition security, or an artifact of Algorand’s specific implementation? That is, are certificates the only way for permissionless blockchain protocols to achieve security in the partially synchronous setting?

Our first main result, proved in the context of the framework of (Lewis-Pye and Roughgarden, 2021), gives an affirmative response to Q1. Of course, all terms will be explained and formally defined in later sections.

THEOREM 3.3. If a permissionless blockchain protocol is secure in the partially synchronous setting, then it produces certificates.

Since it will be easily observed that the production of certificates implies security, Theorem 3.3 shows that, in the partially synchronous setting, the production of certificates is actually equivalent to security.

The synchronous setting. What about Bitcoin? While Bitcoin does not satisfy the conditions of Theorem 3.3, it clearly has some non-trivial security. The standard formalisation in the literature (Ren, 2019; Garay et al., 2018) is that Bitcoin is secure in the synchronous setting, for which there is an upper bound on message delivery time (the synchronous setting will be further explained and formally defined in Section 2). Even working in the synchronous setting, though, it is clear that Bitcoin does not produce certificates. Again, we are led to ask whether this is a necessary consequence of the paradigm of protocol design:

  Q2. Could there be a Bitcoin-like protocol that, at least in the synchronous setting, has as strong a security guarantee in terms of the production of certificates as BFT-type protocols do in the partially synchronous setting?

The answer depends on key features of the resource pool – recall that the resource pool specifies a balance for each user over the duration of the protocol execution, such as hashrate or stake. The crucial distinction here is between scenarios in which the size of the resource pool is known (e.g. PoS), and scenarios where the size of the resource pool is unknown (e.g. PoW). As per the framework in (Lewis-Pye and Roughgarden, 2021), we will refer to these as the sized and unsized settings, respectively – formal definitions will be given in Section 5. As alluded to above, we define a protocol to be adaptive if it is live in the unsized setting, and it was shown in (Lewis-Pye and Roughgarden, 2021) that adaptive protocols cannot be secure in the partially synchronous setting.

The synchronous and unsized setting. The term “non-trivial adversary”, which is used in Theorem 5.1 below, will be defined in Section 5 so as to formalise the idea that the adversary may have at least a certain minimum resource balance throughout the execution. With these basic definitions in place, we can then give a negative answer to Q2.

THEOREM 5.1 Consider the synchronous and unsized setting. If a permissionless blockchain protocol is live then, in the presence of a non-trivial adversary, it does not produce certificates.

So, while Theorem 3.3 showed that the production of certificates is necessary in the partially synchronous setting, Theorem 5.1 shows that the production of certificates isn’t possible in the unsized setting (in which PoW protocols like Bitcoin operate). Following on from our previous discussion regarding the relevance of certificates to sharding, one direct application of this result is that it rules out certain approaches to sharding for PoW protocols.

The synchronous and sized setting. In the sized setting (such as for PoS protocols), though, it is certainly possible for protocols to produce certificates. It therefore becomes a natural question to ask how far we can push this:

  Q3. Does the production of certificates come down purely to properties of the process of user selection? Is it simply a matter of whether one is in the sized or unsized setting?

Our final theorem gives a form of positive response to Q3. We state an informal version of the theorem below. A formal version will be given in Section 5.

THEOREM 5.6 (INFORMAL VERSION). Consider the synchronous and sized setting, and suppose a permissionless blockchain protocol is of ‘standard form’. Then there exists a ‘recalibration’ of the protocol which produces certificates.

Theorem 5.6 says, in particular, that all ‘standard’ PoS protocols can be tweaked to get the strongest possible security guarantee, since being of ‘standard form’ will entail satisfaction of a number of conditions that are normal for such protocols. Roughly speaking, one protocol will be considered to be a recalibration of another if running the former just involves running the latter for a computable transformation of the input parameters and/or using a different notion of block confirmation. The example of Snow White (Bentov et al., 2016) may be instructive here (for the purposes of this example, the particulars of the Snow White protocol are not important – all that matters is that, at a high level, Snow White might be seen as a PoS version of Bitcoin, but with the fundamental differences that it operates in the sized setting, and that blocks have non-manipulable timestamps). Snow White is a PoS longest chain protocol, and it is not difficult to see that, with the standard notion of confirmation, it does not produce certificates – an adversary can produce chains of blocks which are not confirmed, but which would be considered confirmed in the absence of other blocks which have been broadcast. So whether a block is confirmed depends on the whole set of broadcast messages. On the other hand, it is also not difficult to adjust the notion of confirmation so that Snow White does produce certificates. An example would be to consider a block confirmed when it belongs to a long chain of sufficient density (meaning that it has members corresponding to most possible timeslots) that it could not likely be produced by a (sufficiently bounded) adversary. We will see further examples like this explained in greater depth in Section 5. Theorem 5.6 implies much more generally that PoS protocols can always be modified so as to produce certificates in this way.
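As a purely illustrative sketch (not the formal definition given in Section 5), the density-based notion of confirmation described above might be checked as follows. The function name, the representation of a chain as a list of block timeslots, and both thresholds are assumptions introduced for this example only; appropriate values would depend on the adversary bound and the protocol parameters.

```python
from typing import Sequence


def is_dense_chain(timestamps: Sequence[int],
                   min_length: int,
                   min_density: float) -> bool:
    """Return True if a chain is long and dense enough to act as evidence.

    timestamps  : timeslots of the blocks along the chain, oldest first
                  (hypothetical representation).
    min_length  : minimum number of blocks required (assumed parameter).
    min_density : minimum fraction of the spanned timeslots that must
                  contain a block (assumed parameter).
    """
    if not timestamps or len(timestamps) < min_length:
        return False
    # Timestamps must strictly increase along a valid chain.
    if any(later <= earlier
           for earlier, later in zip(timestamps, timestamps[1:])):
        return False
    # Number of possible timeslots between the first and last block.
    span = timestamps[-1] - timestamps[0] + 1
    return len(timestamps) / span >= min_density
```

The point of the sketch is that the check depends only on the chain itself, and not on which other messages have been broadcast, which is what allows the chain to serve as a certificate.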

The punchline.

Whether or not a permissionless blockchain protocol produces certificates comes down essentially to whether one is working in the sized or unsized setting (e.g. whether the protocol is PoS or PoW). This follows from the following results that we described above:

  1. According to the results of (Lewis-Pye and Roughgarden, 2021), only protocols which work in the sized setting can be secure in the partially synchronous setting. According to Theorem 3.3, all such protocols produce certificates.

  2. Theorem 5.1 tells us that, in the synchronous and unsized setting, protocols cannot produce certificates.

  3. Theorem 5.6 tells us that all standard protocols in the sized and synchronous setting can be recalibrated to produce certificates.

1.2. Related work

There are a variety of papers from the distributed computing literature that analyse settings somewhere between the permissioned and permissionless settings as considered here. In (Okun, 2005), for example, Okun considered a setting in which a fixed number of processors communicate by private channels, where each processor may or may not have a unique identifier, and where processors may or may not be ‘port aware’, i.e. be able to tell which channel a message arrives from. A number of papers (Cavin et al., 2004; Alchieri et al., 2008) have also considered the problem of reaching consensus amongst unknown participants (CUP). In the framework considered in those papers, the number and the identifiers of other participants may be unknown from the start of the protocol execution. A fundamental difference with the permissionless setting considered here is that, in the CUP framework, all participants have a unique identifier and the adversary is unable to obtain additional identifiers to be able to launch a sybil attack against the system, i.e. the number of identifiers controlled by the adversary is bounded.

The Bitcoin protocol was first described in 2008 (Nakamoto et al., 2008). Since then, a number of papers (Garay et al., 2018; Pass et al., 2016) have developed frameworks for the analysis of Bitcoin in which oracles are introduced for modelling PoW. A more general form of oracle is required for modelling PoS and other forms of permissionless protocol, however. In (Lewis-Pye and Roughgarden, 2021) a framework was introduced that described a generalised form for such oracles. We use that framework in this paper, but also develop that framework in Sections 2.4, 2.5, 2.7, 2.8 and 4.3 to be appropriate specifically for the analysis of blockchain protocols.

2. The Framework

We work within the framework of (Lewis-Pye and Roughgarden, 2021). While we describe the framework in its entirety here, we refer the reader to the original paper for further examples and explanations of the framework set-up. Within Section 2, it is the definitions of Sections 2.4, 2.5, 2.7 and 2.8 that are new to this paper (all definitions of Sections 3, 4 and 5 are also new to this paper).

Most of this section can be briefly summed up as follows – all undefined terms in the below will be formalised and defined in later subsections.

  • Protocols are executed by an unknown number of users, each of which is formalised as a deterministic processor that controls a set of public keys.

  • Processors have the ability to broadcast messages to all other processors. The duration of the execution, however, may be divided into synchronous or asynchronous intervals. During asynchronous intervals, an adversary can tamper with message delivery as they choose. During synchronous intervals there is a given upper bound on message delivery time. We then distinguish two synchronicity settings. In the synchronous setting it is assumed that there are no asynchronous intervals, while in the partially synchronous setting there may be unpredictably long asynchronous intervals.

  • Amongst all broadcast messages, there is a distinguished set referred to as blocks, and one block which is referred to as the genesis block. Unless it is the genesis block, each block has a unique parent block.

  • To blackbox the process of user selection, whereby certain users are selected and given the task of updating state, (Lewis-Pye and Roughgarden, 2021) introduces two new notions: (1) Each public key is considered to have a certain resource balance, which may vary over the execution, and; (2) The protocol will also be run relative to a permitter oracle, which may respond to this resource balance. For a PoW protocol like Bitcoin, the resource balance of each public key will be their (relevant) computational power at the given timeslot.

  • It is the permitter oracle which then gives permission to broadcast messages updating state. To model Bitcoin, for example, we sometimes have the permitter allow another user to broadcast a new block, with the probability this happens for each user being proportional to their resource balance.

  • Liveness and security are defined in terms of a notion of confirmation for blocks. Roughly, a protocol is live if the number of confirmed blocks can be relied on to increase during extended intervals of time during which message delivery is reliable. A protocol is secure if rollback on confirmed blocks is unlikely.

2.1. The computational model

Overview. There are a number of papers analysing Bitcoin (Garay et al., 2018; Pass et al., 2016) that take the approach of working within the language of the UC framework of Canetti (Canetti, 2001). Our position is that this provides a substantial barrier to entry for researchers in blockchain who do not have a strong background in security, and that the power of the UC framework remains essentially unused in the subsequent analysis. Instead, we use a very simple computational model, which is designed to be as similar as possible to standard models from distributed computing (e.g. (Dwork et al., 1988)), while also being adapted to deal with the permissionless setting. We thus consider an information theoretic model in which processors are simply specified by state transition diagrams. A permitter oracle is introduced as a generalisation of the random oracle functionality in the Bitcoin Backbone paper (Garay et al., 2018): It is the permitter oracle’s role to grant permissions to broadcast messages. The duration of the execution is divided into timeslots. Each processor enters each timeslot $t$ in a given state $x$, which determines the instructions for the processor in that timeslot – those instructions may involve broadcasting messages, as well as sending requests to the permitter oracle. The state $x'$ of the processor at the next timeslot is determined by the state $x$, together with the messages and permissions received at $t$.

Since we focus on impossibility results, we simplify the presentation by making the assumption that we are always working in the authenticated setting, in which processors have access to public/private key pairs. This assumption is made purely for the sake of simplicity, and the results of the paper do not depend upon it.

Formal description. For a list of commonly used variables and terms, see Table 1 in the appendix. We consider a finite system of processors. (In (Lewis-Pye and Roughgarden, 2021), a potentially infinite number of processors were allowed, but each processor was given a single public key (identifier). Here, we will find it convenient to consider instead a finite number of processors, each of which may control an unbounded number of public keys.) Each processor $p$ is specified by a state transition diagram, for which the number of states may be infinite. Amongst the states of a processor are a non-empty set of possible initial states. The inputs to $p$ determine which initial state it starts in. If a variable is specified as an input to $p$, then we refer to it as determined for $p$, referring to the variable as undetermined for $p$ otherwise. If a variable is determined/undetermined for all $p$, we simply refer to it as determined/undetermined. Amongst the inputs to $p$ is an infinite set $\mathcal{U}_p$ of public keys, which are specific to $p$ in the sense that if $U \in \mathcal{U}_p$ and $U' \in \mathcal{U}_{p'}$ then $U \neq U'$ when $p \neq p'$. A principal difference between the permissionless setting (as considered here) and the permissioned setting (as studied in classical distributed computing) is that, in the permissionless setting, the number of processors is undetermined, and $\mathcal{U}_p$ is undetermined for $p'$ when $p' \neq p$.

Processors are able to broadcast messages. To model permissionless protocols, such as Bitcoin, in which each processor has limited ability to broadcast new blocks (and possibly other messages), we require any message broadcast by $p$ to be permitted for some public key in $\mathcal{U}_p$: The precise details are as follows. We consider a real-time clock, which exists outside the system and measures time in natural number timeslots. The duration $\mathcal{D}$ is a determined variable that specifies the set of timeslots (an initial segment of the natural numbers) at which processors carry out instructions. At each timeslot $t$, each processor $p$ receives a pair $(M, M^*)$, where either or both of $M$ and $M^*$ may be empty. Here, $M$ is a finite set of messages (i.e. strings) that have previously been broadcast by other processors. We refer to $M$ as the message set received by $p$ at $t$, and say that each message $m \in M$ is received by $p$ at timeslot $t$. $M^*$ is referred to as the permission set received by $p$ at $t$. Formally, $M^*$ is a set of pairs, where each pair is of the form $(U, M')$ such that $U \in \mathcal{U}_p$ and $M'$ is a potentially infinite set of messages. If $(U, M') \in M^*$, then receipt of the permission set $M^*$ means that each message $m \in M'$ may now be permitted for $U$. This is complicated slightly by our need to model the authenticated setting within an information theoretic model – we do this by declaring that only $p$ is permitted to broadcast messages signed by keys in $\mathcal{U}_p$. More precisely, $m \in M'$ is permitted for $U$ if the following conditions are also satisfied:

  • $m$ is of the form $(U, \sigma)$ – thought of as ‘the message $\sigma$ signed by $U$’.

  • For any ordered pair of the form $(U', \sigma')$ contained in (i.e. which is a substring of) $m$, either $U' \in \mathcal{U}_p$, or else $(U', \sigma')$ is contained in a message that has been received by $p$.

So, as suggested in the above, the latter bulleted conditions allow us to model the fact that we work in the authenticated setting (i.e. we assume the use of digital signatures) within an information theoretic computational model.

To complete the instructions for timeslot $t$, $p$ then broadcasts a finite set of messages $B$, each of which must be permitted for some $U \in \mathcal{U}_p$, makes a request set $R$, and then enters a new state $x'$, where $x'$, $B$ and $R$ are determined by the present state $x$ and the pair of message set and permission set received at $t$, according to the state transition diagram. The form of the request set $R$ will be described shortly, together with how $R$ determines the permission set received by $p$ at the next timeslot.

An execution is described by specifying the set of processors, the duration, the initial states for all processors and by specifying, for each timeslot :

  1. The messages and permission sets received by each processor;

  2. The instruction that each processor executes, i.e. what messages it broadcasts, what requests it makes, and the new state it enters.

We require that each message is received by $p$ at most once for each time it is broadcast, i.e. at the end of the execution it must be possible to specify an injective function $f_p$ mapping each pair $(m, t)$, such that $m$ is received by $p$ at timeslot $t$, to a triple $(p', m, t')$, such that $t' < t$, $p' \neq p$, and such that $p'$ broadcast $m$ at $t'$.
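The delivery requirement above can be checked mechanically on a finished execution. The following is a minimal sketch under stated assumptions: executions are represented as plain lists of `(processor, message, timeslot)` tuples (a hypothetical encoding), and, following the earlier description of message sets, each receipt must be matched to a distinct broadcast of the same message made by a different processor at a strictly earlier timeslot. Because the eligible broadcasts for each receipt form nested sets, comparing the sorted receipt times against the sorted broadcast times is enough to decide whether an injective matching exists.

```python
from collections import defaultdict


def deliveries_are_valid(broadcasts, receipts):
    """Check that an injective receipt-to-broadcast matching exists.

    broadcasts: iterable of (sender, message, timeslot) tuples.
    receipts  : iterable of (receiver, message, timeslot) tuples.
    Both encodings are assumptions made for this sketch.
    """
    sent = defaultdict(list)  # message -> [(timeslot, sender), ...]
    for sender, msg, t in broadcasts:
        sent[msg].append((t, sender))

    got = defaultdict(list)  # (receiver, message) -> [timeslot, ...]
    for receiver, msg, t in receipts:
        got[(receiver, msg)].append(t)

    for (receiver, msg), times in got.items():
        # Broadcast times of this message by processors other than the receiver.
        earlier = sorted(t for t, sender in sent[msg] if sender != receiver)
        # The i-th earliest receipt needs an i-th earliest broadcast before it.
        for i, t_recv in enumerate(sorted(times)):
            if i >= len(earlier) or earlier[i] >= t_recv:
                return False
    return True
```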

2.2. The resource pool and the permitter

Informal Motivation. Who should be allowed to create and broadcast new Bitcoin blocks? More broadly, when defining a permissionless protocol, who should be able to broadcast new messages? For a PoW protocol, the selection is made depending on computational power. PoS protocols are defined in the context of specifying how to run a currency, and select public keys according to their stake in the given currency. More generally, one may consider a scarce resource, and then select public keys according to their corresponding resource balance. In (Lewis-Pye and Roughgarden, 2021), a framework was introduced according to which protocols run relative to a resource pool, which specifies a resource balance for each public key over the duration of the execution. The precise way in which the resource pool is used to determine public key selection is then black boxed through the use of the permitter oracle, to which processors can make requests to broadcast, and which will respond depending on their resource balance. To model Bitcoin, for example, one simply allows each public key to make one request to broadcast a block at each timeslot. The permitter oracle then gives a positive response with probability depending on their resource balance, which in this case is defined by hashrate. So, this gives a straightforward way to model the process, without the need for a detailed discussion of hash functions and how they are used to instantiate the selection process.

Formal specification. At each timeslot $t$, we refer to the set of all messages that have already been received or broadcast by $p$ as the message state of $p$. Each execution happens relative to a (determined or undetermined) resource pool, which in the general case is a function $\mathcal{R} : \mathcal{U} \times \mathcal{D} \times \mathcal{M} \to \mathbb{R}_{\geq 0}$, where $\mathcal{U}$ is the set of all public keys, $\mathcal{D}$ is the duration and $\mathcal{M}$ is the set of all possible sets of messages. (As described more precisely in Section 2.6, whether the resource pool is determined or undetermined will decide whether we are in the sized or unsized setting.) $\mathcal{R}$ can be thought of as specifying the resource balance of each public key at each timeslot, possibly relative to a given message state. For each $t \in \mathcal{D}$ and $M \in \mathcal{M}$, we suppose that certain basic conditions are satisfied:

  1. If $\mathcal{R}(U, t, M) \neq 0$ then $U \in \mathcal{U}_p$ for some processor $p$;

  2. There are finitely many $U$ for which $\mathcal{R}(U, t, M) \neq 0$, and;

  3. .

Suppose that, after receiving messages and a permission set at timeslot $t$, $p$’s message state is $M_p$, and that $M_p^*$ is the set of all messages that are permitted for $p$ (i.e. permitted for some $U \in \mathcal{U}_p$). We consider two settings – the timed and untimed settings. The form of each request made by $p$ at timeslot $t$ depends on the setting, as specified below. While the following definitions might initially seem abstract, shortly we will give examples to make things clear.

  • The untimed setting. Here, each request made by $p$ must be of the form $(U, M, A)$, where $U \in \mathcal{U}_p$, $M \subseteq M_p \cup M_p^*$, and where $A$ is some (possibly empty) extra data. The permitter oracle will respond with a pair $(U, M^*)$, where $M^*$ is a set of strings that may be empty. The value of $M^*$ will be assumed to be a probabilistic function of the determined variables, $(U, M, A)$, and of $\mathcal{R}(U, t, M)$, subject to the condition that $M^* = \emptyset$ if $\mathcal{R}(U, t, M) = 0$. If modelling Bitcoin, for example, $M$ might be a set of blocks that have been received by $p$, or that $p$ is already permitted to broadcast, while $A$ specifies a new block extending the ‘longest chain’ in $M$. If the block is valid, then the permitter oracle will give permission to broadcast it with probability depending on the resource balance of $U$ at time $t$. We will expand on this example below.

  • The timed setting. Here, each request made by $p$ must be of the form $(t', U, M, A)$, where $t'$ is a timeslot, and where $U$, $M$ and $A$ are as in the untimed setting. The response $(U, M^*)$ of the permitter oracle will be assumed to be a probabilistic function of the determined variables, $(t', U, M, A)$, and of $\mathcal{R}(U, t', M)$, subject to the condition that $M^* = \emptyset$ if $\mathcal{R}(U, t', M) = 0$.

The permission set received by $p$ at timeslot $t+1$ is the set of all responses from the permitter oracle to $p$’s requests at timeslot $t$.

To understand these definitions, it is instructive to consider how they can be used to give a simple model for Bitcoin. To do so, we work in the untimed setting, and we define the set of possible messages to be the set of possible blocks. For each $U \in \mathcal{U}_p$, we then allow $p$ to make a single request of the form $(U, M, A)$ at each timeslot. As mentioned above, $M$ will be a set of blocks that have been received by $p$, or that $p$ is already permitted to broadcast. The entry $A$ will be data (without PoW attached) that specifies a block extending the ‘longest chain’ in $M$. If $A$ specifies a valid block, then the permitter oracle will give permission to broadcast the block specified by $A$ with probability depending on the resource balance of $U$ at time $t$ (which is determined by hashrate, and is independent of $M$). So, the higher $U$’s resource balance at a given timeslot, the greater the probability $p$ will be able to mine a block at that timeslot. Of course, a non-faulty processor $p$ will always submit requests of the form $(U, M, A)$, for which $M$ is $p$’s (entire) message state, and such that $A$ specifies a valid block extending the longest chain in $M$. So, in this simple model, we don’t deal with any notion of a ‘transaction’. It is clear, though, that the model is sufficient to be able to define what it means for blocks to be confirmed, to define notions of liveness (roughly, that the set of confirmed blocks grows over time with high probability) and security (roughly, that with high probability, the set of confirmed blocks is monotonically increasing over time), and to prove liveness and security for the Bitcoin protocol in this model (by importing existing proofs, such as that in (Garay et al., 2018)).
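The simple Bitcoin model just described can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not part of the formal framework: the names `pow_permitter` and `simulate`, the `difficulty` parameter, and the choice of success probability `balance / difficulty` (capped at 1) are all hypothetical. The only properties taken from the text are that each public key makes one request per timeslot, that the probability of a positive response depends on the key's resource balance, and (as assumed here) that a key with zero balance is never granted permission.

```python
import random


def pow_permitter(balance: float, block_is_valid: bool,
                  rng: random.Random, difficulty: float = 100.0) -> bool:
    """Toy permitter oracle: decide whether one request is granted.

    The success probability balance / difficulty is an assumed form,
    chosen only so that it increases with the resource balance.
    """
    if not block_is_valid or balance <= 0:
        return False  # invalid blocks and zero-balance keys get nothing
    probability = min(1.0, balance / difficulty)
    return rng.random() < probability


def simulate(balances: dict, timeslots: int, seed: int = 0) -> dict:
    """Count blocks each key is permitted to broadcast over the run.

    balances maps a public key (any hashable label) to a constant
    resource balance; one request per key per timeslot.
    """
    rng = random.Random(seed)
    mined = {key: 0 for key in balances}
    for _ in range(timeslots):
        for key, balance in balances.items():
            if pow_permitter(balance, True, rng):
                mined[key] += 1
    return mined
```

Running `simulate` with two keys of different balances shows the key with the larger balance being granted permission more often on average, which is all the model needs from proof-of-work; no hash function appears anywhere in the sketch.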

The motivation for considering the timed as well as the untimed setting stems from one of the qualitative differences between PoS and PoW protocols. PoS protocols are best modelled in the timed setting, where processors can look ahead to determine their permission to broadcast at future timeslots (when their resource balance may be different than it is at present), i.e. with PoS protocols, blocks will often have timestamps that cannot be manipulated, and at a given timeslot, a processor may already be able to determine that they have permission to broadcast blocks with a number of different future timestamps. This means that, when modelling PoS protocols, processors have to be able to make requests corresponding to timeslots other than the current timeslot . We will specify further differences between the timed and untimed settings in Section 2.6.

By a permissionless protocol we mean a pair , where is a state transition diagram to be followed by all non-faulty processors, and where is a permitter oracle, i.e. a probabilistic function of the form described for the timed and untimed settings above. It should be noted that the roles of the resource pool and the permitter oracle are different, in the following sense: While the resource pool is a variable (meaning that a given protocol will be expected to function with respect to all possible resource pools consistent with the setting [Footnote 9: Generally, protocols will be considered in a setting that restricts the set of resource pools in certain ways, such as limiting the resource balance of the adversary.]), the permitter is part of the protocol description.

2.3. The adversary and the synchronous and partially synchronous settings

While all non-faulty processors follow the state transition diagram specified for the protocol, we allow a single undetermined processor to display Byzantine faults, and we think of as being controlled by the adversary: In formal terms, the difference between and other processors is that the state transition diagram for might not be . Placing bounds on the power of the adversary means limiting their resource balance (since is infinite, it does not limit the adversary that they control a single processor). For , we say the adversary is -bounded if their total resource balance is always at most a fraction of the total, i.e. for all , .

It is standard in the distributed computing literature (Lynch, 1996) to consider a variety of synchronous, partially synchronous, or asynchronous settings, in which message delivery might be reliable or subject to various forms of failure. We will suppose that the duration is divided into intervals that are labelled either synchronous or asynchronous (meaning that each timeslot is either synchronous or asynchronous). We will suppose that during asynchronous intervals messages can be arbitrarily delayed or not delivered at all. During synchronous intervals, however, we will suppose that messages are always delivered within many timeslots. So if , is broadcast by at , if and is a synchronous interval contained in , then will receive by timeslot . Here is a determined variable.

We then distinguish two synchronicity settings. In the synchronous setting it is assumed that there are no asynchronous intervals during the duration, while in the partially synchronous setting there may be undetermined asynchronous intervals.

It will be useful to consider the notion of a timing rule, by which we mean a partial function mapping tuples of the form to timeslots. We say that an execution follows the timing rule if the following holds for all processors and : We have that receives at iff there exists some and such that broadcasts the message at and . We restrict attention to timing rules which are consistent with the setting. Since protocols will be expected to behave well with respect to all timing rules consistent with the setting, it will sometimes be useful to think of the adversary as also having control over the choice of timing rule.

2.4. The structure of the blockchain

Amongst all broadcast messages, there is a distinguished set referred to as blocks, and one block which is referred to as the genesis block. Unless it is the genesis block, each block has a unique parent block , which must be uniquely specified within the block message. Each block is signed and broadcast by a single key, , but may contain other broadcast messages which have been signed and broadcast by other keys. No block can be broadcast by the processor that controls at a point strictly prior to that at which its parent enters ’s message state (it is convenient to consider the genesis block a member of all message states at all timeslots). is defined to be an ancestor of , and all of the ancestors of are also defined to be ancestors of . If is not the genesis block, then it must have the genesis block as an ancestor. At any point during the duration, the set of broadcast blocks thus forms a tree structure. If is a set of messages, then we say that it is downward closed if it contains the parents of all blocks in . By a leaf of , we mean a block in which is not a parent of any block in . If is a downward closed set of blocks and contains a single leaf, then we say that is a chain.
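The notions of downward closure, leaf and chain can be made concrete with a small sketch. The representation is an assumption made for illustration: blocks are strings, `parent` is a dictionary mapping each non-genesis block to its parent, and the names `GENESIS`, `is_downward_closed`, `leaves` and `is_chain` are hypothetical.

```python
GENESIS = "genesis"


def is_downward_closed(blocks, parent):
    """True if the set contains the parent of each of its non-genesis blocks."""
    return all(b == GENESIS or parent[b] in blocks for b in blocks)


def leaves(blocks, parent):
    """Blocks in the set that are not the parent of any block in the set."""
    parents_in_use = {parent[b] for b in blocks if b != GENESIS}
    return {b for b in blocks if b not in parents_in_use}


def is_chain(blocks, parent):
    """A chain is a downward closed set of blocks with exactly one leaf."""
    return is_downward_closed(blocks, parent) and len(leaves(blocks, parent)) == 1
```

For example, with `parent = {"a": "genesis", "b": "a", "c": "a"}`, the set `{"genesis", "a", "b"}` is a chain, while `{"genesis", "a", "b", "c"}` is downward closed but has two leaves and so is not.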

Generalising the model to DAGs. It is only for the sake of simplicity that we assume each block has a unique parent block. The model is chosen to be a sweet spot: expressive enough to capture many different types of blockchains, but not so cumbersome as to obscure the main issues. Only small modifications are then required to deal with DAGs.

2.5. The extended protocol and the meaning of probabilistic statements

To define what it means for a protocol to be secure or live, we first need a notion of confirmation for blocks. This is a function mapping any message state to a chain that is a subset of that message state, in a manner that depends on the protocol inputs, including a parameter called the security parameter. The intuition behind is that it should upper bound the probability of false confirmation. Given any message state, returns the set of confirmed blocks.

In Section 2.2, we stipulated that a permissionless protocol is a pair . In general, however, a protocol might only be considered to run relative to a specific notion of confirmation . We will refer to the triple as the extended protocol. Often we will suppress explicit mention of , and assume it to be implicitly attached to a given protocol. We will talk about a protocol being live, for example, when it is really the extended protocol to which the definition applies. It is important to understand, however, that the notion of confirmation is separate from , and does not impact the instructions of the protocol. In principle, one can run the same Bitcoin protocol relative to a range of different notions of confirmation. While the set of confirmed blocks might depend on , the instructions of the protocol do not, i.e. with Bitcoin, one can require five blocks for confirmation or ten, but this does not affect the process of building the blockchain.
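To illustrate that the notion of confirmation is separate from the protocol instructions, here is a minimal sketch of a depth-based confirmation rule of the kind used with Bitcoin. Everything named here is an assumption for illustration: `confirmed_blocks`, the parameter `k` (standing in for a number of blocks derived from the security parameter), the lexicographic tie-break between equally long chains, and the requirement that the message state is downward closed and contains the genesis block.

```python
GENESIS = "genesis"


def chain_to(block, parent):
    """Return the chain from the genesis block up to `block`, genesis first."""
    chain = [block]
    while chain[-1] != GENESIS:
        chain.append(parent[chain[-1]])
    chain.reverse()
    return chain


def confirmed_blocks(message_state, parent, k):
    """Hypothetical k-deep rule: drop the last k blocks of the longest chain.

    Assumes message_state is downward closed and contains GENESIS.
    Ties between equally long chains are broken lexicographically.
    """
    blocks = sorted(b for b in message_state if b == GENESIS or b in parent)
    tip = max(blocks, key=lambda b: len(chain_to(b, parent)))
    longest = chain_to(tip, parent)
    # The genesis block is always treated as confirmed.
    return longest[:max(1, len(longest) - k)]
```

Calling `confirmed_blocks` with `k=5` or `k=10` on the same message state changes only which blocks count as confirmed; the block tree itself, and hence the process of building the blockchain, is untouched.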

For a given permissionless protocol, another way to completely specify an execution (beyond that described in Section 2.1) is via the following breakdown:

  1. The determined variables (such as and );

  2. The set of processors and their public keys;

  3. The state transition diagram for the adversary ;

  4. The resource pool (which may or may not be undetermined);

  5. The timing rule;

  6. The probabilistic responses of the permitter.

With respect to the extended protocol , we call a particular set of choices for items 1 to 5 above a protocol instance. Generally, when we discuss an extended protocol, we do so within the context of a setting, which constrains the set of possible protocol instances. The setting might restrict the set of resource pools to those in which the adversary is given a limited resource balance, for example. When we make a probabilistic statement to the effect that a certain condition holds with at most/least a certain probability, this means that the probabilistic bound holds for all protocol instances consistent with the setting. Where convenient, we may also refer to the pair as the extended protocol, where .

2.6. Defining the timed, sized and multi-permitter settings

In Section 2.2, we gave an example to show how the framework of (Lewis-Pye and Roughgarden, 2021) can be used to model a PoW protocol like Bitcoin. In that context the resource pool is a function , which is best modelled as undetermined, because one does not know in advance how the hashrate of each public key (or even the total hashrate) will vary over time. The first major difference for a PoS protocol is that the resource balance of each public key now depends on the message state (as is also the case for some proof-of-space protocols, depending on the implementation), and may also be a function of time. [Footnote 10: It is standard practice in PoS blockchain protocols to require a participant to have a currency balance that has been recorded in the blockchain for at least a certain minimum amount of time before they can produce new blocks, for example. So, a given participant may not be permitted to extend a given chain of blocks at timeslot , but may be permitted to extend the same chain at a later timeslot .] So the resource pool is a function . A second difference is that is determined, because one knows from the start how the resource balance of each participant depends on the message state as a function of time. Note that advance knowledge of does not mean that one knows from the start which processors will have large resource balances throughout the execution, unless one knows which messages will be broadcast. A third difference, to which we have already alluded, is that PoS protocols are best modelled in the timed setting. A fourth difference is that PoW protocols are best modelled by allowing a single request to the oracle for each public key at each timeslot, while this is not necessarily true of PoS protocols.

In (Lewis-Pye and Roughgarden, 2021), the sized/unsized, timed/untimed, and single/multi-permitter settings were defined to succinctly capture these differences. The idea is that all permissionless protocols run relative to a resource pool and the difference between PoW and PoS and other permissionless protocols is whether we are working in the sized/unsized, timed/untimed, and single/multi-permitter settings. If one then comes to consider a new form of protocol, such as proof-of-space, theorems that have been proved for all protocols in the unsized setting (for example) will still apply, so long as these new protocols are appropriately modelled in that setting. So the point of this approach is that, by blackboxing the precise mechanics of the processor selection process (whereby processors are selected to do things like broadcast new blocks of transactions), we are able to focus instead on properties of the selection process that are relevant for protocol design. This allows for the development of a general theory that succinctly describes the relevant merits of different forms of protocol. The sized/unsized, timed/untimed, and single/multi-permitter settings are defined below.

  1. The timed and untimed settings. There are two differences between the timed and untimed settings. The first concerns the form of requests, as detailed in Section 2.2. We also require that the following holds in the timed setting: For each broadcast message , there exists a unique timeslot such that permission to broadcast was given in response to some request , and is computable from . We call the timestamp of .

  2. The sized and unsized settings. We call the setting sized if the resource balance is determined. By the total resource balance we mean the function defined by . For the unsized setting, and are undetermined, with the only restrictions being:

    1. only takes values in a determined interval , where (meaning that, although and are determined, protocols will be required to function for all possible and , and for all undetermined consistent with , subject to (ii) below). [Footnote 11: We consider resource pools with range restricted in this way, because it turns out to be an overly strong condition to require a protocol to function without any further conditions on the resource pool, beyond the fact that it is a function to . Bitcoin will certainly fail if the total resource balance decreases sufficiently quickly over time, or if it increases too quickly, causing blocks to be produced too quickly compared to .]

    2. There may also be bounds placed on the resource balance of public keys owned by the adversary.

  3. The multi-permitter and single-permitter settings. In the single-permitter setting, each processor may submit a single request of the form or (depending on whether we are in the timed setting or not) for each at each timeslot, and it is allowed that . In the multi-permitter setting, processors can submit any number of requests for each key at each timeslot, but they must all satisfy the condition that .

PoW protocols will generally be best modelled in the untimed, unsized and single-permitter settings. They are best modelled in the untimed setting, because a processor’s probability of being granted permission to broadcast a block at timeslot (even if that block has a different timestamp) depends on their resource balance at , rather than at any other timeslot. They are best modelled in the unsized setting, because one does not know in advance of the protocol execution the amount of mining which will take place at a given timeslot in the future. They are best modelled in the single-permitter setting, so long as permission to broadcast is block-specific.

PoS protocols are generally best modelled in the timed, sized and multi-permitter settings. They are best modelled in the timed setting, because blocks will generally have non-manipulable timestamps, and because a processor’s ability to broadcast a block may be determined at a timestamp even though the probability of success depends on their resource balance at other than . They are best modelled in the sized setting, because the resource pool is known from the start of the protocol execution. They are best modelled in the multi-permitter setting, so long as permission to broadcast is not block-specific, i.e. when permission is granted, it is to broadcast a range of permissible blocks at a given position in the blockchain.

All of this means that it will generally be straightforward to classify protocols with respect to the theorems from this paper that apply to them. Since Bitcoin and Prism (Bagaria et al., 2019) are PoW protocols, for example, Theorem 5.1 applies to those protocols. Since Snow White, Ouroboros (Kiayias et al., 2017) and Algorand are PoS protocols, Theorems 3.3 and 5.6 apply to those protocols. Note that there are a large number of protocols, such as Tendermint (Buchman, 2016) and Hotstuff (Yin et al., 2019), which are formally described as permissioned protocols, but which can be implemented as PoS protocols so that Theorems 3.3 and 5.6 will then apply.

2.7. Defining liveness

There are a number of papers that successfully describe liveness and security notions for blockchain protocols (Garay et al., 2018; Pass et al., 2016). Our interest here is in identifying the simplest definitions that suffice to express our later results. To this end, it will be convenient to give a definition of liveness that is more fine-grained than previous definitions, in the sense that it allows us to separate out the security parameter and the number of timeslots in the duration (in previous accounts the number of timeslots in the duration is a function of the security parameter). Consider a protocol with a notion of confirmation , and let denote the number of blocks in for any message state . For timeslots , let be the maximum value for any which is a message state of any processor at any timeslot , and let be the minimum value for any which is a message state of any processor at timeslot . We say that is a growth interval if . For any duration , let be the number of timeslots in . For which takes values in depending on and , let us say that is sublinear in if, for each and each , for all sufficiently large values of (the motivation for considering sublinearity will be described shortly).

Definition 2.1.

A protocol is live if, for every choice of security parameter and duration , there exists , which is sublinear in , and such that for each pair of timeslots the following holds with probability at least : If and is entirely synchronous, then is a growth interval.

So, roughly speaking, a protocol is live if the number of confirmed blocks can be relied on to grow during synchronous intervals of sufficient length. The reason we require to be sublinear in is so that the number of confirmed blocks likely increases with sufficient increase in synchronous duration. For example, a protocol that confirms a block with probability only at each timeslot should not be considered live. Note also, that while Definition 2.1 only refers explicitly to protocols, it is really the extended protocol to which the definition applies. The following stronger notion will also be useful.

Definition 2.2.

A protocol is uniformly live if, for every choice of security parameter and duration , there exists , which is sublinear in , and such that the following holds with probability at least : For all pairs of timeslots , if and is entirely synchronous, then is a growth interval.

The difference between being live and uniformly live is that the latter definition requires that, with probability at least , all appropriate intervals are growth intervals. The former definition only requires the probabilistic bound to hold for each interval individually. The reader’s immediate reaction might be that it should follow from the Union Bound that Definitions 2.1 and 2.2 are essentially equivalent. This is not so. Firstly, this is because the protocol and notion of confirmation take the security parameter as input. Nevertheless, one might think that if a protocol is live then a ‘recalibration’, which takes some appropriate transformation of the security parameter as input, should necessarily be uniformly live. This does not follow (in part) because there is no guarantee that the resulting will be sublinear in – see Section 4 for a detailed analysis.
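The following calculation sketches where the naive Union Bound argument runs into trouble. The notation is assumed for illustration, since it is not fixed by the surrounding text: $\varepsilon$ stands for the security parameter, $|D|$ for the number of timeslots in the duration, $E_{t_1,t_2}$ for the event that the liveness condition fails for the pair $(t_1,t_2)$, and $\varepsilon'$ for a hypothetical target bound on the probability that any pair fails.

```latex
% Assumed: each individual pair fails with probability at most \varepsilon.
\Pr\Big[\,\bigcup_{t_1<t_2} E_{t_1,t_2}\Big]
  \;\le\; \sum_{t_1<t_2} \Pr\big[E_{t_1,t_2}\big]
  \;\le\; \binom{|D|}{2}\,\varepsilon .
% To push the right-hand side below \varepsilon', one would have to run the
% protocol with the smaller parameter
\varepsilon \;=\; \varepsilon' \Big/ \binom{|D|}{2} ,
% which shrinks as the duration |D| grows.
```

Because the parameter fed to the protocol now shrinks with the duration, the interval length needed to guarantee growth may itself increase with the duration, and nothing in the definition of liveness forces that increase to be sublinear.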

2.8. Defining security

Roughly speaking, security requires that confirmed blocks normally belong to the same chain. Let us say that two distinct blocks are incompatible if neither is an ancestor of the other, and are compatible otherwise. Suppose that, for some processor , the message state at is . If , then we say that is confirmed for at .
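Compatibility of blocks is straightforward to check once the parent relation is known. The sketch below assumes, for illustration only, that blocks are strings and that `parent` is a dictionary mapping each non-genesis block to its parent; the names `ancestors` and `compatible` are hypothetical, and treating a block as compatible with itself is a convention adopted here.

```python
def ancestors(block, parent):
    """All ancestors of `block`: its parent, its parent's parent, and so on."""
    found = set()
    while block in parent:
        block = parent[block]
        found.add(block)
    return found


def compatible(b1, b2, parent):
    """Two blocks are compatible if one is an ancestor of the other.

    Convention assumed here: a block is compatible with itself.
    """
    return b1 == b2 or b1 in ancestors(b2, parent) or b2 in ancestors(b1, parent)
```

With `parent = {"a": "genesis", "b": "a", "c": "a"}`, blocks `"a"` and `"b"` are compatible, while the siblings `"b"` and `"c"` are incompatible.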

Definition 2.3 (Security).

A protocol is secure if the following holds for every choice of security parameter , for every and for all timeslots in the duration: With probability , all blocks which are confirmed for at are compatible with all those which are confirmed for at .

The following stronger notion will also be useful.

Definition 2.4 (Uniform Security).

A protocol is uniformly secure if the following holds for every choice of security parameter : With probability , there do not exist incompatible blocks , timeslots and such that is confirmed for at for .

The difference between security and uniform security is that the latter requires the probability of even a single disagreement to be bounded, while the former only bounds the probability of disagreement for each pair of processors at each timeslot pair. Just as for liveness and uniform liveness, it does not follow from the Union Bound that security is essentially equivalent to uniform security. In Section 4 we will perform a detailed analysis of the relationship between these notions.

3. Certificates in the partially synchronous setting

The definitions of this and subsequent sections are all new to this paper, unless explicitly stated otherwise. The rough idea is that ‘certificates’ should be proofs of confirmation. Towards formalising this idea, let us first consider a version which is too weak.

Definition 3.1.

If then we refer to as a subjective certificate for .

We will say that a set of messages is broadcast if every member is broadcast, and that is broadcast by timeslot if every member of is broadcast at a timeslot (different members potentially being broadcast at different timeslots). If is a subjective certificate for , then there might exist for which . So the fact that is broadcast does not constitute proof that is confirmed with respect to any processor. When do we get harder forms of proof than subjective certificates? Definition 3.2 below gives a natural and very simple way of formalising this.

Definition 3.2.

We say that a protocol with a notion of confirmation produces certificates if the following holds with probability when the protocol is run with security parameter : There do not exist incompatible blocks , a timeslot and which are broadcast by , such that for .

It is important to stress that, in the definition above, the ’s are not necessarily the message states of any processor, but are rather arbitrary subsets of the set of all broadcast messages. The basic idea is that, if a protocol produces certificates, then subjective certificates constitute proof of confirmation. Algorand is an example of a protocol which produces certificates: The protocol is designed so that it is unlikely that two incompatible blocks will be produced at any point in the duration together with appropriate committee signatures verifying confirmation for each.

Our next aim is to show that, in the partially synchronous setting, producing certificates is equivalent to security. In fact, producing certificates is clearly at least as strong as uniform security, so it suffices to show that if a protocol is secure then it must produce certificates.

Theorem 3.3.

If a protocol is secure in the partially synchronous setting then it produces certificates.

Proof.

Towards a contradiction, suppose that the protocol with notion of confirmation is secure in the partially synchronous setting, but that there exists a protocol instance [Footnote 12: See Section 2.5 for the definition of a protocol instance.] with security parameter , such that the following holds with probability : There exist incompatible blocks , a timeslot and which are broadcast by , such that for . This means that the following holds with probability for , which is the last timeslot in the duration: There exist incompatible blocks and which are broadcast by , such that for . Consider the protocol instance which has the same values for determined variables as , the same state transition diagram for the processor of the adversary and the same set of processors with the same set of public keys, except that now there are two extra processors and . Suppose that the resource pool for is the same as that for when restricted to public keys other than those in and , and that all keys in and have zero resource balance throughout the duration. Suppose further, that the timing rule for is the same as that for when restricted to tuples such that and , but that now all timeslots are asynchronous. According to the definition of Section 2.2, and since all keys in and have zero resource balance throughout the duration, it follows by induction on timeslots that the probability distribution on the set of broadcast messages is the same at each timeslot for as for , independent of which messages are received by and . It therefore holds for the protocol instance that with probability there exist incompatible blocks , and which are broadcast by , such that for . Now suppose that and do not receive any messages until , and then receive the message sets and (if they exist) respectively. This suffices to demonstrate that the definition of security is violated with respect to , , and . ∎

Corollary 3.4.

Security and uniform security are equivalent in the partially synchronous setting.

Proof.

This follows from Theorem 3.3 and the fact that producing certificates clearly implies uniform security. ∎

4. Security and uniform security in the synchronous setting

Having dealt with the partially synchronous setting, our next task is to consider the synchronous setting. To do so, however, we first need to formalise the notion of a recalibration.

4.1. Defining recalibrations

Theorem 3.3 seems to tie things up rather neatly for the partially synchronous setting. In particular, the equivalence of security and uniform security meant that we were spared having to carry out a separate analysis for each security notion. It is not difficult to see, however, that the two security notions will not be equivalent in the synchronous setting. To see this, we can consider the example of Bitcoin. Suppose first that we operate in the standard way for Bitcoin, and use a notion of confirmation that depends only on the security parameter , and not on the duration , so that the number of blocks required for confirmation is just a function of . In this case, the protocol is secure in the synchronous setting (Garay et al., 2018). It is also clear, however, that this protocol will not be uniformly secure in a setting where the adversary controls a non-zero amount of mining power: If a fixed number of blocks are required for confirmation then, given enough time, the adversary will eventually complete a double spend (i.e. the adversary will double spend with probability tending to 1 as the number of timeslots tends to infinity). That said, it is also not difficult to see how one might ‘recalibrate’ the protocol to deal with different durations – to make the protocol uniformly secure, the number of blocks required for confirmation should be a function of both and .
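The claim that a fixed confirmation depth cannot give uniform security over long durations can be illustrated numerically. The sketch below makes a strong simplifying assumption that is not part of the model above: the adversary gets a number of independent attempts at a double spend, each succeeding with the same small probability `q`. The names `prob_some_success` and `q` are hypothetical.

```python
def prob_some_success(q, attempts):
    """Probability that at least one of `attempts` independent tries succeeds.

    Assumption: each try succeeds independently with probability q, where q
    is fixed by the confirmation depth and does not depend on the duration.
    """
    return 1.0 - (1.0 - q) ** attempts


# For a fixed q, the probability of at least one success approaches 1 as the
# number of attempts (and hence the duration) grows.
for attempts in (10, 1_000, 100_000):
    print(attempts, prob_some_success(0.001, attempts))
```

Under this simplification, keeping the overall failure probability bounded as the duration grows requires `q` to shrink with the number of attempts, which is the intuition behind making the number of blocks required for confirmation depend on the duration as well as on the security parameter.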

The point of this subsection is to formalise the idea of recalibration and to show that, if a protocol is secure, then (under fairly weak conditions) a recalibration will be uniformly secure. The basic idea is very simple – one runs the initial (unrecalibrated) protocol for smaller values of as the duration increases, but one has to be careful that the resulting is sublinear in .

Definition 4.1.

We say is a recalibration of the extended protocol if running given certain inputs means running for a computable transformation of those inputs, and then terminating after many steps are complete.

So, if running with security parameter and for many timeslots means running with input parameters that specify a security parameter and that specify a duration consisting of many timeslots, and then terminating after many timeslots have been completed, then is a recalibration of . [Footnote 13: The choices and are arbitrarily chosen for the purpose of example.] The reader might wonder why one should specify a duration of timeslots and then terminate after many. This is because the instructions of the first timesteps can depend on the intended duration. In Algorand, committee sizes will depend on the intended duration, for example. Note also, that we allow the recalibration to use a different notion of confirmation.

In the following, we say that is independent of if for all and all . When is independent of , we will often write for .

Definition 4.2.

In the bounded user setting we assume that there is a finite upper bound on the number of processors, which holds for all protocol instances. [Footnote 14: Note that the requirement here is that the number of processors is bounded, rather than the number of public keys.]

Proposition 4.3.

Consider the synchronous and bounded user setting. Suppose satisfies liveness with respect to , that is independent of , and that for each , for all sufficiently small . If is secure, there exists a recalibration of that is uniformly live and uniformly secure.

The conditions on in the statement of Proposition 4.3 can reasonably be regarded as weak, because existing protocols which are not already uniformly secure will normally satisfy two conditions: first, is independent of ; and second, for some constant and any , we have . The example of Bitcoin might be useful for the purposes of illustration here. Bitcoin is secure in the synchronous setting, and the number of blocks required for confirmation is normally considered to be independent of the duration. The number of blocks required for confirmation does depend on how sure one needs to be that an adversary cannot double spend in any given time interval, but it is also true that an adversary’s chance of double spending in a given time interval decreases exponentially in the number of blocks required for confirmation. So Bitcoin is an example of a protocol satisfying both conditions above.

Proof of Proposition 4.3.

It is useful to consider a security notion that is intermediate between security and uniform security. For the purposes of the following definition, we say that a block is confirmed at timeslot if there exists at least one processor for whom that is the case.

Definition 4.4 (Timeslot Security).

A protocol is timeslot secure if the following holds for every choice of security parameter , and for all timeslots in the duration: With probability , all blocks which are confirmed at are compatible with all blocks which are confirmed at .

So the difference between timeslot security and uniform security is that the latter requires the probability of even a single disagreement to be bounded, while the former only bounds the probability of disagreement for each pair of timeslots. Similarly, the difference between security and timeslot security is that, for each pair of timeslots, the latter requires the probability of even a single disagreement to be bounded, while the former only bounds the probability of disagreement for each pair of processors at that timeslot pair.

Now suppose is live and secure, and that the conditions of Proposition 4.3 hold. Then it follows directly from the Union Bound that, if the number of users is bounded, then some recalibration of is live and timeslot secure and satisfies the conditions of Proposition 4.3. Since a recalibration of a recalibration of is a recalibration of , our main task is therefore to show that, if is live and timeslot secure and the conditions of Proposition 4.3 hold, then there exists a recalibration of that is uniformly live and uniformly secure.

So suppose is live and timeslot secure, and that the conditions of Proposition 4.3 hold. Suppose we are given and as inputs to our recalibration . We wish to find an appropriate security parameter and a duration to give as inputs to and , so that uniform security is satisfied with respect to and if we run with inputs and and then terminate after many timeslots. The difficulty is to ensure that remains sublinear in . To achieve this, let , set and choose , so that is the first timeslots in . This defines the recalibration. It remains to establish uniform liveness and uniform security.

For uniform liveness we must have that, for each , for all sufficiently large values of – if this condition holds then it follows from the Union Bound that our recalibration will satisfy uniform liveness (and the required sublinearity in ) with respect to . The condition holds since we are given that for each , for all sufficiently small . Suppose given , and put . Then we have that, for all sufficiently large :

Next we must show that the conditions for uniform security are satisfied. Suppose is given inputs and and is actually run for many timeslots. We aim to show that, with probability , there do not exist incompatible blocks , timeslots and such that is confirmed for at for . Let be the last timeslot of the duration and define . The basic idea is that the two following conditions hold with high probability: (a) is a growth interval, and (b) There does not exist , processors and incompatible blocks , such that is confirmed for at and is confirmed for at . When both these conditions hold, and since , this suffices to show that no incompatible and confirmed blocks exist during the duration . Now let us see that in more detail.

By the choice of , . It follows from the definition of liveness that below fails to hold with probability :

  1. is a growth interval.

Note that, so long as holds, every user has more confirmed blocks at than any user does at any timeslot in . It also follows from the Union Bound, and the definition of liveness and timeslot security, that below fails to hold with probability :

  1. There does not exist , processors and incompatible blocks , such that is confirmed for at and is confirmed for at .

Now note that:

  1. If and both hold, then there do not exist incompatible blocks , timeslots and such that is confirmed for at for .

  2. With probability , and both hold.

So uniform security is satisfied with respect to and , as required. ∎

Definition 4.5 ().

We say has standard functionality if it is uniformly live and uniformly secure. We say that a recalibration of is faithful if it has standard functionality when does.

Proposition 4.3 justifies concentrating on protocols which have standard functionality where it is convenient to do so, since protocols which are live and secure will have recalibrations with standard functionality, so long as the rather weak conditions of Proposition 4.3 are satisfied. Again, when we talk about the security and liveness of a protocol, it is really the extended protocol that we are referring to.

5. Certificates in the synchronous setting

5.1. The synchronous and unsized setting

As outlined in the introduction, part of the aim of this paper is to give a positive answer to Q3, by showing that whether a protocol produces certificates comes down essentially to properties of the processor selection process. In the unsized setting protocols cannot produce certificates. In the sized setting, recalibrated protocols will automatically produce certificates, at least if they are of ‘standard form’. For the partially synchronous setting, the results of (Lewis-Pye and Roughgarden, 2021) and Section 3 already bear this out: The sized setting is required for security and all secure protocols must produce certificates. The following theorem now deals with the unsized and synchronous setting. Recall that, in the unsized setting, the total resource balance belongs to a determined interval . We say that the protocol operates ‘in the presence of a non-trivial adversary’ if the setting allows that the adversary may have resource balance at least throughout the duration.

Theorem 5.1 ().

Consider the synchronous and unsized setting. If a protocol is live then, in the presence of a non-trivial adversary, it does not produce certificates.

Proof.

The basic idea is that the adversary with resource balance at least can ‘simulate’ their own execution of the protocol, in which only they have non-zero resource balance, while the non-faulty processors carry out an execution in which the adversary does not participate. Simulating their own execution means that the adversary carries out the protocol as usual, while ignoring messages broadcast by the non-faulty processors, but does not initially broadcast messages when given permission to do so. Liveness (together with the fact that the resource pool is undetermined) guarantees that, with high probability, both the actual and simulated executions produce blocks which look confirmed from their own perspective. These blocks will be incompatible with each other and, once the adversary finally broadcasts the messages that they have been given permission for, these blocks will all have subjective certificates which are subsets of the set of broadcast messages. This suffices to show that the protocol does not produce certificates.

More precisely, we consider two instances of the protocol and in the synchronous and unsized setting, which have the same values for all determined variables – including the same sufficiently small security parameter and the same sufficiently long duration – and also have the same set of processors and the same message delivery rule, but which differ as follows:

  • In , a set of processors control public keys in a set , which are the only public keys that do not have zero resource balance throughout the duration. The total resource balance has a fixed value, say.

  • In , it is the adversary who controls the public keys in , and those keys have the same resource balance throughout the duration as they do in . Now, however, another set of processors control public keys in a set (disjoint from ), and the public keys in also have total resource balance throughout the duration, i.e. the resource balances of these keys always add to .

In , we suppose that the adversary simulates the processors in for (which can be done with the single processor ), which means that the adversary carries out the instructions for those processors, with the two following exceptions. Until a certain timeslot , to be detailed subsequently, they:

  1. Ignore all messages broadcast by non-faulty processors, and;

  2. Do not actually broadcast messages when permitted, but consider them received by simulated processors in as per the message delivery rule.

For (so long as the duration is sufficiently long), liveness guarantees the existence of a timeslot for which the following holds with probability :

  1. At there exists a set of broadcast messages and a block such that .

For , liveness guarantees the existence of a timeslot for which the following holds with probability :

  1. At there exists a set of broadcast messages and a block such that .

Choose . Our framework stipulates that the instructions of the protocol for a given user at a given timeslot are a deterministic function of their present state and the message set and permission set received at that timeslot. It also stipulates that the response of the permitter to a request is a probabilistic function of the determined variables, , and of . Since we are working in the unsized setting, and have the same determined variables. It therefore follows by induction on timeslots , that the following is true at all points until the end of timeslot :

  1. The probability distribution for on the set of permission sets given by the permitter is identical to the probability distribution for on the set of permission sets given by the permitter to the adversary.

Now suppose that at timeslot the adversary broadcasts all messages for which they have been given permission by the permitter. Note that, according to the assumptions of Section 2.4, any block broadcast by the adversary at will be incompatible with any block that has been broadcast by any honest user up to that point. Combining , and , we see that (so long as is sufficiently small that ) the following holds with probability for and : There exist incompatible blocks , and which are broadcast by the end of , such that for . This suffices to show that the protocol does not produce certificates. ∎
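The simulation argument can be illustrated with a toy model. The sketch below is not the formal framework of this paper: the function `run_permitter`, the per-timeslot success probability, the seeds and the `K_DEEP` confirmation rule are all assumptions made purely for illustration. It shows only the key point of the proof, namely that when the permitter's responses depend on the requester's own balance and not on the total balance, a privately simulated fork is distributed exactly as an execution in which those keys are the only ones with resources.

```python
import random

K_DEEP = 6  # hypothetical rule: a block looks confirmed once K_DEEP blocks follow it


def run_permitter(balance, seed, timeslots):
    """Toy unsized permitter: in each timeslot, permission to extend one's own
    chain is granted with probability `balance`. The total balance of all
    participants is never consulted, which is the feature being modelled."""
    rng = random.Random(seed)
    return [t for t in range(timeslots) if rng.random() < balance]


def confirmed(chain):
    """Blocks of `chain` that have at least K_DEEP successors."""
    return chain[:-K_DEEP] if len(chain) > K_DEEP else []


T = 200
# First instance: the keys are held by honest processors and nobody else has resources.
honest_alone = run_permitter(0.3, seed=1, timeslots=T)
# Second instance: the adversary holds the same keys and simulates privately.
# The permitter sees the same requests and the same balance, so with the same
# randomness the outcome is identical.
adversary_private = run_permitter(0.3, seed=1, timeslots=T)
# The second instance also contains honest processors with their own keys.
honest_public = run_permitter(0.3, seed=2, timeslots=T)

# Tag blocks by fork: blocks on different forks are incompatible.
fork_adv = [("adv", t) for t in adversary_private]
fork_hon = [("hon", t) for t in honest_public]

print(adversary_private == honest_alone)
print(len(confirmed(fork_adv)), len(confirmed(fork_hon)))
```

Once the adversary releases its fork, each fork contains blocks that look confirmed when that fork's messages are considered alone, yet the two forks are incompatible. In this toy model, no fixed set of broadcast messages can therefore serve as a certificate.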

5.2. The synchronous and sized setting

The example of sized Bitcoin. Our aim in this subsection is to show that, if we work in the synchronous and sized setting, and if a protocol is of ‘standard form’, then a recalibration will produce certificates. To make this precise, however, it will be necessary to recognise the potentially time dependent nature of proofs of confirmation. To explain this idea, it is instructive to consider the example of Bitcoin in the sized setting: The protocol is Bitcoin, but now we are told in advance precisely how the hash rate capability of the network varies over time, as well as bounds on the hash rate of the adversary. [Footnote 15: Normally we think of PoW protocols as operating in the unsized setting, precisely because such guarantees on the hash rate are not realistic.] To make things concrete, let us suppose that the total hash rate is fixed over time, and that the adversary has 10% of the hash rate at all times. Suppose that, during the first couple of hours of running the protocol, the difficulty setting is such that the network as a whole (with the adversary acting honestly) will produce an expected one block every 10 minutes. Suppose further that, after a couple of hours, we see a block B which belongs to a chain C, in which it is followed by 10 blocks. In this case, the constraints we have been given mean that it is very unlikely that B does not belong to the longest chain. So, at that timeslot, C might be considered a proof of confirmation for B, i.e. the existence of the chain C can be taken as proof that B is confirmed. The nature of this proof is time dependent, however. The same set of blocks (i.e. C) a large number of timeslots later would not constitute proof of confirmation.
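To get a rough feel for "very unlikely" in this example, one can borrow the catch-up calculation from the Bitcoin white paper (Nakamoto et al., 2008). This is a different and simpler analysis from the one developed in this paper, and the function name `catch_up_probability` is our own. It is offered only as a sanity check for an adversary with 10% of the hash rate and a block followed by 10 others.

```python
import math


def catch_up_probability(q, z):
    """Probability that an attacker holding hash-rate fraction q ever overtakes
    a chain in which the target block has z successors, following the
    Poisson-based calculation in the Bitcoin white paper."""
    p = 1.0 - q
    lam = z * q / p  # expected attacker progress while honest users mine z blocks
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


# Adversary with 10% of the hash rate, block followed by 10 others.
print(catch_up_probability(0.10, 10))
```

The result is on the order of one in a million. The calculation assumes the attacker starts its private chain when the block is mined, so it does not model the time dependence described above: given a large number of additional timeslots, even a 10% adversary could assemble the same 10 blocks on its own.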

If we now consider a PoS version of the example above, modified to work for Snow White rather than Bitcoin, then the proof produced will not be time dependent. This is because PoS protocols function in the timed setting, i.e. when permission is given to broadcast in response to a request , other users are able to determine from . In order to prove that (recalibrated) protocols in the sized setting produce certificates, we will have to make the assumption that we are also working in the timed setting.

Protocols in standard form. The basic intuition behind the production of certificates in the sized setting can be seen from the example of “Sized Bitcoin” above. Once a block is confirmed, non-faulty processors will work ‘above’ this block. So long as those processors possess a majority of the total resource balance, and so long as the permitter reflects this fact in the permissions it gives, then those non-faulty processors will broadcast a set of messages which suffices (by its existence rather than the fact that it is the full message state of any user) to give proof of confirmation. This proof of confirmation might be temporary, but it will not be temporary in the timed setting.

This intuitive argument, however, assumes that the protocol satisfies certain standard properties. As alluded to above, there is an assumption that the set of messages broadcast by a group of processors will reflect their resource balances and that the adversary will have a minority resource balance. There is also an assumption that broadcast messages will (in some sense) point to a particular position on the blockchain. So we will have to formalise these ideas, and the results we prove will only hold modulo the assumption that these standard properties are satisfied.

First, let us formalise the idea that messages always point to a position on the blockchain.

Definition 5.2 ().

We say that a protocol is in standard form if it satisfies all of the following:

  • The protocol has standard functionality (see Definition 4.5).

  • Every broadcast message is ‘attached’ to a specific block (blocks being attached to themselves).

  • While is confirmed for , the state transition diagram will only instruct to broadcast messages which are attached to or descendants of .
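The second and third conditions can be pictured with a small data structure. The following is a minimal sketch under our own assumptions: the class names `Block` and `Message` and the helper `respects_confirmed_block` are hypothetical and do not appear in the formal framework. Each message records the block it is attached to, and a batch of outgoing messages is checked against a block that its sender regards as confirmed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Block:
    name: str
    parent: Optional["Block"] = None  # None marks the genesis block


@dataclass(frozen=True)
class Message:
    payload: str
    attached_to: Block  # every broadcast message is attached to a specific block


def is_descendant_or_self(block, ancestor):
    """Walk parent pointers from `block` up to genesis, looking for `ancestor`."""
    while block is not None:
        if block == ancestor:
            return True
        block = block.parent
    return False


def respects_confirmed_block(outgoing, confirmed):
    """True if every outgoing message is attached to `confirmed` or to one of
    its descendants, as the third condition requires."""
    return all(is_descendant_or_self(m.attached_to, confirmed) for m in outgoing)


genesis = Block("genesis")
b1 = Block("b1", genesis)
b2 = Block("b2", b1)
fork = Block("fork", genesis)  # sibling of b1, hence incompatible with it

print(respects_confirmed_block([Message("vote", b2), Message("vote", b1)], b1))
print(respects_confirmed_block([Message("vote", fork)], b1))
```

The first check succeeds and the second fails, since `fork` does not descend from `b1`.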

Reflecting the resource pool. Now let us try to describe how the permitter might reflect the resource pool. We will need a simple way to say that one set of processors consistently has a higher resource balance than another.

Definition 5.3 ().

For , we say a set of public keys dominates another set , denoted , if the following holds for all sets of broadcast messages and all timeslots :

Next, we will need to formalise the idea that, if one set of keys dominates another, then they will be able to broadcast discernibly different sets of messages. Recall that, in the timed setting, each message corresponds to a timeslot , which can be determined from . We write to denote the set . We will say that the set of keys is directed to broadcast if, for every , there is some member of that is given permission to broadcast and is directed to broadcast by the protocol. We will say that is able to broadcast if, for every , there is some member of that is given permission to broadcast . We define . We let be the set of functions (so that the total resource balance ). We say that a set of keys has total resource balance if . In the definition below, we say is sublinear in if,  for each , and for every , it holds that for all sufficiently large .

Definition 5.4 ().

We say that reflects the resource pool if there exist computable finite valued functions and , such that:

  1. is sublinear in .

  2. If has total resource balance , and if , then, when the protocol is run with security parameter and for many timeslots, the following holds with probability : For all intervals of timeslots with , there exists some element of which is directed to broadcast, while is not able to broadcast any element of .

So in Definition 5.4, specifies a number of timeslots. Then specifies certain sets of messages such that, if and has total resource balance , then can be expected to broadcast one of these sets in any interval of sufficient length (i.e. the length specified by ). To make this interesting, we also have that can be expected not to make such broadcasts. To see why this is a natural and reasonable condition to assume, it is instructive to consider the example of Sized Bitcoin. Suppose that in some execution the honest users always have at least 60% of the mining power. Then, over any long period of time, we can be fairly sure that honest users will get to make at least 50% of the expected number of block broadcasts, while the adversary is unlikely to be able to make such broadcasts if the period is long enough. In fact, the exponentially fast convergence for the law of large numbers guaranteed by bounds like Hoeffding’s inequality means that the length of the period only needs to grow with log(1/ε), where ε is the probability of error (i.e. the probability that these conditions on the block broadcasts do not hold in a given interval). It is therefore not difficult to see that Sized Bitcoin would reflect the resource pool if it could be implemented in a timed setting. Similar arguments can be made for all well known PoS protocols, and these are implemented in the timed setting. [Footnote 16: The example of Snow White was discussed previously. As suggested in Section 1, one way to define in the context of Snow White is to consider long chains of sufficient density, meaning that they have members corresponding to most possible timeslots, that they cannot likely be produced by a (sufficiently bounded) adversary.]
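The logarithmic growth promised by Hoeffding's inequality can be made concrete with a short calculation. The sketch below rests on a simplifying assumption that is ours, not the paper's: each of n block opportunities is won independently, by honest users with probability 0.6 and by the adversary otherwise, and the threshold sits at one half, so each side has a gap of 0.1 between its mean share and the threshold. The function names are hypothetical.

```python
import math


def hoeffding_failure_bound(n, gap):
    """Upper bound on the probability that either side's share of n independent
    block opportunities crosses a threshold `gap` away from its mean (two
    one-sided Hoeffding bounds combined by the union bound)."""
    return 2.0 * math.exp(-2.0 * n * gap ** 2)


def opportunities_needed(eps, gap):
    """Smallest n for which the bound above is at most eps."""
    return math.ceil(math.log(2.0 / eps) / (2.0 * gap ** 2))


# Honest share 0.6, adversarial share 0.4, threshold 0.5: gap of 0.1 on each side.
for eps in (1e-3, 1e-6, 1e-9):
    print(eps, opportunities_needed(eps, gap=0.1))
```

Each thousandfold reduction in the error probability adds the same fixed number of opportunities, which is the sense in which the interval length need only grow logarithmically in the reciprocal of the error probability.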

Definition 5.5 ().

In the bounded adversary setting it is assumed that:

  1. for some determined input parameter , where is the set of keys controlled by non-faulty processors, and is the set of keys controlled by the adversary.

  2. reflects the resource pool.

Finally, we can now formalise the idea that under standard conditions, standard protocols in the sized setting produce certificates.

Theorem 5.6 ().

Consider the timed, bounded adversary and sized setting. If is in standard form, then there exists a faithful recalibration that produces certificates.

Proof.

To define our recalibration , suppose we are given values for and . We need to specify a value to give as input to (we will leave other values unchanged), and we must also define . Then we need to show that the new extended protocol is uniformly live and produces certificates.

We define . Towards defining , suppose that satisfies uniform liveness with respect to . We divide the duration into intervals of length , by defining . From the definition of uniform liveness we have the following.

  1. With probability it holds that, for all with , all users have at least many confirmed blocks by the end of timeslot .

Now suppose satisfies Definition 5.4 with respect to and . For each , define . Let be the interval , and write to denote . Let be the set of keys controlled by non-faulty processors, and let be the set of keys controlled by the adversary. According to Definition 5.4, we can then conclude that:

  1. It holds with probability that, whenever is contained in the duration, there exists some element of which is directed to broadcast, while is not able to broadcast any element of this set.

Since is uniformly secure, we also know that:

  1. With probability , there do not exist incompatible blocks , timeslots and such that is confirmed for at for .

So now define to be all those in for which there exists such that all of the following hold: (i) ; (ii) , and; (iii) For some chain of length with leaf , all messages in are attached to or its descendants.

Now if , then let be the (unique) such that (i)–(iii) hold for i and , let be as specified in (iii) for , and define . We also define . This function is almost the notion of confirmation that we want for our recalibration, but the problem is that it is only defined for very specific values of . We will use to help us define that is defined for all possible . Combining , and , and the definition of , it follows that with probability both of the following hold:

  1. If are both broadcast, then all blocks in are compatible with all those in .

  2. For every , there exists which is broadcast and such that .

In order to define for our recalibration, we can then proceed as follows. Given arbitrary , choose such that and is maximal, or if there exists no satisfying these conditions then define . We define . It follows from (1) and (2) above that produces certificates and satisfies uniform liveness with respect to . ∎

References

  • Alchieri et al. (2008) Eduardo AP Alchieri, Alysson Neves Bessani, Joni da Silva Fraga, and Fabíola Greve. 2008. Byzantine consensus with unknown participants. In International Conference On Principles Of Distributed Systems. Springer, 22–40.
  • Bagaria et al. (2019) Vivek Bagaria, Sreeram Kannan, David Tse, Giulia Fanti, and Pramod Viswanath. 2019. Prism: Deconstructing the blockchain to approach physical limits. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. 585–602.
  • Bentov et al. (2016) Iddo Bentov, Rafael Pass, and Elaine Shi. 2016. Snow White: Provably Secure Proofs of Stake. IACR Cryptology ePrint Archive 2016, 919 (2016).
  • Buchman (2016) Ethan Buchman. 2016. Tendermint: Byzantine fault tolerance in the age of blockchains. Ph.D. Dissertation.
  • Canetti (2001) Ran Canetti. 2001. Universally composable security: A new paradigm for cryptographic protocols. In Proceedings 42nd IEEE Symposium on Foundations of Computer Science. IEEE, 136–145.
  • Cavin et al. (2004) David Cavin, Yoav Sasson, and André Schiper. 2004. Consensus with unknown participants or fundamental self-organization. In International Conference on Ad-Hoc Networks and Wireless. Springer, 135–148.
  • Chen et al. (2018) Jing Chen, Sergey Gorbunov, Silvio Micali, and Georgios Vlachos. 2018. ALGORAND AGREEMENT: Super Fast and Partition Resilient Byzantine Agreement. IACR Cryptol. ePrint Arch. 2018 (2018), 377.
  • Chen and Micali (2016) Jing Chen and Silvio Micali. 2016. Algorand. arXiv preprint arXiv:1607.01341 (2016).
  • Dwork et al. (1988) Cynthia Dwork, Nancy A. Lynch, and Larry Stockmeyer. 1988. Consensus in the Presence of Partial Synchrony. J. ACM 35, 2 (1988), 288–323.
  • Garay et al. (2018) Juan A Garay, Aggelos Kiayias, and Nikos Leonardos. 2018. The Bitcoin Backbone Protocol: Analysis and Applications. (2018).
  • Kiayias et al. (2017) Aggelos Kiayias, Alexander Russell, Bernardo David, and Roman Oliynykov. 2017. Ouroboros: A provably secure proof-of-stake blockchain protocol. In Annual International Cryptology Conference. Springer, 357–388.
  • Lewis-Pye and Roughgarden (2021) Andrew Lewis-Pye and Tim Roughgarden. 2021. Byzantine Generals in the Permissionless Setting. arXiv preprint arXiv:2101.07095 (2021).
  • Lynch (1996) Nancy A Lynch. 1996. Distributed algorithms. Elsevier.
  • Nakamoto et al. (2008) Satoshi Nakamoto et al. 2008. Bitcoin: A peer-to-peer electronic cash system. (2008).
  • Okun (2005) Michael Okun. 2005. Distributed computing among unacquainted processors in the presence of Byzantine failures. Hebrew University of Jerusalem.
  • Pass et al. (2016) Rafael Pass, Lior Seeman, and abhi shelat. 2016. Analysis of the Blockchain Protocol in Asynchronous Networks. eprint.iacr.org/2016/454.
  • Ren (2019) Ling Ren. 2019. Analysis of nakamoto consensus. Technical Report. Cryptology ePrint Archive, Report 2019/943. (2019). https://eprint.iacr.org.
  • Yin et al. (2019) Maofan Yin, Dahlia Malkhi, Michael K Reiter, Guy Golan Gueta, and Ittai Abraham. 2019. HotStuff: BFT consensus with linearity and responsiveness. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing. 347–356.

6. Appendix – Table 1.

term meaning
a block
a notion of confirmation
the duration
bound on message delay during synchronous intervals
the security parameter
a protocol instance
a message
a set of messages
the set of all possible sets of messages
a permitter oracle
a processor
a permission set
a permissionless protocol
a request set
the resource pool
a state transition diagram
a message
a timeslot
a request in the timed setting
a timing rule
a public key
a request in the untimed setting
the set of all public keys
the set of public keys for
Table 1. Some commonly used variables and terms.