Tool and library for analyzing FBASs like Stellar
Federated Byzantine Agreement Systems (FBASs) are a fascinating new paradigm in the context of consensus protocols. Originally proposed for powering the Stellar payment network, FBASs can be thought of as a middle way between typical permissionless systems (like Bitcoin) and permissioned approaches for solving consensus (like classical BFT protocols). Unlike Bitcoin and the like, validators must be explicitly chosen by peers. Unlike permissioned protocols, there is no need for the whole system to agree on the same set of validators. Instead, every node is free to decide for itself with whom it requires agreement. In this paper, we propose an intuitive yet precise methodology for determining whether the quorum systems resulting from such individual configurations can enable liveness and safety, and how many (byzantine) node failures they are away from losing these qualities. We apply our analysis approach and software to evaluate the effects of different node configuration policies, i.e., logics through which node configurations result from strategic considerations or an existing inter-node relationship graph. Lastly, we also investigate the reported "open-membership" property of FBASs. We observe that an often small group of nodes is exclusively relevant for determining safety and liveness "buffers", and prove that these top tiers are effectively "closed-membership" if maintaining safety is a core requirement.
We study Federated Byzantine Agreement Systems (FBASs), as originally proposed by Mazières (Mazières, 2015) and subsequently generalized by a range of other authors (Cachin and Tackmann, 2019; Losa et al., 2019). While research on consensus protocols has accelerated in the wake of global blockchain enthusiasm, developments still mostly fall into two extreme categories: permissionless, i.e., open-membership, as exemplified by Bitcoin’s notoriously energy-hungry ”Nakamoto consensus” (Nakamoto, 2008), and permissioned, with a closed-membership group of validators, as assumed both in the classical byzantine fault tolerance (BFT) literature (e.g., Castro et al., 1999) and in many state-of-the-art protocols from the blockchain world (e.g., Yin et al., 2019). The FBAS paradigm suggests a middle way: Each node defines its own rules about which groups of nodes it will consider as sufficient validators. If the sum of all such configurations fulfills a set of properties, protocols like the Stellar Consensus Protocol (SCP) (Mazières, 2015) can be defined that leverage the resulting structure for establishing a live and safe consensus system (Cachin and Tackmann, 2019; Losa et al., 2019; Álvaro García-Pérez and Schett, 2020; Álvaro García-Pérez and Gotsman, 2018; Lokhava et al., 2019). Which set of properties these are, how they result from specific configuration policies, and how they can be changed over time, are the topics of this paper. Specifically, our contributions are:
An intuitive yet precise framework for reasoning about safety and liveness guarantees in concrete FBAS instances.
An exploration of possible configuration policies and their effects, via simulating their application and analyzing the resulting FBASs.
Towards understanding the reported ”open membership” nature of FBASs, we formally prove that membership in an FBAS’ top tier is only ”open” if a violation of safety is considered acceptable.
As a companion to this publication, we furthermore release an extendable software framework for analyzing FBASs and simulating the effects of configuration policies.
Federated Byzantine Agreement Systems were first proposed in (Mazières, 2015), together with the Stellar Consensus Protocol (SCP), a first protocol for this setting. The viability of SCP has been proven formally (Lokhava et al., 2019; Álvaro García-Pérez and Schett, 2020; Álvaro García-Pérez and Gotsman, 2018) and the protocol is in active use in a large-scale payment network (Lokhava et al., 2019). The FBAS notion has furthermore been generalized in different ways, enabling the development of additional protocols and abstractions (Losa et al., 2019; Cachin and Tackmann, 2019). In this work, we are less interested in the mechanics of specific protocols for the FBAS setting but instead investigate the conditions they require for achieving safety, liveness and performance. We investigate to what extent the conditions for safety and liveness are threatened in different FBAS instances, and how individual node configuration policies influence global FBAS properties.
A heuristics-based methodology for analyzing specific FBAS instances was previously proposed in (Kim et al., 2019), focusing on the identification of central nodes and threats to FBAS liveness. We propose a novel analysis approach that is not heuristics-based and hence yields precise insights, based on a solid theoretic foundation. As in (Kim et al., 2019), we apply our methodology to snapshots of the live Stellar network (cf. Appendix C). We furthermore apply our methodology to synthetically generated FBASs resulting from the application of diverse node configuration policies.
Bracciali et al. (Bracciali et al., 2019) explore fundamental bounds on the decentrality in open quorum systems. One of their central arguments with regards to the FBAS paradigm is that quorum intersection, a crucial requirement to guaranteeing safety in protocols like SCP, is computationally intractable to determine and maintain, necessitating centralization if safety is a requirement. The NP-hardness of determining quorum intersection was previously also proven in (Lachowski, 2019), together, however, with practical algorithms for nevertheless tackling a wide range of interesting FBAS. Our own analysis framework adapts and extends these algorithms, and through its application to FBASs of different sizes we gain insights about the computational limitations that currently hold in practice. While, based on our analysis approach and its application to specific FBASs, we can confirm that nodes of higher influence (top tier nodes according to our choice of words) naturally emerge, we argue that it is not only the existence and size of such a group that determines ”centralization”, but also the fluidity of that group’s membership (which we explicitly investigate).
An alternative analysis methodology and software framework, also using algorithms from (Lachowski, 2019), has recently been presented in (Gaul et al., 2019). Among other things, the authors provide algorithms for determining the consequences of specific sets of nodes becoming faulty, whereas we propose and implement approaches for identifying all minimal sets of nodes that need to become faulty for an FBAS to lose safety and liveness guarantees. We were furthermore able to apply our method to significantly larger FBASs, and notably also FBASs generated through simulating the effect of quorum configuration policies.
The effects of quorum configuration on FBAS properties have also been studied in (Yoo et al., 2019), however with a formal modeling approach that was only tested in very small networks (). Chitra et al. (Chitra and Chitra, 2019) explore under which circumstances the selection of random validators is likely to result in an FBAS with quorum intersection, again using algorithms from (Lachowski, 2019). We argue that the selection of random validators is neither Sybil-proof (Douceur, 2002) nor sufficiently leveraging the individual configuration freedom offered by the FBAS paradigm. With respect to the design of more intricate configuration policies, we were so far able to discover mostly vague hints, such as that node operators should choose ”large” sets of ”high-quality” validators (Lokhava et al., 2019). We discuss different intuitive interpretations of the configuration options available to node operators and propose specific configuration policies based on these interpretations. We evaluate the defined policies by simulating them to generate FBASs with hundreds of strongly connected nodes, and by analyzing the resulting FBASs.
In the following, we introduce core concepts of the FBAS paradigm that form our basis for reasoning about specific FBAS instances. We use terminology based on (Lokhava et al., 2019), (Mazières, 2015), (Lachowski, 2019) and the Stellar codebase (stellar-core). We deliberately choose this line of terminology over, say, (Cachin and Tackmann, 2019), as it allows for an easier application of our results to the current Stellar network and node software. We conjecture that our results are equally relevant for models such as (Cachin and Tackmann, 2019).
Our FBAS model is based on the concept of nodes. Whereas nodes usually represent individual machines, for the purposes of this paper, we typically assume that each node represents a distinct entity or organization. We will illustrate introduced concepts using examples. For example: these are the nodes Alice, Bob and Carol: . We will occasionally also use established terms in the context of consensus protocols, such as ”slot”, ”externalize” and ”faulty”, without formally introducing them. As an informal and approximate adaptation to the blockchain setting, a slot is a block of a given height, to externalize a value is to fill a block with contents, and a faulty node is one that violates protocol rules in arbitrary ways, e.g., assuming the worst-case scenario, via being under the control of an attacker that also controls all other faulty nodes.
In an FBAS, each node (respectively its human administrator) individually configures which other nodes’ opinions it should consider when participating in consensus. Configurations can express individual expectations, such as ”out of these nodes, at most will simultaneously cooperate to attack the system”, and can be used to strategically influence global system parameters. We delve significantly deeper into the semantics and ”goodness” of node configuration in Sec. 5, based on the conceptual frame developed in the following. On a conceptual level, the configuration of an FBAS node consists in the definition of quorum slices.
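A Federated Byzantine Agreement System (FBAS) is a pair $(\mathbf{V}, \mathbf{Q})$ comprising a set of nodes $\mathbf{V}$ and a quorum function $\mathbf{Q}: \mathbf{V} \to 2^{2^{\mathbf{V}}}$ specifying quorum slices for each node, where a node belongs to all of its own quorum slices, i.e., $\forall v \in \mathbf{V}, \forall q \in \mathbf{Q}(v): v \in q$.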
Informally, each quorum slice of a node describes a set of nodes that, should they all agree to externalize a value in a given slot, this is sufficient to also cause to externalize that value. While a useful abstraction for formally describing FBASs and protocols for the FBAS setting, quorum slices are an unwieldy configuration format in practice. In Stellar, the currently most relevant practical deployment of an FBAS, configuration therefore doesn’t happen via quorum slices but via quorum sets (Lokhava et al., 2019). Each quorum set defines a set of validator nodes , a set of inner quorum sets and a threshold value . Intuitively, this enables the encoding of notions such as ”out of these nodes, at least must agree” (satisfying the quorum set) and ”the sum of agreeing nodes in and satisfied inner quorum sets in must be at least ”.
The quorum set of a node is a tuple . For quorum sets of the form , we recursively define that a set of nodes satisfies iff and . We denote the set of quorum slices that satisfy as , i.e., .
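The recursive satisfaction rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the released implementation: the tuple layout `(validators, inner_quorum_sets, threshold)`, the function name `satisfies` and the node names are assumptions made for this example, and the rule implemented is the informal one stated in the text (agreeing validators plus satisfied inner quorum sets must reach the threshold).

```python
def satisfies(nodes, quorum_set):
    """Return True if the node set satisfies the (possibly nested) quorum set.

    Assumed layout: quorum_set = (validators, inner_quorum_sets, threshold).
    """
    validators, inner_quorum_sets, threshold = quorum_set
    agreeing_validators = len(set(nodes) & set(validators))
    satisfied_inner = sum(
        1 for inner in inner_quorum_sets if satisfies(nodes, inner)
    )
    return agreeing_validators + satisfied_inner >= threshold


# "Out of these three nodes, at least two must agree."
flat = ({"alice", "bob", "carol"}, [], 2)
print(satisfies({"alice", "bob"}, flat))  # True
print(satisfies({"alice"}, flat))         # False

# Nested: alice plus one satisfied inner quorum set (two of bob, carol, dave).
nested = ({"alice"}, [({"bob", "carol", "dave"}, [], 2)], 2)
print(satisfies({"alice", "bob", "carol"}, nested))  # True
print(satisfies({"bob", "carol", "dave"}, nested))   # False
```

In the nested case, the inner quorum set counts as a single "vote" towards the outer threshold, which is why bob, carol and dave alone do not suffice.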
As hinted at by our notation, quorum sets and quorum slices can be transformed into one another. A straightforward (but generally not space-efficient) way to express the arbitrary quorum slices of node as a quorum set is , with . Quorum sets are translated back to quorum slices (values of ) by applying the function. For example (with ):
In the above example, and their quorum sets form an FBAS. Clearly, an FBAS cannot be modelled as a regular graph (with FBAS nodes as graph vertices) without losing information. Graph-based analyses as in (Kim et al., 2019) can therefore result only in heuristic insights. An FBAS can be modelled as a directed hypergraph (Gallo et al., 1993). However, we find the quorum set abstraction easier to work with. In Sec. 5 we investigate strategies for bootstrapping robust FBASs from graphs, e.g., based on graphs expressing existing trust relationships between nodes.
A group of FBAS nodes that can, based on their quorum sets, externalize new values by themselves is called a quorum.
A set of nodes $U \subseteq \mathbf{V}$ in FBAS $(\mathbf{V}, \mathbf{Q})$ is a quorum iff $U \neq \emptyset$ and $U$ contains a quorum slice for each member, i.e., $\forall v \in U\ \exists q \in \mathbf{Q}(v): q \subseteq U$.
This is equivalent to stating that satisfies the quorum sets of all . Quorums are therefore determined by the sum of all individual quorum set configurations. Continuing the previous example with nodes , we get the quorums . We say that has quorum availability despite faulty nodes iff there exists a that is a quorum in and consists of only non-faulty nodes. Quorum availability despite faulty nodes is a necessary condition to achieving liveness in an FBAS, i.e., ensuring that non-faulty nodes can externalize new values without the participation of any faulty nodes (Mazières, 2015).
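As a hedged sketch of the two notions above, the following Python snippet checks whether a node set is a quorum (every member's quorum set is satisfied by the set) and whether quorum availability holds despite a given set of faulty nodes. The FBAS is represented as a dictionary from node to quorum set; this representation, the function names and the three-node example are assumptions for illustration. Quorums are enumerated by brute force, which is only feasible for very small FBASs.

```python
from itertools import combinations


def satisfies(nodes, quorum_set):
    """Assumed layout: quorum_set = (validators, inner_quorum_sets, threshold)."""
    validators, inner, threshold = quorum_set
    agreeing = len(set(nodes) & set(validators))
    return agreeing + sum(satisfies(nodes, q) for q in inner) >= threshold


def is_quorum(candidate, fbas):
    """A non-empty set that satisfies the quorum sets of all of its members."""
    candidate = set(candidate)
    return bool(candidate) and all(satisfies(candidate, fbas[v]) for v in candidate)


def all_quorums(fbas):
    """Brute-force enumeration over all non-empty subsets (exponential)."""
    nodes = sorted(fbas)
    return [
        set(combo)
        for size in range(1, len(nodes) + 1)
        for combo in combinations(nodes, size)
        if is_quorum(combo, fbas)
    ]


def has_quorum_availability(fbas, faulty):
    """True if some quorum consists of only non-faulty nodes."""
    faulty = set(faulty)
    return any(not (quorum & faulty) for quorum in all_quorums(fbas))


# Hypothetical FBAS: each of a, b, c requires agreement from two of {a, b, c}.
two_of_three = (frozenset({"a", "b", "c"}), (), 2)
fbas = {v: two_of_three for v in "abc"}

print(all_quorums(fbas))                        # the three pairs and {a, b, c}
print(has_quorum_availability(fbas, {"a"}))       # True: {b, c} is still a quorum
print(has_quorum_availability(fbas, {"a", "b"}))  # False
```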
If an FBAS enjoys liveness under some consensus protocol that honours , enjoys quorum availability despite faulty nodes.
Let be an FBAS that enjoys liveness under some consensus protocol that honours . This implies that a subset of nodes has externalized a new value without the participation of faulty nodes. Since an agreement has been reached, . Let . Then, there exists a quorum slice with , otherwise would not have agreed on the new value. Therefore, is a quorum and enjoys quorum availability despite faulty nodes. ∎
Given quorum availability despite faulty nodes, protocols like SCP can ensure liveness (Mazières, 2015). In the case of SCP, this was previously demonstrated through correctness proofs (Álvaro García-Pérez and Schett, 2020) as well as formal verification and practical deployment experience (Lokhava et al., 2019). Additional conditions to achieving liveness include the reaction (via quorum set adaptations, i.e. changes to ) to (detectable) timing attacks (Lokhava et al., 2019). We defer to works such as (Mazières, 2015; Losa et al., 2019; Cachin and Tackmann, 2019) for an in-depth exploration of the mechanics and guarantees of consensus protocols for the FBAS setting. In this paper, we implicitly assume that such protocols are in use and ”just work”.
A set of nodes in an FBAS enjoy safety if no two of them ever externalize different values for the same slot (Mazières, 2015). In a blockchain context, lack of safety can translate into the possibility of forks and double spends. Protocols such as SCP ensure that, if all nodes are honest and no two quorums ever externalize conflicting values, no two nodes in the FBAS will either. The situation that two quorums externalize different values is avoided by maintaining quorum intersection.
An FBAS enjoys quorum intersection iff any two of its quorums share a node, i.e., for all quorums $U_1$ and $U_2$, $U_1 \cap U_2 \neq \emptyset$.
For example, the set of quorums intersects, whereas introducing an additional quorum would break quorum intersection. In the latter scenario, and could induce two new, separated FBASs (Losa et al., 2019). Similarly to liveness, we say that an FBAS enjoys quorum intersection despite faulty nodes iff every pair of non-faulty quorums intersects in at least one non-faulty node. If quorum intersection despite faulty nodes is not given, safety cannot be guaranteed (although it can be maintained by chance).
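A pairwise check over an explicitly listed set of quorums illustrates both variants of the property. The sketch below is an assumption-laden simplification: it takes the quorums as given (it does not compute them), and it interprets "despite faulty nodes" as requiring every pair of quorums to share at least one non-faulty node. The function name and example sets are hypothetical.

```python
from itertools import combinations


def enjoys_quorum_intersection(quorums, faulty=frozenset()):
    """True if every pair of quorums shares at least one non-faulty node."""
    faulty = set(faulty)
    return all(
        (set(q1) & set(q2)) - faulty
        for q1, q2 in combinations(quorums, 2)
    )


quorums = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]

print(enjoys_quorum_intersection(quorums))                 # True
print(enjoys_quorum_intersection(quorums + [{"d", "e"}]))  # False: {d, e} is disjoint from {a, b}
print(enjoys_quorum_intersection(quorums, faulty={"a"}))   # False: {a, b} and {a, c} share only a
```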
Let be an FBAS and a consensus protocol that can, honouring the respective , ensure liveness in any FBAS with quorum availability despite faulty nodes. If can guarantee safety for all non-faulty nodes in , then enjoys quorum intersection despite faulty nodes.
Let and according to the above definitions. Without loss of generality, we assume that contains only non-faulty nodes. If does not enjoy quorum intersection, then there are two quorums so that . For , let be defined such that . Then both and form FBASs with quorum availability. As guarantees liveness in any FBAS with quorum availability, and especially in cases where and are separated due to connectivity issues, it is possible that and independently externalize values for the same slots. These values can be different, which violates the safety property. ∎
There is strong evidence that protocols like SCP can guarantee safety in any FBAS with quorum intersection despite faulty nodes (Álvaro García-Pérez and Schett, 2020; Lokhava et al., 2019; Losa et al., 2019; Cachin and Tackmann, 2019). This seems to be the case even if faulty nodes lie about their quorum sets (Álvaro García-Pérez and Gotsman, 2018).
FBAS instances are complex structures isomorphic to hypergraphs (see Sec. 3.1). How can node operators and users determine whether a given FBAS is ”sufficiently secure”, or whether it is ”centralized”? We break these questions down into the more specific questions:
Which groups of nodes can compromise liveness?
Which groups of nodes can compromise safety?
What is the composition of the ”top tier”?
We will also discuss how the relevant properties can be determined in practice, introducing our software-based analysis framework. An example (manual) application of our analysis method is given in Sec. B.1, while in Appendix C we apply our analysis software to different snapshots of the Stellar network.
As per Thm. 3.4, an FBAS cannot enjoy liveness if it doesn’t contain at least one non-faulty quorum. Considering the state of the art in consensus protocols for the FBAS setting and their formal verification (see also Sec. 3.2), quorum availability despite faulty nodes is furthermore the only precondition to achieving liveness that depends on and arguably the most difficult to satisfy in a practical deployment. However, while quorum availability can easily be checked based on , faulty nodes are usually not readily identifiable as such in practice. We therefore propose, as a means of grasping liveness risks, to look at sets of nodes that, if faulty, can undermine quorum availability.
Let $\mathcal{U} \subseteq 2^{\mathbf{V}}$ be the set of all quorums of the FBAS $(\mathbf{V}, \mathbf{Q})$. We denote the set $B \subseteq \mathbf{V}$ as blocking iff it intersects every quorum of the FBAS, i.e., $\forall U \in \mathcal{U}: B \cap U \neq \emptyset$.
For example: and are both blocking sets for .
Control over any blocking set is sufficient for compromising the liveness of an FBAS .
As intersects all quorums of the FBAS, there is no quorum that can be formed without cooperation by . Without at least one intact quorum, liveness is not possible as per Thm. 3.4. ∎
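The blocking property translates directly into a membership test. The following minimal sketch assumes the quorums of the FBAS are already known and passed in explicitly; the function name and the example quorums are hypothetical.

```python
def is_blocking(candidate, quorums):
    """True if the candidate set intersects every listed quorum."""
    candidate = set(candidate)
    return all(candidate & set(quorum) for quorum in quorums)


quorums = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]

print(is_blocking({"a", "b"}, quorums))  # True: no quorum avoids both a and b
print(is_blocking({"a"}, quorums))       # False: the quorum {b, c} does not need a
```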
Notably, blocking sets can also block liveness selectively, enabling censorship. As nodes from the blocking set are present in every quorum, consensus will never be reached on any value that the blocking set opposes. For example, in the context of Stellar, the blocking set could block the ratification of transactions involving specific accounts. We chose the term blocking in analogy to the v-blocking sets introduced in (Mazières, 2015). As an important distinction, we use the term blocking set to refer to a property of the whole FBAS , as opposed to a property of an individual node .
In the above example, and are not only blocking sets with respect to , they are minimal blocking sets, i.e., none of their proper subsets is a blocking set.
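Within the set of node sets $\mathcal{X} \subseteq 2^{\mathbf{V}}$, a member set $X \in \mathcal{X}$ is minimal iff none of its proper subsets is included in $\mathcal{X}$, i.e., $\forall X' \subsetneq X: X' \notin \mathcal{X}$.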
So, if are the quorums of an FBAS, are its minimal quorums and are its minimal blocking sets. The notion of minimal quorums is helpful, among other things, for efficiently determining whether an FBAS enjoys quorum intersection (Lachowski, 2019): it can be shown that an FBAS enjoys quorum intersection iff every two of its minimal quorums intersect. Similarly, a set that is blocking for all minimal quorums is blocking for all quorums in general, and the set of all minimal blocking sets of an FBAS equals the set of all minimal blocking sets for the minimal quorums of that FBAS. We defer the formal write-up of these corollaries (and their proofs) to Appendix A.
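The reduction to minimal quorums can be illustrated with a brute-force sketch. It is not the branch-and-bound algorithm of the released tool: it filters a family of sets down to its minimal members and then searches for minimal blocking sets as minimal "hitting sets" of the minimal quorums, trying candidates in order of increasing size. All names and the example are assumptions for illustration, and the search is exponential in the number of nodes.

```python
from itertools import combinations


def minimal_sets(sets):
    """Keep only those sets that have no proper subset in the family."""
    unique = {frozenset(s) for s in sets}
    return {s for s in unique if not any(other < s for other in unique)}


def minimal_blocking_sets(quorums):
    """Brute force: smallest-first search for sets intersecting all minimal quorums."""
    min_quorums = minimal_sets(quorums)
    nodes = sorted(set().union(*min_quorums))
    found = []
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            candidate = frozenset(combo)
            if any(smaller <= candidate for smaller in found):
                continue  # contains an already found blocking set, so not minimal
            if all(candidate & quorum for quorum in min_quorums):
                found.append(candidate)
    return set(found)


quorums = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]

print(minimal_sets(quorums))           # the three two-node quorums
print(minimal_blocking_sets(quorums))  # again the three two-node sets
```

Only nodes that appear in some minimal quorum are considered as candidates, which is consistent with the statement that a set blocking all minimal quorums blocks all quorums.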
As per Thm. 3.6, an FBAS can only be considered safe (as one coherent system) as long as it enjoys quorum intersection despite faulty nodes, i.e., as long as each pair of its quorums intersects in at least one non-faulty node. Given state-of-the-art protocols like SCP and the correctness proofs surrounding them (see also Sec. 3.3), we furthermore argue that, for practical purposes, quorum intersection despite faulty nodes is a sufficient condition for achieving safety in an FBAS. Hence, for assessing the risk to safety, it is interesting to identify sets of nodes that comprise intersections of pairs of quorums. We call a set of nodes that contains intersections of one or more quorum pairs a splitting set, as it can, if malicious, cause at least two quorums to diverge, splitting the FBAS.
Let $\mathcal{U} \subseteq 2^{\mathbf{V}}$ be the set of all quorums of the FBAS $(\mathbf{V}, \mathbf{Q})$. We denote the set $S \subseteq \mathbf{V}$ a splitting set iff it contains an intersection of at least two quorums of the FBAS, i.e., there are distinct quorums $U_1, U_2 \in \mathcal{U}$ so that $U_1 \cap U_2 \subseteq S$.
The existence of a faulty splitting set clearly violates quorum intersection despite faulty nodes and therefore, as per Thm. 3.6, threatens safety. Informally, a splitting set can conspire to agree to different statements in two quorums it is ”splitting”, causing the quorums to externalize conflicting values and in this way diverge. As with blocking sets, we are especially interested in finding the minimal splitting sets of an FBAS .
Similarly as with minimal blocking sets, it is sufficient to consider only minimal quorums when searching for minimal splitting sets. An intersection of two non-minimal quorums is either a non-minimal splitting set of the FBAS or identical to at least one intersection of two minimal quorums (we prove these and related corollaries in Appendix A).
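Following the observation that only minimal quorums need to be considered, minimal splitting sets can be sketched as the minimal members among all pairwise intersections of minimal quorums. The code below is an illustrative sketch with hypothetical names, operating on an explicitly given list of quorums.

```python
from itertools import combinations


def minimal_sets(sets):
    """Keep only those sets that have no proper subset in the family."""
    unique = {frozenset(s) for s in sets}
    return {s for s in unique if not any(other < s for other in unique)}


def minimal_splitting_sets(quorums):
    """Pairwise intersections of minimal quorums, reduced to the minimal ones."""
    min_quorums = minimal_sets(quorums)
    intersections = [q1 & q2 for q1, q2 in combinations(min_quorums, 2)]
    return minimal_sets(intersections)


quorums = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
print(minimal_splitting_sets(quorums))  # {a}, {b} and {c}

# Without quorum intersection, the empty set is already "splitting".
print(minimal_splitting_sets([{"a", "b"}, {"c", "d"}]))  # {frozenset()}
```

In the first example, a single faulty node suffices to threaten safety, whereas two faulty nodes are needed to threaten liveness (cf. the minimal blocking sets of the same quorums).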
For narrowing down notions of ”centralization” with respect to FBASs, we propose the concept of a top tier. Informally, the top tier is the set of nodes in the FBAS that is exclusively relevant when determining minimal blocking sets and minimal splitting sets, and hence the liveness and safety ”buffers” of an FBAS.
We define the top tier of an FBAS as the set of all nodes contained in one or more minimal quorums, i.e., if $\hat{\mathcal{U}}$ is the set of all minimal quorums of the FBAS, $T = \bigcup \hat{\mathcal{U}}$ is its top tier.
Based on this definition, it can be shown that each minimal blocking set consists exclusively of top tier nodes, and each top tier node is included in at least one minimal blocking set. Similarly, each minimal splitting set consists exclusively of top tier nodes. We prove these statements in Appendix A.
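Given a list of quorums, the top tier is simply the union of the minimal ones. The sketch below uses a hypothetical four-node example in which node d appears only in non-minimal quorums and therefore falls outside the top tier; names and data are assumptions for illustration.

```python
def minimal_sets(sets):
    """Keep only those sets that have no proper subset in the family."""
    unique = {frozenset(s) for s in sets}
    return {s for s in unique if not any(other < s for other in unique)}


def top_tier(quorums):
    """Union of all minimal quorums."""
    return set().union(*minimal_sets(quorums))


# Node d only occurs in quorums that contain a smaller quorum.
quorums = [
    {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
    {"a", "b", "d"}, {"a", "c", "d"}, {"b", "c", "d"}, {"a", "b", "c", "d"},
]
print(sorted(top_tier(quorums)))  # ['a', 'b', 'c']
```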
Larger top tier sizes are likely to become a significant performance factor. While we are not aware of any published studies on the performance of SCP (the so far only production-ready protocol for the FBAS setting, to the best of our knowledge), we argue that SCP is similar to Practical Byzantine Fault Tolerance (PBFT) (Castro et al., 1999) with regards to message complexity—it can be shown to require no less than messages per agreement round for a smallest quorum of size . While PBFT-related consensus protocols are notorious for becoming unusable in larger validator groups, several improved protocols have recently emerged that target the blockchain use case and scenarios with 100 and more validators (Yin et al., 2019; Stathakopoulou et al., 2019). While an application of proposed improvements to the FBAS paradigm is conceivable, specific proposals towards this goal are not known to us at this time. As a possible avenue for future exploration—for FBASs with a symmetric top tier, existing permissioned protocols could be adapted without much modification.
The top tier $T$ of an FBAS is a symmetric top tier iff all top tier nodes have identical quorum sets, i.e., $\forall v_i, v_j \in T: D(v_i) = D(v_j)$, where $D(v)$ denotes the quorum set of node $v$.
As we work out in detail in Appendix A, in FBASs with a symmetric top tier and a (for simplicity) non-nested top tier quorum set with validator set $U$ and threshold $t$, it holds that any minimal blocking set has cardinality $|U| - t + 1$ and any minimal splitting set has cardinality $2t - |U|$.
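For small symmetric top tiers, the cardinalities of minimal blocking and splitting sets can be checked by brute force. The sketch below assumes a top tier of m nodes that all use the same non-nested quorum set with threshold t (with t < m), so that the minimal quorums are exactly the t-element subsets. A counting argument suggests cardinalities of m − t + 1 for minimal blocking sets and 2t − m for minimal splitting sets; the code is an illustration with hypothetical names that lets the reader verify this for chosen m and t.

```python
from itertools import combinations


def minimal_sets(sets):
    """Keep only those sets that have no proper subset in the family."""
    unique = {frozenset(s) for s in sets}
    return {s for s in unique if not any(other < s for other in unique)}


def buffer_cardinalities(m, t):
    """Cardinalities of all minimal blocking and minimal splitting sets.

    Assumes a symmetric, non-nested top tier: m nodes, shared threshold t < m.
    """
    nodes = range(m)
    min_quorums = [frozenset(c) for c in combinations(nodes, t)]
    blocking = minimal_sets(
        frozenset(c)
        for size in range(1, m + 1)
        for c in combinations(nodes, size)
        if all(frozenset(c) & quorum for quorum in min_quorums)
    )
    splitting = minimal_sets(q1 & q2 for q1, q2 in combinations(min_quorums, 2))
    return {len(b) for b in blocking}, {len(s) for s in splitting}


for m, t in [(4, 3), (7, 5)]:
    print(m, t, buffer_cardinalities(m, t), "expected:", m - t + 1, 2 * t - m)
```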
As noted in other works (e.g., (Lachowski, 2019; Bracciali et al., 2019)), determining core FBAS properties like quorum intersection quickly leads to tasks of exponential computational complexity. Nevertheless, previous works (Lachowski, 2019) also provide heuristics and algorithms for effectively determining properties like quorum intersection for non-trivial FBASs such as the Stellar network. We extend the related work on analyzing specific FBAS instances in the following ways:
We develop algorithms for newly proposed analyses, such as the listing of all minimal blocking and minimal splitting sets and the identification of a top tier.
We propose algorithms for efficiently dealing with edge-case FBASs, such as large FBASs that are suspected of not enjoying quorum intersection and FBASs with a symmetric top tier.
We contribute a simulation component enabling us to generate and study significantly larger numbers of interesting FBASs than previous works, which also provides insights into which FBAS structures and sizes are amenable to analysis.
We implement all proposed analyses and simulation mechanics in the form of an extendable software framework that we release as open source.
We give a very rough overview of the first and second items of the list above in the following. We will discuss the third item and our experiences with regard to analysis scalability in the upcoming Sec. 5. For more information on the fourth item and details exceeding what is presented here, we defer to our (documented) implementation (https://github.com/wiberlin/fbas_analyzer).
We find minimal quorums using an adaptation of the algorithm proposed in (Lachowski, 2019). Most importantly, we implement an additional filtering step to ensure that no non-minimal sets are returned. We then determine minimal blocking sets using a novel branch-and-bound algorithm with similar structure as the algorithm for listing minimal quorums. We find minimal splitting sets by first gathering all pair-wise intersections of minimal quorums and then filtering out the non-minimal ones. We apply different heuristics and optimizations throughout all analysis steps. We conjecture that for an FBAS with nodes and a top tier of size our implementation finds all minimal quorums in (average case) to (worst case) and, once the minimal quorums are found, all minimal blocking sets in and all minimal splitting sets in . In practice and for larger FBASs, we observe that the listing of minimal quorums often completes significantly faster than the remaining analyses.
Our regular approach, adapted from (Lachowski, 2019), for validating whether an FBAS enjoys quorum intersection is based on finding all minimal quorums and subsequently checking if each two of them intersect. For larger top tier sizes , it takes prohibitive amounts of time to discover all minimal quorums. We therefore propose and implement a complementary alternative algorithm that is slower for FBASs with quorum intersection but potentially significantly faster for FBASs that lack quorum intersection. Our proposed algorithm follows the same branch-and-bound strategy and branch pruning logic used for finding minimal quorums. However, once a potentially minimal quorum is found, instead of storing , it is checked whether contains any quorums. This check can be completed in by recursively removing all unsatisfiable nodes until no more nodes can be removed, arriving at a . If contains nodes (so ), is a quorum. Since is also a quorum and , we have shown that the FBAS does not enjoy quorum intersection and can terminate the search. The original quorum listing logic is guaranteed to list all minimal quorums, so if all branches of the search terminate without uncovering two non-intersecting quorums, the FBAS can safely be assumed to enjoy quorum intersection. In Sec. 5.3, we use this algorithm to prove, within seconds, that a specific large FBAS doesn’t enjoy quorum intersection. For comparison, our regular quorum intersection check didn’t complete at all for the same FBAS (we terminated the analysis after several days of computation).
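The complement check at the heart of this alternative algorithm can be sketched as follows. Starting from the complement of a found quorum, nodes whose quorum sets cannot be satisfied by the remaining nodes are removed until nothing changes; whatever remains is either empty or a quorum disjoint from the original one. The sketch omits the surrounding branch-and-bound search, and the data layout, function names and examples are assumptions for illustration.

```python
def satisfies(nodes, quorum_set):
    """Assumed layout: quorum_set = (validators, inner_quorum_sets, threshold)."""
    validators, inner, threshold = quorum_set
    agreeing = len(set(nodes) & set(validators))
    return agreeing + sum(satisfies(nodes, q) for q in inner) >= threshold


def greatest_quorum_within(nodes, fbas):
    """Repeatedly drop unsatisfiable nodes; the result is a quorum or empty."""
    remaining = set(nodes)
    while True:
        unsatisfied = {v for v in remaining if not satisfies(remaining, fbas[v])}
        if not unsatisfied:
            return remaining
        remaining -= unsatisfied


def has_disjoint_quorum(quorum, fbas):
    """True if the complement of the given quorum contains a quorum."""
    return bool(greatest_quorum_within(set(fbas) - set(quorum), fbas))


# Hypothetical FBAS made of two groups that ignore each other.
left = (frozenset({"a", "b", "c"}), (), 2)
right = (frozenset({"d", "e", "f"}), (), 2)
split_fbas = {**{v: left for v in "abc"}, **{v: right for v in "def"}}
print(has_disjoint_quorum({"a", "b"}, split_fbas))  # True: {d, e, f} survives the pruning

connected_fbas = {v: left for v in "abc"}
print(has_disjoint_quorum({"a", "b"}, connected_fbas))  # False
```

A single positive result is enough to conclude that the FBAS lacks quorum intersection, which is why the search can terminate early.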
We also implemented a fast detector for symmetric clusters, i.e., sets of nodes in which each two nodes have the same quorum set. An FBAS that enjoys quorum intersection and has a symmetric top tier (Def. 4.6) contains exactly one symmetric cluster - its top tier. Our detection algorithm groups all nodes in by quorum set. If a group of nodes is thus found so that , the quorum set corresponding to is returned (as a representation of this symmetric cluster), allowing subsequent manual analysis. For ensuring that the FBAS enjoys quorum intersection, the unsatisfiability of the complement set is verified similarly as in Sec. 4.4.2.
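The grouping step of such a detector might look as follows. Since the exact acceptance condition is not recoverable from the text above, the sketch makes an explicit assumption: a group of nodes sharing one quorum set is reported as a symmetric cluster if the group equals the set of all nodes referenced by that quorum set. The subsequent verification of quorum intersection is not included. Quorum sets are modeled as hashable tuples so they can serve as dictionary keys; all names are hypothetical.

```python
from collections import defaultdict


def referenced_nodes(quorum_set):
    """All nodes mentioned in a quorum set, including inner quorum sets."""
    validators, inner, _threshold = quorum_set
    nodes = set(validators)
    for inner_set in inner:
        nodes |= referenced_nodes(inner_set)
    return nodes


def find_symmetric_clusters(fbas):
    """Return quorum sets shared by exactly the nodes they reference (assumption)."""
    groups = defaultdict(set)
    for node, quorum_set in fbas.items():
        groups[quorum_set].add(node)
    return [
        quorum_set
        for quorum_set, members in groups.items()
        if members == referenced_nodes(quorum_set)
    ]


# Hypothetical FBAS: a, b, c form a symmetric cluster; d merely follows them.
core = (frozenset("abc"), (), 2)
fbas = {"a": core, "b": core, "c": core, "d": (frozenset("abcd"), (), 3)}
print(find_symmetric_clusters(fbas))  # only the shared quorum set of a, b and c
```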
The reported openness enabled through the FBAS paradigm comes at the cost of increased configuration responsibilities for node operators. As discussed in Sec. 3, each node must become associated with a quorum set (and thereby quorum slices) for becoming a useful part of an FBAS. We will refer to this process as quorum set configuration (QSC). But how should a node operator go about QSC? Based on the analytical toolset introduced in Sec. 4, we can now investigate what kinds of QSC policies are plausible and in what kind of FBASs they result.
A QSC policy is individually and repeatedly invoked for each node . It takes information about a current FBAS instance as input and returns a quorum set for , setting a new value for . Without loss of generality, we will discuss QSC policies resulting in non-nested quorum sets, i.e., quorum sets of the form (for a set of validators and a threshold ). Such QSC policies must output two values:
A set of validators
A quorum set threshold
What does a given choice of actually mean? We argue that this depends on semantics and distinguish two interpretations of ”QSC” that can shape the choice of : as an outlet for strategic considerations towards optimizing FBAS-wide properties, and as a tool for expressing individual preferences. Each of these interpretations implies a different set of QSC policies, examples for which we will discuss starting from Sec. 5.2.
We argue that , unlike , can be chosen in a way that is compatible with both strategic and individualistic QSC. Looking into the current practice, for belonging to distinct organizations, the Stellar software (stellar-core) sets to either or of (Lokhava et al., 2019). Lower values of are deemed unsafe by stellar-core. And for good reason, as, for example, a severely threatens quorum intersection. thresholds correspond to a well-established equilibrium for byzantine-fault tolerant systems. For non-nested quorum sets, they are actually calculated in stellar-core as (for nested quorum sets, the formula becomes ), which maximizes in the equation . This threshold calculation rule enables symmetric top tiers with members to arrive at an equilibrium common to protocols such as PBFT (Castro et al., 1999): both safety and liveness can be maintained in the face of up to node failures. thresholds, finally, tilt the scales maximally towards safety, at the total expense of liveness. When , only one node in needs to fail to prevent the satisfaction of the quorum set. Unless noted otherwise, we will always use thresholds in the subsequent discussion. This enables strategic node operators to achieve configurations with the same guarantees as classical BFT protocols. At the same time, it is also a completely plausible choice for individualistic policies, even more so since it corresponds to a current software default. Notably, using the same rule for calculating across all discussed policies also improves the comparability of results.
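The PBFT-style equilibrium mentioned above can be illustrated numerically. The sketch below rests on two assumptions that are not taken from the text: the threshold is computed as t = n − f with f = ⌊(n − 1)/3⌋ (one rule consistent with the classical n ≥ 3f + 1 bound, not necessarily the exact expression used in stellar-core), and the buffers of a symmetric, non-nested top tier are taken to be n − t + 1 (smallest blocking set) and 2t − n (smallest splitting set), as suggested by a counting argument.

```python
def bft_threshold(n):
    """Assumed rule: tolerate the largest f with n >= 3f + 1, require n - f votes."""
    f = (n - 1) // 3
    return n - f


def symmetric_buffers(n):
    """Threshold and buffer sizes for a symmetric top tier of n nodes (assumed formulas)."""
    t = bft_threshold(n)
    return {"threshold": t, "min_blocking": n - t + 1, "min_splitting": 2 * t - n}


for n in (4, 7, 10):
    print(n, symmetric_buffers(n))
```

Under these assumptions, a top tier with n = 3f + 1 members needs f + 1 faulty nodes to lose either liveness or safety, which matches the tolerance of up to f failures described in the text.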
Our FBAS analysis tool includes a simulation component. We simulate the process in which the quorum set of each node is (re-)configured, based on specific QSC policies that we discuss in the remainder of this section. The simulation component produces an FBAS representation in JSON format that gets passed to the analysis component introduced in Sec. 4.4. The analysis component extracts quantitative data about the resulting FBAS. Namely, we determine their:
Liveness ”buffer”, based on the cardinalities of minimal blocking sets (cf. Sec. 4.1).
Safety ”buffer”, based on the cardinalities of all minimal splitting sets (cf. Sec. 4.2).
Top tier size (cf. Sec. 4.3), with its various implications—FBASs with small top tiers are more efficient and more amenable to analysis, however a small top tier also implies centralization.
We present analysis results using bar plots. Bar heights express the cardinality of the top tier and the average cardinalities of minimal blocking and minimal splitting sets. Error bars mark minima and maxima, i.e., the cardinalities of the smallest and largest minimal blocking and minimal splitting sets.
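The three quantities above can be illustrated with a brute-force sketch for very small FBASs. It assumes flat (non-nested) quorum sets given as `(validators, threshold)` pairs, and it treats splitting sets as pairwise intersections of minimal quorums; all function names are hypothetical and the exhaustive enumeration is only feasible for a handful of nodes, unlike the analysis component described in the text.

```python
from itertools import combinations


def is_quorum(fbas, candidate):
    """A non-empty set is a quorum if it satisfies the quorum set of each member."""
    return bool(candidate) and all(
        len(fbas[v][0] & candidate) >= fbas[v][1] for v in candidate
    )


def nonempty_subsets(nodes):
    nodes = sorted(nodes)
    for k in range(1, len(nodes) + 1):
        for combo in combinations(nodes, k):
            yield frozenset(combo)


def minimal_sets(sets):
    """Keep only the sets that contain no other set of the collection."""
    result = []
    for s in sorted(set(sets), key=len):
        if not any(m <= s for m in result):
            result.append(s)
    return result


def analyze(fbas):
    min_quorums = minimal_sets(q for q in nonempty_subsets(fbas) if is_quorum(fbas, q))
    top_tier = frozenset().union(*min_quorums)
    # Blocking sets must intersect every (minimal) quorum.
    blocking = minimal_sets(
        b for b in nonempty_subsets(top_tier) if all(b & q for q in min_quorums)
    )
    # Assumption: splitting sets are intersections of pairs of minimal quorums.
    splitting = minimal_sets(q1 & q2 for q1, q2 in combinations(min_quorums, 2))
    return top_tier, blocking, splitting


def summarize(sets):
    """Mean, minimum and maximum cardinality, as used for bars and error bars."""
    sizes = [len(s) for s in sets]
    return sum(sizes) / len(sizes), min(sizes), max(sizes)


if __name__ == "__main__":
    core = frozenset({0, 1, 2, 3})
    fbas = {v: (core, 3) for v in core}
    fbas[4] = (frozenset({0, 1, 4}), 2)  # a node outside the top tier
    top_tier, blocking, splitting = analyze(fbas)
    print("top tier size:", len(top_tier))
    print("blocking (mean, min, max):", summarize(blocking))
    print("splitting (mean, min, max):", summarize(splitting))
```

In this example node 4 joins quorums but never a minimal one, so it does not influence any of the reported values.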
In QSC policies based on individual preferences, nodes should contribute local knowledge to the collective FBAS configuration. This can include local views such as:
Which nodes are trusted to be honest and high-quality (i.e., not faulty).
Which nodes are believed to be non-sybil (i.e., not controlled by the same organization).
Which nodes usually interact with the local node.
It is often implied that QSC should reflect some form of trust, e.g., in wordings such as ”flexible trust” (Mazières, 2015) or ”asymmetric distributed trust” (Cachin and Tackmann, 2019). Reasoning about the future behaviour of nodes in the context of a consensus protocol might be an overwhelming task for node operators, however. Encoding beliefs about non-sybilness (Douceur, 2002), e.g., by grouping nodes believed to belong to the same entity or organization in inner quorum sets, might be easier in comparison. For individual nodes, quorum intersection is especially relevant with respect to the group of nodes they most frequently interact with, lest they end up with, e.g., diverging ledgers in the event of a fork. Adding nodes of organizations one interacts with to one’s own quorum sets appears to be a prudent strategy for staying ”in sync” with them (Lokhava et al., 2019).
In the following discussion, we will use graph representations for getting a grip on the fuzzy notion of ”preferences”. It is an intriguing hypothesis that the FBAS paradigm might be used for realizing sybil-resistant and yet energy-efficient permissionless consensus by bootstrapping the quorum structure along an existing trust graph, social graph or interaction graph. In Sec. 3.1 we saw that transforming an FBAS to an equally sized graph must, in general, lose information, i.e. can yield only heuristic representations. In Sec. 5.3 and Sec. 5.4 we pose the inverse question: How can a ”good” FBAS be instantiated from a given graph ?
We can evaluate graph-based policies using real-world graph data as well as synthetic graphs. In the following sections we will present results based on two snapshots of the autonomous system (AS) relationships graph inferred by the CAIDA project (footnote 3: The CAIDA AS Relationships Dataset, 1998-01-01 (serial-1) and 2020-01-01 (serial-2), https://www.caida.org/data/as-relationships/). The topological structure of the Internet has repeatedly been cited as an argument for the viability of the FBAS model (Mazières, 2015; Lokhava et al., 2019). The two snapshots we chose are from January 1998, the earliest available snapshot, describing a younger Internet with ASs connected via (directed) customer/provider links and (undirected) peering links, and from January 2020, with ASs connected via customer/provider links and peering links. We will refer to the graphs as and .
Our simulation tool also supports the generation of synthetic graphs, using established approaches such as the Barabási–Albert (Albert and Barabási, 2002) model for generating scale-free graphs. However, our experiments so far did not result in a theoretically founded graph generation approach that consistently produces suitable graphs. Top tiers emerging from synthetic graphs were often vastly big or not sufficiently intraconnected to result in FBASs that are simultaneously safe, interesting and feasible to analyze. In the interest of space, we will therefore focus exclusively on our experiments involving the AS graph.
Strategic considerations imply that node operators are interested in maintaining and improving measurable global FBAS properties. QSC policies that see achieving quorum intersection as the main goal of QSC (like (Chitra and Chitra, 2019) and the QSC sketch in (Lachowski, 2019)) fall in this category. A straightforward strategy for tweaking properties like safety, liveness and top tier size is to mimic a permissioned system with an appropriately defined top tier. For example, for maximizing safety and liveness ”buffers”, we can set the top tier to , the set of all nodes in the FBAS:
|(Ideal Open QSC)|
As a consequence of the previous discussion on FBASs with a symmetric top tier (cf. Sec. 4.3 and Thm. A.9), for the application of Ideal Open QSC results in FBASs with the same safety and liveness guarantees as classical BFT systems (i.e., both all minimal blocking sets and all minimal splitting sets have size ). We must, however, assume that all nodes in the FBAS are at least non-sybil. Ensuring the non-sybilness of nodes is notoriously difficult in open systems without universally trusted authorities (Douceur, 2002). For circumventing this challenge, strategic node operators might choose to limit openness by agreeing on a top tier beforehand. Assuming that all submit to the same choice of :
|(Ideal Permissioned QSC)|
We must sidestep the question of how the preselected top tier should be, in the end, selected. If governance mechanisms are in place that facilitate a convergence on (e.g., through ”real-world” negotiation or some ingenious ranking protocol), there is arguably no need for applying the FBAS paradigm and involving the complexities of FBAS protocols. The existence of a well defined set of validator nodes (like ) is a sufficient condition for being able to use efficient ”permissioned” consensus protocols like (Yin et al., 2019).
Interestingly, an unambiguous top tier can also emerge organically in networks that grow iteratively, i.e., node by node. Consider something as simple as random validator selection with a fixed target number of validators :
If QSC is performed based on the current network state and never repeated once a node’s quorum set has become ”large enough” (i.e., ), the first nodes to join the network consistently become the top tier . They furthermore form a symmetric top tier (Def. 4.6), as they all stop adapting their quorum sets once exactly nodes have joined. Since every new node using this policy includes older nodes in its quorum set, no new minimal quorums can emerge. We validate this effect empirically using simulations. Fig. 1 depicts the results obtained for and (inspired by current values in the Stellar network) and arbitrarily large . Bar heights express the cardinality of the top tier and the average cardinalities of minimal blocking and minimal splitting sets. The empirical results match the analytical expectation for FBASs with symmetric top tiers (cf. Sec. 4.3 and Thm. A.9). Among other things, top tier sizes of the form induce equally sized safety and liveness ”buffers”, while in other cases safety profits earlier than liveness from increases in top tier size.
Random validator selection has also been proposed in (Chitra and Chitra, 2019), however it was not evaluated with the temporal component of ”choosing only from nodes that have joined before”. Random validator selection is not sybil-proof as nodes are picked from all (an open set without admission control). However, if one can assume that the first nodes to join the network are non-sybil, the potential damage from sybil nodes joining later in time is limited.
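The emergence of a top tier under iterative growth can be reproduced with a small simulation sketch. It assumes flat quorum sets, a threshold of n minus floor((n - 1) / 3), and that a node may pick itself as a validator; these details, like the names `grow_network` and `top_tier`, are illustrative assumptions rather than the exact policy or tooling used for Fig. 1. The top tier is found by exhaustive quorum enumeration, so only tiny networks are feasible.

```python
import random
from itertools import combinations


def threshold(n):
    return n - (n - 1) // 3  # assumed 67%-style rule


def grow_network(total_nodes, target, seed=0):
    """Nodes join one by one; each node keeps picking random validators from
    the nodes known so far until it has `target` of them, then never changes."""
    rng = random.Random(seed)
    validators = {}
    for new_node in range(total_nodes):
        validators[new_node] = set()
        for chosen in validators.values():
            if len(chosen) < target:
                candidates = [u for u in validators if u not in chosen]
                missing = min(target - len(chosen), len(candidates))
                chosen.update(rng.sample(candidates, missing))
    return {v: (frozenset(c), threshold(len(c))) for v, c in validators.items()}


def is_quorum(fbas, candidate):
    return all(len(fbas[v][0] & candidate) >= fbas[v][1] for v in candidate)


def top_tier(fbas):
    """Union of all minimal quorums, found by enumerating subsets by size."""
    nodes = sorted(fbas)
    minimal = []
    for k in range(1, len(nodes) + 1):
        for combo in combinations(nodes, k):
            q = frozenset(combo)
            if is_quorum(fbas, q) and not any(m <= q for m in minimal):
                minimal.append(q)
    return frozenset().union(*minimal)


if __name__ == "__main__":
    fbas = grow_network(total_nodes=9, target=4, seed=1)
    print("top tier:", sorted(top_tier(fbas)))
```

Independently of the random choices of later nodes, the top tier consists of the first `target` nodes, because a later node can only list itself and older nodes and therefore cannot form a quorum without them.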
We consider a QSC policy individualistic if it is based on individual preferences, i.e., in our model (cf. Sec. 5.1.4), on edges in a preexisting relationship graph . As a simple representative of this class, we propose to instantiate quorum sets directly from neighborhoods in , incorporating the notion that only nodes with whom a direct relationship exists are worthy of inclusion in a quorum set:
|(All Neighbors QSC)|
If is not connected, we can’t have quorum intersection (and hence safety). If is a complete graph, we furthermore get the same result as with Ideal Open QSC. We applied this policy to the two AS graph snapshots and . The resulting FBASs could not, for the most part, be analyzed within a reasonable time frame. Using our alternative quorum intersection check algorithm (cf. Sec. 4.4.2), we were able to determine that the FBAS instantiated from lacks quorum intersection. The high prevalence of AS peering in today’s Internet is a likely explanation for why sufficiently well intraconnected clusters can emerge outside of the ”natural” top tier for that graph.
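The following sketch illustrates these observations on toy graphs. It assumes that each node lists itself and all of its graph neighbors as validators in a flat quorum set with a threshold of n minus floor((n - 1) / 3); this reading of the policy, and the helper names, are assumptions made for the example. Quorum intersection is checked by exhaustive enumeration, which only works for very small graphs.

```python
from itertools import combinations


def threshold(n):
    return n - (n - 1) // 3  # assumed 67%-style rule


def all_neighbors_qsc(nodes, edges):
    """Quorum set of each node: itself plus all direct neighbors (assumption)."""
    validators = {v: {v} for v in nodes}
    for a, b in edges:
        validators[a].add(b)
        validators[b].add(a)
    return {v: (frozenset(u), threshold(len(u))) for v, u in validators.items()}


def is_quorum(fbas, candidate):
    return all(len(fbas[v][0] & candidate) >= fbas[v][1] for v in candidate)


def has_quorum_intersection(fbas):
    nodes = sorted(fbas)
    quorums = [
        frozenset(c)
        for k in range(1, len(nodes) + 1)
        for c in combinations(nodes, k)
        if is_quorum(fbas, frozenset(c))
    ]
    return all(q1 & q2 for q1, q2 in combinations(quorums, 2))


if __name__ == "__main__":
    triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
    complete4 = list(combinations(range(4), 2))
    print("two disjoint triangles:",
          has_quorum_intersection(all_neighbors_qsc(range(6), triangles)))
    print("triangles joined by one edge:",
          has_quorum_intersection(all_neighbors_qsc(range(6), triangles + [(2, 3)])))
    print("complete graph on 4 nodes:",
          has_quorum_intersection(all_neighbors_qsc(range(4), complete4)))
```

The second case shows that connectedness alone is not sufficient: each triangle remains a well intraconnected cluster that forms a quorum on its own.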
Strategic QSC requires out-of-band coordination for sybil resistance while individualistic QSC results in FBASs that are hard to analyze and easily turn out unsafe. How can individualistic policies (the possibility for which is arguably the key selling point for the FBAS paradigm) be adapted towards strategically better outcomes?
We observe that nodes can be distinguished by tierness, or relative centrality in the graph. Tierness is an established notion for ASs in the Internet graph. For FBASs, a tiered quorum structure with every node including only higher-tier neighbors in its quorum sets was proposed as early as in the original FBAS proposal (Mazières, 2015). We conjecture that if nodes include only their higher-tier neighbors in quorum sets and the highest-tier nodes are reasonably well interconnected (footnote 4: Notably, this is not always the case in synthetic scale-free graphs generated using the Barabási-Albert model.), we are likely to end up in an FBAS with quorum intersection. Picking nodes based on their tierness is also related to the quality-based configuration format currently used by the Stellar software (Lokhava et al., 2019).
In the following, we make the additional assumption that nodes can infer the relative tierness of their graph neighbors, i.e., that they can, with reasonable room for error, determine which of their neighbors are of a higher tier than themselves. For our simulation-based exploration of this policy class, we use the PageRank (Page et al., 1999) score of nodes (calculated without damping) as a proxy for their tierness. With denoting the PageRank score of node and , each simulated node determines its higher-tier neighbors and same-tier neighbors (”peers”) as follows:
We can then formulate the following QSC policy:
|(Higher-Tier Neighbors QSC)|
The results of applying this policy to the AS graph snapshots and are depicted in Fig. 2. Like in the rest of this paper, bar heights express (average) cardinalities. For , error bars mark the cardinalities of the smallest and largest minimal blocking and minimal splitting sets, respectively. The FBAS resulting from was not amenable to deeper analysis: in the time available for this experiment (around 7 days), our analysis software could precisely determine only the minimal quorums and, consequently, the top tier of the FBAS (after around 30 minutes). The plotted error bars represent lower and upper bounds calculated based on the found minimal quorums. Note that the FBAS enjoys quorum intersection. Concerning the FBAS bootstrapped from , it is noteworthy that at a top tier size of , minimal blocking sets have an average cardinality of only , with the smallest ones having cardinality . For comparison, a symmetric top tier with would result in all minimal blocking sets having size . This liveness-threatening discrepancy can be explained through cascading failures: If (for example) two nodes fail, this can result in a third node with a ”weak” quorum set becoming unsatisfiable, so that three nodes have now de-facto failed, which can result in a fourth node becoming unsatisfiable, et cetera. We very frequently see the same cascading effects when instantiating FBASs from synthetically generated scale-free and small-world graphs. It seems that the composition and size of the smallest blocking sets for an FBAS are heavily influenced by the ”weakest” quorum sets in the FBAS’ top tier. An additional example for cascading failures is given in Sec. B.2.
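The cascading effect can be made concrete with a short fixed-point computation. The sketch below assumes flat quorum sets given as `(validators, threshold)` pairs; the five-node FBAS and the function name `cascade` are made up for illustration and are not taken from the evaluated graphs.

```python
def cascade(fbas, initially_failed):
    """Return all nodes that are failed or unsatisfiable once the failure of
    `initially_failed` has propagated through the quorum sets."""
    failed = set(initially_failed)
    changed = True
    while changed:
        changed = False
        for node, (validators, threshold) in fbas.items():
            # A node becomes unsatisfiable if too few of its validators remain.
            if node not in failed and len(validators - failed) < threshold:
                failed.add(node)
                changed = True
    return failed


if __name__ == "__main__":
    everyone = frozenset(range(5))
    fbas = {v: (everyone, 4) for v in (0, 1, 3, 4)}
    fbas[2] = (frozenset({0, 1, 2}), 3)  # the "weak" quorum set

    print("failure of node 3 alone:", sorted(cascade(fbas, {3})))
    print("failure of node 0 alone:", sorted(cascade(fbas, {0})))
```

Here the failure of node 0 makes node 2 unsatisfiable, after which the remaining nodes fall below their threshold of 4 as well. A single node therefore blocks the whole example FBAS, whereas the failure of node 3 stays contained.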
The graph-based QSC policies discussed so far frequently result in systems that are brittle (in the sense of small minimal blocking sets) and hard to analyze. Both of these characteristics are vastly improved, relative to top tier size, in FBASs with symmetric top tiers. However, symmetric top tiers emerge organically from a preexisting relationship graph only if the top tier nodes (selected, e.g., based on Tierness Heuristics) form a complete subgraph of , which is not the case in the graphs investigated so far.
In the following, we propose a mixture between the mainly individualistic policy Higher-Tier Neighbors QSC and a strategic element aiming at creating symmetry in the top tier. Namely, we propose a policy in which nodes believing themselves to be top tier mirror the quorum sets of other apparently top tier nodes, strategically including non-neighbors in their quorum sets for improving the global FBAS structure:
|(Symmetry-Enforcing Higher-Tier Neighbors QSC)|
Like in Higher-Tier Neighbors QSC, nodes choose their validators based on connections in and Tierness Heuristics. Newly, each node extends its thus chosen validators by all validators that potential top tier nodes in have currently configured. A node is considered a potential top tier node if it includes as a validator, or . Since (otherwise ), can only happen if , i.e., acts on the assumption that it is a top tier node. After several rounds of QSC, and if is sufficiently well connected, the FBAS converges to a top tier with .
As can be observed from our analysis results in Fig. 3, this is in fact the case when instantiating from the AS graph snapshots and . In the case of , we see that by enforcing symmetrical top tiers with an appropriate common quorum set threshold, minimal blocking set sizes (and thus the FBASs’ ”liveness buffer”) can be significantly increased. By employing our algorithm for symmetric cluster detection (cf. Sec. 5.4.2), we were furthermore able to complete all analyses in seconds, even for the -bootstrapped FBAS with a top tier of size .
Of course, by making validator decisions independent of the local knowledge representation , we need new assumptions to be able to assume a resistance to sybil attacks. If nodes implement a policy like Symmetry-Enforcing Higher-Tier Neighbors QSC to the letter, any top tier node can introduce arbitrary numbers of sybil nodes into the top tier. This kind of strategy is therefore only sensible if it can be assumed that nodes in can detect such attempts. While both the set of required ”vigilant operators” and the set of potential ”attackers” (that such operators need to be vigilant about) are reduced by converging to a top tier , the requirement for vigilance and the potential manual verification of quorum set changes is still present. Given the lack of explicit incentives for running validator nodes in systems like Stellar, such a burden on the operators of top tier nodes might be viewed as problematic (Kim et al., 2019). However, similar critique can also be voiced against systems (like Bitcoin) that base their security arguments on notions of economic rationality, as economic rationality can also be leveraged by attackers (Ford and Böhme, 2019).
The FBAS paradigm reportedly enables the instantiation of consensus systems with open membership (Mazières, 2015; Lokhava et al., 2019). And clearly, arbitrary nodes can join an FBAS, causing new quorums to be formed that contain them. Based on the preceding discussion, however, we recognize that without creating a new, de-facto disjoint FBAS, or the active reconfiguration of existing nodes, new nodes cannot become part of minimal quorums, and hence minimal blocking sets and minimal splitting sets. In other words, their existence is irrelevant as far as the so far discussed safety and liveness indicators are concerned. In Sec. 4 we defined the notion of a top tier to reflect the set of nodes in an FBAS that is not irrelevant for safety and liveness, i.e., the set of nodes from which minimal quorums, blocking sets and splitting sets are formed. The top tier wields absolute power to censor and block the whole FBAS, and malicious subsets of the top tier are both indispensable and sufficient for causing forks and double spends.
In the following, we investigate the question to what extent this top tier can be considered a group with open membership. How can its power be diluted, by promoting additional nodes to top tier status? Can nodes be ”fired” from the top tier? We make the case that, in general, a top tier can neither grow nor shrink without either the active involvement of existing top tier nodes or a loss of safety guarantees. We base all subsequent projections on the status quo of an FBAS that enjoys quorum intersection despite faulty nodes (a safe FBAS as per the discussion in Sec. 3.3).
As a preliminary remark, recall that, as per Def. 4.5, we define the top tier of an FBAS as the union of all its minimal quorums. is therefore also a quorum and intersects every quorum in .
Let be the top tier of an FBAS that enjoys quorum availability and quorum intersection. Then it is possible, without compromising either quorum availability or quorum intersection, to instantiate a new top tier by changing only the quorum sets of new and old top tier nodes .
Let be the goal top tier. Let be a modification of so that and (footnote 5: Without loss of generality. Clearly, more robust top tier constructions are possible.). As is a quorum w.r.t. , enjoys quorum availability. Therefore, enjoys quorum availability. does not enjoy quorum availability, because no node in is satisfied without and no node in can form a quorum without a node from (otherwise would not have been the top tier w.r.t. , cf. Def. 4.5). There are therefore no quorums w.r.t. that are disjoint from . therefore enjoys quorum intersection iff enjoys quorum intersection, which it (trivially) does. ∎
The situation is less clear if some nodes do not wish to leave . Note, however, that single nodes can always endanger safety via trivial configurations such as . If performed by one or more nodes in , such an act of sabotage can easily have an impact on the safety of large portions of the FBAS.
In the following, we assume a ”self-centered” top tier in the sense that all top tier nodes include only other top tier nodes in quorum sets. Symmetric top tiers (Def. 4.6) have this property, as do top tiers observed in the wild in the Stellar network (cf. Appendix C).
Let be an FBAS that enjoys quorum intersection and has a ”self-centered” top tier such that all top tier quorum slices are comprised of only top tier nodes (). Then it is not possible, without compromising quorum intersection, to instantiate a new top tier by changing only the quorum sets of non-top tier nodes .
Let be the top tier of a new FBAS that enjoys quorum intersection. Let and be the sets of all minimal quorums of and , respectively. As per Def. 4.5, implies that .
Assume there exists a . Then is a quorum w.r.t. and either (a) not a quorum w.r.t. or (b) not minimal w.r.t. . However, we require that the quorum sets of top tier nodes don’t change: . Therefore is a quorum also w.r.t. , contradicting (a). Hence, (b) must hold and there must be a such that (cf. Def. 4.3). As , being a quorum w.r.t. implies it also being a quorum w.r.t. . But then is not minimal w.r.t. , implying and thus again leading to a contradiction. This proves that .
Assume now there exists a and let . As enjoys quorum intersection, and contains members of the ”old” top tier . is a quorum w.r.t. , but cannot be a quorum w.r.t. as otherwise would not be a minimal quorum. There must therefore exist a node with a quorum slice such that (cf. Def. 3.3), i.e., . As , we require that and , which leads to a contradiction since and . It must therefore hold that , and . ∎
Who determines which FBAS nodes get to form the top tier? Our results imply that, if maintaining safety is seen as an untouchable requirement, the top tier of an FBAS at ”iteration” is legitimated by decisions of, exclusively, members of (if none of them cooperates, we lose safety; if all of them cooperate, we don’t). Because of the top tier’s determining importance to the safety, liveness and performance achievable within a given FBAS, open membership in is of little benefit without open membership in .
How closed is the membership in ? It might be sufficient that only some nodes in support a transition to . If reactive QSC policies like Symmetry-Enforcing Higher-Tier Neighbors QSC (Sec. 5.4) are used, for example, one cooperative top tier node might already be enough for growing the top tier in a way that is robust and doesn’t only dilute the relative influence of . How partially supported top tier changes would play out must be investigated based on more specific scenarios. We expect the safe ”firing” of top tier nodes to be especially challenging.
This raises the question: can the safety requirement be weakened? For example, given sufficiently good (out-of-band) coordination between members of , a might be instantiated in which at least enjoys quorum intersection. It is conceivable that novel protocols can be developed, possibly also leveraging the FBAS structure, that reduce the notorious difficulty of such coordinated bottom-up actions.
We demonstrate in this paper that, despite the complexity of the FBAS model, the properties of concrete FBAS instances can be described in a way that is both precise and intuitive, and allows comparisons with more classical byzantine agreement systems. We propose the notions of minimal blocking sets, minimal splitting sets and top tiers to describe which groups of nodes can compromise liveness and safety and out of which group of ”central” nodes they are drawn. While performing exact analyses involves computational problems of exponential complexity, heuristics and appropriately engineered algorithms make it possible to analyze a wide range of interesting FBASs. Using an analysis framework we specifically developed for this work, we also tackle the question of how individual configurations result in global properties. We find that overly strategic configuration policies result in FBASs that are indistinguishable from permissioned systems. Individualistic approaches, on the other hand, cannot guarantee safe results while quickly resulting in systems that are infeasible to analyze. Adding some strategic decision-making at organically emerging top tier nodes offers a potential middle way towards robust FBASs instantiated from the sum of individual preferences.
Independently of the way in which a given FBAS came to be, however, the composition of a once established top tier cannot be influenced without the cooperation of existing top tier nodes, without at the same time threatening safety. This seems to place the FBAS paradigm closer to the ”permissioned consensus” camp than hoped. More investigation is needed to determine the exact impact of bottom-up top tier changes (as in number of nodes affected by a loss of safety or liveness, for example) and to formulate possible coordination strategies to keep such impacts low.
Let be the set of all quorums of the FBAS , be the set of all minimal quorums. All pairs of intersect iff all pairs of intersect.
Since , trivially implies that . The other direction follows because ( being the set of all minimal sets w.r.t. ; s.a. Def. 4.3). If all pairs in intersect, so must therefore all pairs in . ∎
This was previously also shown in (Lachowski, 2019).
Let be the set of all quorums of the FBAS , and be the set of all minimal quorums. If is a blocking set for , then it is also a blocking set for .
is a blocking set for (Def. 4.1). , so that is also a blocking set for . ∎
Let be the set of all quorums of the FBAS , and be the set of all minimal quorums. If is a blocking set for , then it is also a blocking set for .
Let be the set of all quorums of the FBAS , be the set of all minimal quorums, and be the set of all minimal blocking sets. Then each minimal blocking set of the FBAS is minimally blocking w.r.t. , i.e., intersects every minimal quorum and no intersects every minimal quorum .
Let be the set of all blocking sets w.r.t. . Based on Cor. A.2 and Cor. A.3, is exactly the set of all blocking sets for . Hence the set of all minimal sets w.r.t. is exactly the set of all minimal blocking sets w.r.t. and therefore the set of all minimal blocking sets for , or . Likewise, as is the set of all blocking sets w.r.t. , is the set of all minimal blocking sets w.r.t. . ∎
Let be the set of all quorums of the FBAS , be the set of all minimal quorums, and be the set of all minimal splitting sets. Then for each minimal splitting set there are two such that .
Note first the extreme case that does not enjoy quorum intersection. Then and, as a consequence of Cor. A.1 there are such that . In the general case, for each there are such that (Def. 4.4). Let such that and . and must exist as is the set of all minimal sets (Def. 4.3) w.r.t. . Let . , because and . If , would not be a minimal splitting set. Therefore , i.e., there are such that . ∎
Let be the top tier of an FBAS , and be the set of all minimal blocking sets of . Then .
Let be the top tier of an FBAS , and be the set of all minimal blocking sets of . Then for each top tier node there is at least one minimal blocking set such that .
Let be an arbitrary top tier node and an arbitrary minimal quorum such that (recall that as per Def. 4.5). intersects every , as otherwise there would be a such that (i.e., would not be a minimal quorum). Therefore, is a blocking set w.r.t. and is a blocking set w.r.t. . is not a blocking set w.r.t. because it doesn’t intersect . Hence, all such that (and there must be at least one——because is a blocking set w.r.t. ) must contain . Hence the FBAS has at least one minimal blocking set that contains . ∎
Let be the top tier of an FBAS , and be the set of all minimal splitting sets of . Then .
For an FBAS with a symmetric top tier , such that it holds that:
All minimal blocking sets have cardinality .
All minimal splitting sets have cardinality .
Ad item 1: For all with it holds that . Hence, no is a quorum, there are no quorums that are disjoint with and is a blocking set (Def. 4.1). is furthermore a minimal blocking set, as for any it holds that is a quorum (as ), and so is not a blocking set.
Ad item 2: Let be an arbitrary minimal splitting set for . If , there exist two minimal quorums (with cardinality ) that do not intersect. There is then only one and the cardinality of all minimal splitting sets is trivially . In the following, we assume that and therefore enjoys quorum intersection. As per Cor. A.5, there are at least two minimal quorums such that . Let . must be empty, otherwise we could, with an arbitrary find a minimal quorum such that (i.e., is not minimal). It therefore holds that and, since , . ∎
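The two cardinality claims can be checked by brute force for small symmetric top tiers. The sketch below assumes a top tier of n nodes that all use the same flat quorum set with threshold t, so that the minimal quorums are exactly the t-element subsets, and it treats splitting sets as pairwise intersections of minimal quorums (in line with Cor. A.5). The comparison values n - t + 1 and 2t - n are what the proof argument suggests for 2t > n; they are stated here as assumptions, since the exact expressions are not reproduced in the text above.

```python
from itertools import combinations


def minimal_sets(sets):
    """Keep only the sets that contain no other set of the collection."""
    result = []
    for s in sorted(set(sets), key=len):
        if not any(m <= s for m in result):
            result.append(s)
    return result


def symmetric_top_tier_sets(n, t):
    """Minimal blocking and splitting sets of a symmetric top tier (n nodes, threshold t)."""
    nodes = range(n)
    min_quorums = [frozenset(c) for c in combinations(nodes, t)]
    candidates = (
        frozenset(c) for k in range(1, n + 1) for c in combinations(nodes, k)
    )
    blocking = minimal_sets(b for b in candidates if all(b & q for q in min_quorums))
    splitting = minimal_sets(q1 & q2 for q1, q2 in combinations(min_quorums, 2))
    return blocking, splitting


if __name__ == "__main__":
    for n, t in ((4, 3), (5, 4), (7, 5)):
        blocking, splitting = symmetric_top_tier_sets(n, t)
        print(
            f"n={n}, t={t}:",
            "blocking sizes", sorted({len(b) for b in blocking}),
            "splitting sizes", sorted({len(s) for s in splitting}),
        )
```

For n = 4 and n = 7 (sizes of the form 3f + 1 with the threshold n - f) both cardinalities coincide, while for n = 5 the splitting sets are larger than the blocking sets.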
For illustrating aspects of our proposed analysis approach, we will now present and discuss two simple example FBASs. We will use integers to represent distinct nodes, so that implies a node population consisting of the nodes to . We will first completely (and manually) analyse a -node FBAS based on the metrics and approach discussed in Sec. 4 (also with the help of corollaries from Appendix A). We will then discuss the phenomenon of cascading failures (observed in Sec. 5.4) based on a suitable -node FBAS.
Consider the FBAS with and such that: