Cross-blockchain transactions remain one of the most challenging problems in blockchains [dzhao_cidr20]. The root cause lies in the nondeterministic nature of blockchains: a transaction across multiple blockchains might be partially rolled back due to potential forks in any of the participating blockchains, because eventually only one fork survives the competition among miners. While some effort has recently been made to develop hierarchically distributed commit protocols [dzhao_arxiv2002_cbt, vzakhary_arxiv19, mherlihy_podc18] that make multi-party transactions progress, these protocols lack a theoretical foundation to guarantee the completeness of the transactions, that is, that every transaction is eventually either committed or aborted. In other words, there are no systematic methods to reason about the transaction results.
This paper tackles this problem from the perspective of point-set topology. We construct a topology space for the transactions among blockchains and show that this space is homeomorphic to another topology space of static fork graphs induced by those transactions. We then construct a time-varying counterpart of the static fork graphs, namely the growing-fork topology space, and show that a continuous function exists from the growing-fork topology to the static forks. Combined, these tools allow us to reason about cross-blockchain transactions through the growing-fork topology, an intuitive representation of blockchains. To the best of our knowledge, this work is the first study of the point-set topological properties of blockchains.
The remainder of this paper is organized as follows. We review basic concepts and recent studies of blockchains and topology in §2. The system models and assumptions of the proposed topological approach to cross-blockchain transactions are provided in §3. We detail the construction of topology spaces for transactions and blockchain forks in §4. The relationships among the topology spaces are then illustrated through multiple continuous functions in §5. We finally conclude this paper in §6.
2 Background and Related Work
Basic Structure. A blockchain is a linked list replicated on multiple machines. Machines are usually also called nodes, miners, or participants. Each element of the linked list is a block of multiple transactions, each of which involves two encrypted addresses and a specific number (e.g., the Bitcoin amount to transfer). The block’s overall data, plus a puzzle number called a nonce, are hashed into a value that is passed to the next block, which does the same and passes along its own hashed value. Therefore, a linked list of “locked” blocks is formed through the compounded hash values.
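The chaining described above can be sketched in a few lines of Python. This is a minimal illustration rather than the format of any real blockchain: the `Block` fields, the JSON serialization, the placeholder hash for the genesis block, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    transactions: List[dict]  # hypothetical layout, e.g. sender, receiver, amount
    nonce: int                # puzzle number attached to the block
    prev_hash: str            # hashed value passed along from the previous block

    def digest(self) -> str:
        # Hash the block's data together with the nonce and the previous hash.
        payload = json.dumps(
            {"tx": self.transactions, "nonce": self.nonce, "prev": self.prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: List[Block], transactions: List[dict], nonce: int = 0) -> None:
    # The first (genesis) block has no predecessor; use a fixed placeholder hash.
    prev = chain[-1].digest() if chain else "0" * 64
    chain.append(Block(transactions, nonce, prev))


def is_valid(chain: List[Block]) -> bool:
    # Every block must store the digest of the block right before it.
    return all(chain[i].prev_hash == chain[i - 1].digest() for i in range(1, len(chain)))
```

Under these assumptions, changing any transaction in an earlier block changes that block's digest, so `is_valid` fails for the block that follows it; this is the sense in which the blocks are “locked”.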
Mining. The procedure to solve the puzzle, i.e., to find the nonce, is called mining. For instance, Bitcoin requires that the hashed value of the nonce and the current transactions be below some threshold. Because of this inequality requirement, the qualified nonce value is not unique. It is possible that two or more nodes solve the puzzle at the same time, and this is allowed in blockchains. As a consequence, there might exist more than one linked list; these are called forks. Depending on the implementation, the number of forks is expected to reduce back to one. For instance, Bitcoin achieves this by eliminating those forks that are shorter than others. Initially, a blockchain has only one linked list, initiated by the very first block, called the genesis block.
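The threshold condition can be illustrated with a simplified proof-of-work loop in Python. This is a sketch under stated assumptions and not Bitcoin's actual procedure: a single SHA-256 over a string, the `difficulty_bits` parameter, and the function names are hypothetical choices for the example.

```python
import hashlib


def meets_threshold(block_data: str, nonce: int, difficulty_bits: int) -> bool:
    # The hash, read as a 256-bit integer, must fall below the threshold.
    threshold = 1 << (256 - difficulty_bits)
    digest = hashlib.sha256(f"{block_data}|{nonce}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") < threshold


def mine(block_data: str, difficulty_bits: int, start_nonce: int = 0) -> int:
    # Linear search for the first qualifying nonce at or after start_nonce.
    nonce = start_nonce
    while not meets_threshold(block_data, nonce, difficulty_bits):
        nonce += 1
    return nonce
```

Calling `mine(data, 8)` and then `mine(data, 8, first + 1)` returns two different nonces that both satisfy the inequality, which is why two miners can each produce a valid block for the same position and thereby create a fork.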
Pools. When a new block is appended to the linked list, a system reward plus some transaction fee will be transferred to the (encrypted) address of the miner. To improve the chance of winning the mining race, a miner can choose to join a pool of other miners and share the reward with the members of the pool. Since making a profit is all about competition, it is not uncommon for a pool to launch an attack against other pools.
2.2 Modeling Blockchains
Some work leverages game theory to study the behaviors of blockchains, especially pool formation in cryptocurrency. In [ieyal_sp15], Eyal studied, through a noncooperative game, how one pool decides to attack another pool by sending spy nodes. In a similar spirit, Tsabary and Eyal [itsabary_ccs18] showed that a game-theoretical model could be built for scenarios in which transaction fees play a more important role in miners’ behaviors.
Group theory. Another branch of study on blockchain pools takes a group-theoretical approach [dzhao_arxiv2002_gt]. Essentially, the “movement” of miners is modeled as a permutation of the set associated with the blockchain. Such movement was then shown to be a group operation: closed, associative, and invertible. Therefore, many group properties are immediately available to the pool group of blockchains.
2.3 Cross-Blockchain Operations
The first study on operations among an arbitrary number of blockchains was published in [mherlihy_podc18]. Herlihy showed that a simple timeout scheme could enable atomicity in a multi-party operation. However, the operation is not necessarily a transaction: partial changes may still be committed. Admittedly, it is arguable that not all applications require strong transactional properties. The latest findings in this direction can be found in the follow-up work [mherlihy_vldb19].
Another branch of study over cross-blockchain operations indeed focuses on transactions. A recent work called AC3 [vzakhary_arxiv19] employs an extra component (known as witness blockchain) to govern the cross-chain operations. Although the witness blockchain is comprised of the nodes from existing blockchains, still, these virtual nodes on the witness blockchain become the critical components of the entire ecosystem. To overcome this issue, a completely distributed commit protocol was proposed in [dzhao_arxiv2002_cbt].
2.4 Topology and Distributed Systems
We conclude this section by reviewing some basic topology concepts and theorems. A topology of a set $X$ is a collection of subsets of $X$, denoted $\mathcal{T}$, that contains $\emptyset$ and $X$ and is closed under arbitrary unions and finite intersections. One example topology of $X$ is then the power set of $X$, $2^X$, which consists of all the possible subsets of $X$. This is also called the discrete topology of $X$. The tuple $(X, \mathcal{T})$ is called the topology space of $X$. If the context is clear, we often refer to $X$ to indicate the space $(X, \mathcal{T})$. Each of the subsets from $\mathcal{T}$ is called an open set, and the complement of an open set is a closed set by definition. A function $f$ from space $X$ to space $Y$ is called continuous if, whenever $V$ is an open set in $Y$, $f^{-1}(V)$ is an open set in $X$. The composite of two continuous functions is also continuous. If $f$ is bijective and both $f$ and $f^{-1}$ are continuous, we call $f$ a homeomorphism. Because a homeomorphism is defined purely on open and closed sets, two topology spaces are considered equivalent if such a homeomorphism exists. Usually, we expect to migrate a complex problem in one topology space to another such that the problem can be solved more efficiently or more intuitively. The aforementioned concepts and techniques are also referred to as point-set topology.
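On a finite set, these definitions can be checked mechanically. The following Python sketch is only an illustration of the concepts reviewed above; the function names and the representation of a topology as a set of `frozenset` objects are assumptions of the example.

```python
from itertools import chain, combinations
from typing import Dict, FrozenSet, Hashable, Iterable, Set

OpenSets = Set[FrozenSet[Hashable]]


def power_set(points: Iterable[Hashable]) -> OpenSets:
    # All subsets of the given points: the discrete topology.
    items = list(points)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(s) for s in subsets}


def is_topology(points: Iterable[Hashable], opens: OpenSets) -> bool:
    whole = frozenset(points)
    if frozenset() not in opens or whole not in opens:
        return False
    if any(not o <= whole for o in opens):
        return False
    # On a finite set, closure under pairwise unions and intersections suffices.
    return all((a | b) in opens and (a & b) in opens for a in opens for b in opens)


def is_continuous(f: Dict[Hashable, Hashable], opens_x: OpenSets, opens_y: OpenSets) -> bool:
    # f is continuous if the preimage of every open set of Y is open in X.
    def preimage(v: FrozenSet[Hashable]) -> FrozenSet[Hashable]:
        return frozenset(x for x, y in f.items() if y in v)

    return all(preimage(v) in opens_x for v in opens_y)
```

For instance, `is_topology(X, power_set(X))` holds for any finite `X`, and every function out of a discrete space passes `is_continuous`, because every preimage is a subset and every subset is open.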
Point-set topology is the main technique used in this paper; leveraging point-set topology in distributed computing dates back to the 1980s [balpern_ipl85] and was recently revived in [tnowak_podc19]. We will show how to reason about the cross-blockchain transaction problem with a series of continuous functions constructed from the more intuitive topological properties of the blockchain forks. To the best of our knowledge, this paper is the first work taking a point-set topological approach to study multi-party transactions among blockchains.
In addition to point-set topology, there are algebraic-topological methods to study those problems that are better modeled in a geometric sense. These methods, usually categorized into homotopy groups and homology groups, study the “smaller” pieces of the target objects and try to “map” the geometrical objects into algebraic objects, such as groups. The smaller pieces are loop curves and triangle patches (simplexes, more formally) for homotopy groups and homology groups, respectively. Then, the equivalence between two topology spaces can be investigated by checking the algebraic groups, usually through homomorphic groups. A good review of algebraic topology methods applied to distributed computing can be found in [mherlihy_book13]; some of the hardest problems were shown to be elegantly solvable through algebraic topology [mherlihy_dc13, mherlihy_podc10, rguer_tcs09, mherlihy_jacm99]. This paper does not touch any of these algebraic topology methods, although we might explore them in the near future.
3 System Models and Assumptions
We assume each blockchain has a nontrivial number of nodes. That is, each blockchain is a cluster of nodes. We will use blockchain and cluster interchangeably. The set of blockchains is denoted , where each blockchain is , , .
We assume each cluster can spawn an arbitrary number of forks. Although two forks are most commonly seen in cryptocurrency, it is not uncommon to have more forks if the underlying consensus is customized for domain-specific usage, e.g., scientific data provenance [aalmamun_bigdata18]. A blockchain fork is defined as follows in this paper.
Definition 3.1 (Blockchain Fork).
Let denote the set of all forks, each of which, denoted , resides on a cluster . Initially, all ’s have a single fork, initiated by the genesis block. When () nodes succeed in a specific round, the blockchain will spawn new forks (). We use to denote the complement set of fork in . Each fork will have one of the following three states. is called eliminated if any of the other forks suppresses . is confirmed if all other forks are eliminated. All remaining forks are called undecided.
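The three states can be made concrete with a short Python sketch. The function name `classify_forks`, the string fork identifiers, and the `suppressed` input set are hypothetical; in particular, how one fork comes to suppress another depends on the consensus protocol and is not modeled here. The sketch also assumes that a lone fork counts as confirmed, since it has no competing forks.

```python
from enum import Enum
from typing import Dict, Set


class ForkState(Enum):
    ELIMINATED = "eliminated"
    CONFIRMED = "confirmed"
    UNDECIDED = "undecided"


def classify_forks(forks: Set[str], suppressed: Set[str]) -> Dict[str, ForkState]:
    """Assign one of the three states to every fork of a single cluster.

    `suppressed` is a hypothetical input: the forks that some other fork
    has already suppressed.
    """
    states: Dict[str, ForkState] = {}
    for fork in forks:
        others = forks - {fork}
        if fork in suppressed:
            states[fork] = ForkState.ELIMINATED
        elif others <= suppressed:
            # Every competing fork is eliminated (vacuously true for a lone fork).
            states[fork] = ForkState.CONFIRMED
        else:
            states[fork] = ForkState.UNDECIDED
    return states
```

With two forks of which one is suppressed, the survivor is classified as confirmed; with three forks of which one is suppressed, the two survivors stay undecided until one of them suppresses the other.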
With this definition, each fork can be categorized into one of the three possible states. Therefore, we can construct a fork graph among the elements of with three types of vertices/nodes. Note that this graph is static; we say nothing about the timestamp or step number in this definition. Now we are ready to define the transactions among these forks.
Definition 3.2 (Transaction Proxy).
A transaction proxy on a blockchain, or, equivalently, a fork graph, is a node of that fork graph where the transaction is carried out. A proxy is called live if its fork is undecided or confirmed.
Therefore, a transaction is a collection of states on the proxies from the fork graphs. The transaction is time-oblivious since users are only interested in the final result of the transaction: commit, pending, or abort.
Nevertheless, a blockchain is indeed a dynamic data structure that changes over time. To this end, we introduce the extended concept of the fork graph, the so-called growing-fork graph.
Definition 3.3 (Growing-fork Graph).
A growing-fork graph, , is a sequence of fork graphs associated to the cluster . Let denote the fork graph of cluster at time , then
With the above terms defined, we are ready to construct the topology spaces associated to the transaction proxies, the static fork, and the real-time blockchain forks.
4 Topology Spaces
4.1 Topology Space of Cross-Blockchain Transactions
Let denote the set of clusters each of which represents a blockchain, . Define the ternary distance between any pair of elements in , and , as a map as follows:
We illustrate the three scenarios in Figure 1. The two proxies in the green transaction can both commit since there is no fork at the moment. The two proxies in the orange transaction can safely abort because both forks from and are to be eliminated. We cannot decide the result of the yellow transaction because there are forks involved. A transaction might involve more than two clusters, say , . The states of a -party transaction, i.e., a -cluster transaction, are defined similarly to the two-party case.
is a metric on .
Obviously, by definition. It remains to show , for arbitrary , .
If , it means both and abort. If commits, then
If aborts, then
If , then we know one blockchain commits the transaction and the other aborts. Without loss of generality, we assume commits and aborts in the following. If commits, then
If aborts, then
If , meaning that both and commit, it is also easy to check the triangle inequality. If commits, then
If aborts, then
Therefore, again, . ∎
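A case analysis of this kind can be cross-checked by brute force on a small finite set. The following Python sketch tests the standard metric axioms exhaustively; the function name `is_metric` is hypothetical, and the distance functions mentioned in the usage note are stand-ins for illustration, not the ternary distance defined in this section.

```python
from itertools import product
from typing import Callable, Hashable, Sequence


def is_metric(points: Sequence[Hashable], dist: Callable[[Hashable, Hashable], float]) -> bool:
    for x, y in product(points, repeat=2):
        # Non-negativity and symmetry.
        if dist(x, y) < 0 or dist(x, y) != dist(y, x):
            return False
        # Distance zero holds exactly for identical points.
        if (dist(x, y) == 0) != (x == y):
            return False
    # Triangle inequality over all ordered triples.
    return all(
        dist(x, z) <= dist(x, y) + dist(y, z)
        for x, y, z in product(points, repeat=3)
    )
```

For example, the 0/1 distance `lambda a, b: 0 if a == b else 1` passes on any finite set, whereas the squared difference `lambda a, b: (a - b) ** 2` on `[0, 1, 2]` fails the triangle inequality.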
Topology space on is induced by .
Let , then the open ball around the origin , an open set in the topology. Because all the possible distances between any clusters are at least , this -ball splits the cluster into a collection of single-element sets, each of which is a set of a single cluster , . That is,
Evidently, is a basis of : every element in belongs to exactly one element in , and an open set in the topology is simply an arbitrary union of . ∎
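The ball argument can be illustrated on a finite set with a short Python sketch. The cluster names and the 0/1 stand-in distance in the usage note are hypothetical; the only property the sketch relies on is that distinct points are a strictly positive distance apart, so a radius below the smallest such distance isolates every point.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Hashable, List, Sequence

Dist = Callable[[Hashable, Hashable], float]


def open_ball(center: Hashable, radius: float, points: Sequence[Hashable], dist: Dist) -> FrozenSet[Hashable]:
    # All points strictly closer to the center than the radius.
    return frozenset(p for p in points if dist(center, p) < radius)


def singleton_basis(points: Sequence[Hashable], dist: Dist) -> List[FrozenSet[Hashable]]:
    # Choose a radius strictly smaller than every distance between distinct points
    # (requires at least two points and positive distances between them).
    min_gap = min(dist(p, q) for p, q in combinations(points, 2))
    epsilon = min_gap / 2
    return [open_ball(p, epsilon, points, dist) for p in points]
```

With `points = ["C1", "C2", "C3"]` and the 0/1 distance, `singleton_basis` returns the three singletons; arbitrary unions of these singletons give every subset, i.e., the discrete topology.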
Having constructed the topology of transactions among blockchains, we will start building a topology from the perspective of blockchain forks.
4.2 Topology Space of Fork Graphs
Definition 4.2 (Fork Distance).
When two transaction proxies are both live on two fork graphs and , respectively, the fork distance is defined as
If any of the two proxies is not live, we define .
As an example, we calculate some fork distances of transactions on Figure 1 as follows ( and representing the fork graphs of and , respectively):
Now, we will show that is a metric on the set of fork graphs.
Fork distance is a metric on .
If any proxy is not live, say that on , then (i) , (ii) , and (iii) .
If both proxies are live, then:
Now we are ready to construct the topology of fork graphs.
Proposition 2 (Fork Space).
The topology space over the fork set is induced by .
Let . Then let be an open ball around the center of with a radius of , namely an -ball. We will show that these -balls form a discrete topology: every open set induced by an open -ball is a singleton subset of exactly one fork graph in . That is, we need to show that for any pair of fork graphs. Note that by definition, . Therefore, it suffices to show that . Indeed, that is how we construct , because
Therefore, is “small” enough to split the space into a series of elements whose distance exceeds the boundaries of those -balls. That is, we now have a collection of singleton open sets each of which comprises exactly one fork graph:
is a basis because any element in belongs to a subset of and the intersection between any two subsets is empty. ∎
Now we are ready to study the time-varying topology in terms of blockchain forks, the so-called growing-fork graphs. Essentially, we extend the fork space with a time dimension.
4.3 Topology Space of Growing-fork Graphs
We extend the fork graph with a timestamp to model the real-time fork topology. A fork graph at time is denoted ; the set of infinite fork graphs is denoted , by the naming convention of point-set topology. That is, . then represents the ever-growing fork topology of cluster .
As before, we first define the metric over the growing-forks.
Definition 4.3 (Growing fork distance ).
where , , . Essentially, tracks the smallest possible number of forks between two growing-fork graphs once any of the two starts forking. Note that all blockchains initially have a single fork initiated by the genesis block.
To make matters more concrete, we illustrate the metric in Figure 2, where there are three blockchains () in the first three steps (). Here are some example calculations:
As before, we will prove that is indeed a metric:
is a metric on .
by definition. It remains to show . Define to be the smallest index such that .
If , it means .
If , then we know that starts to fork earlier than and . That makes
Note that since both and start forking at the same time, we have
If , this means starts to fork after and . Then the two distances and depend on . We can similarly calculate that
and draw the same conclusion as above.
If , then all three growing-forks start forking at the same time. Then we have
Similarly, we also have
Therefore, we have
The triangle inequality is thus satisfied.
If , without loss of generality we assume in the following.
If , , or , obviously we have
If , then we know
So, we have
If , then we know
So, we have
Therefore, again, the triangle inequality is satisfied.
Lastly, we show that defined as such induces the topology over .
Topology space on is induced by .
Let , , , where is the smallest time index such that in the infinite sequence . Then an open ball is fine enough to isolate each element in because:
for any . Therefore, we found a basis of space :
Now we have constructed three topology spaces for time-varied forks, for static forks, and for transactions among blockchains. In the next section, we will extract the relationship across these three spaces.
5 Commutative Morphism from Growing-fork Space to Transaction Space
We will show that a commutative morphism exists among the three spaces, as follows. That is, we will construct and , respectively, and show that both maps are continuous. We can then reason about the transaction status by studying the growing-fork topology using the composite function .
5.1 Homeomorphism between Fork Space and Transaction Space
Let denote the basis for the fork space defined in Proposition 2, and denote the basis for the transaction space in Proposition 1. We use to indicate the topology space induced from , i.e., . Similarly, with a slight abuse of notation, we use to indicate the topology space induced from , i.e., , if it is clear from the context. We will construct a bijective function , and show that both and are continuous.
Let be an open set. This, by definition, means that is an arbitrary union of finite intersections among elements in . Because is discrete, is simply a set of arbitrary selections of ’s, denoted , where if , and . Define to be the index set: , and denote . Now, we define to be the set with the same set of indexed clusters: . Then, we construct the map as .
is a homeomorphism between and .
We need to show that and are (i) bijective and (ii) continuous.
is bijective. If two elements are distinct and map to the same , then the subsets and in the index set consist of the same indices, which means . Therefore, is injective. Conversely, for any , there must exist a subset by definition. As a consequence, there must be an element in that is induced by the subset , again by definition. Therefore, is surjective. As a result, is bijective.
is continuous. , an open set in the transaction space, we will show that is an open set in the fork space . Because is bijective, we know there must exist a unique value for . Let , . Note that is a discrete topology; therefore is an open set in .
is continuous. The proof is similar to that for ; we omit it here.
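The index-preserving construction can be checked exhaustively for a small number of clusters. The Python sketch below is an illustration under stated assumptions: the names `F1`, `F2`, ... for fork graphs and `C1`, `C2`, ... for clusters, as well as the function names, are hypothetical, and both spaces are taken to be discrete as in the propositions above.

```python
from itertools import chain, combinations
from typing import FrozenSet, Iterable, Set


def power_set(items: Iterable[str]) -> Set[FrozenSet[str]]:
    # All subsets: the open sets of a discrete topology.
    elems = list(items)
    subsets = chain.from_iterable(combinations(elems, r) for r in range(len(elems) + 1))
    return {frozenset(s) for s in subsets}


def index_map(open_set: FrozenSet[str]) -> FrozenSet[str]:
    # Send {F_i : i in I} to {C_i : i in I}, keeping the index set I unchanged.
    return frozenset("C" + name[1:] for name in open_set)


def inverse_index_map(open_set: FrozenSet[str]) -> FrozenSet[str]:
    # Send {C_i : i in I} back to {F_i : i in I}.
    return frozenset("F" + name[1:] for name in open_set)
```

For three clusters, mapping all eight open sets of the fork space through `index_map` yields eight distinct sets that together form the power set of the clusters, and `inverse_index_map` undoes the map. Since every subset is open in a discrete space, preimages of open sets are open in both directions.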
Essentially, there is an equivalence between the transaction status and the transaction’s proxies on involved clusters in terms of forks, from a topological point of view. Therefore, we just proved a stronger result than we need:
Next, we will show an equivalence between such a “static” fork space and the entire ever-growing forks, topologically.
5.2 From Growing-Fork Space to Fork Space
This section constructs a continuous function to “flatten” the infinite sequences of growing-fork spaces into a static fork space. Intuitively, we will show that the time factor in the growing-fork space can be topologically preserved in the static fork space.
We define a map such that for a set of infinite sequences of forks , is the set of fork graphs of those transactions that incur the first fork, denoted . If the blockchain never spawns a fork, we define the corresponding element in as .
Let be an open set, i.e., an arbitrary union of finite intersections of fork graphs. Since is a discrete topology, is a subset of , . By definition of , we know there exists a set of transactions that trigger the first forks on every fork element in .
Recall that the topology is induced by the open ball , where and denotes the distance “just” smaller than the distance imposed by a transaction causing the first fork. That is, every singleton element in the basis, i.e., is associated with a transaction . Therefore, we have
By definition, we know is an open set in . From the above condition, is a union of ’s:
Since a union of open sets is an open set, is an open set in .
Therefore, the following morphism holds:
Since both and are continuous, then the composite is also continuous and the following is true:
6 Final Remark
This paper constructs a topology space for the transactions among blockchains, and shows that this space is homeomorphic to another topology space of static fork graphs induced by those transactions. Further, this paper constructs a time-varying counterpart of the static fork graphs, namely the growing-fork topology space, and proves that a continuous function exists from the growing-fork topology to the static forks. Combined together, these results allow us to reason about the cross-blockchain transactions through the growing-fork topology, an intuitive representation of blockchains.
This work is supported by the U.S. Department of Energy (DOE) under contract number DE-SC0020455. This work is also supported by a research award from Amazon and a research award from Google.