1. Introduction
While recent advances in blockchain systems, notably in the form of cryptocurrencies, have drawn tremendous interest from both researchers and practitioners (Wang and Wang, 2019; Lind et al., 2019), few studies addressed the theoretical foundation of blockchains until 2017, when Herlihy (Herlihy, 2017) first drew the connection between blockchains and distributed computing. Admittedly, the original blockchain paper authored by Nakamoto (Bitcoin, Accessed 2020) lacked formal proofs, and yet Bitcoin, as the brand name of public blockchains, has proven to be a stable production system. As in many physical sciences, not every discipline started with a rigorous theory before its applications spread widely; in fact, it often happened the other way around: the internal law was found after people had observed many instances for a while. We believe blockchains will follow the same trajectory: having witnessed many successful instances of blockchains, now might be the time to develop, or extract, their internal laws.
This paper takes a rigorous mathematical approach to better understand blockchains’ fundamental properties. More specifically, we are interested in unraveling the algebraic structures of blockchains, that is, the underlying commonality among production blockchain systems. We are aware of a parallel line of study on blockchain theory from a game-theoretic standpoint (e.g., Tsabary and Eyal, 2018), which focuses on the rationality and equilibrium of blockchain nodes. In contrast, our line of work aims to reveal the intrinsic laws of blockchain states and actions through algebraic structures.
In the remainder of this paper, we provide a very high-level introduction to blockchains and algebraic groups. Then, we define the components needed for the axiomatic construction of blockchain groups. After that, we show that the construction indeed forms a well-defined algebraic group and, more importantly, derive some interesting properties that can potentially be considered in real-world blockchain design and analysis. We finally conclude the paper and discuss future research directions.
2. Background
2.1. Blockchains
A blockchain is a replicated database deployed on a distributed system. By default, each node of the distributed system holds a full copy of the data, usually in a transactional form. Each replica of the database is organized as a hash-linked list of blocks of transactional data, such that the data cannot be tampered with unless the entire chain is reconstructed from scratch, which is computationally and financially prohibitive. Since the overall system is replicated, it employs a consensus protocol for all replicas to agree on the data.
A rough categorization of blockchains is based on user membership: if the system is open to everyone, the blockchain is called permissionless; otherwise, it is called permissioned. Most production blockchain systems are permissionless, and they are the focus of this paper. At the time of writing, the most popular blockchain system, Bitcoin, comprises about 9,800 nodes (Bitcoin Scale, Accessed 2020). Each such node might take the form of a coalition of many physical machines, namely a pool (Eyal, 2015).
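The tamper-evidence property of a hash-linked list can be illustrated with a minimal sketch. The block layout, the all-zero genesis hash, and the helper names `block_hash`, `build_chain`, and `is_valid` are assumptions made for illustration; they are not taken from any production blockchain.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # assumed placeholder for the first block's predecessor


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents together with the hash of its predecessor."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Link each batch of transactions to the previous block through its hash."""
    chain, prev_hash = [], GENESIS_PREV_HASH
    for index, transactions in enumerate(batches):
        digest = block_hash(index, prev_hash, transactions)
        chain.append(
            {"index": index, "prev_hash": prev_hash,
             "transactions": transactions, "hash": digest}
        )
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash and check every link from the first block onward."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain([["a->b:1"], ["b->c:2"], ["c->a:3"]])
```

Changing the transactions of any block invalidates its stored hash, so an attacker would have to recompute that block and every block after it, which is the reconstruction cost described above.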
2.2. Algebraic Groups
The algebraic group is one of the most fundamental abstract algebraic structures and the basis of many derived structures such as rings and fields. Essentially, a group is a set of elements along with a binary operation defined on these elements: the set is closed under the operation, the operation is associative, an identity element exists, and every element has an inverse such that the product of the element and its inverse is the identity element. A group is usually constructed axiomatically: first define the set and the binary operation, and then show that all the aforementioned properties are satisfied.
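For a finite set, the axiomatic check described above can be carried out by brute force. The following is a minimal sketch; the function name `is_group` and the modular-arithmetic examples are illustrative assumptions, not part of the paper's construction.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of the group axioms on a finite set."""
    elements = list(elements)
    # Closure: the operation never leaves the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (a*b)*c == a*(b*c) for every triple.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Identity: some e with e*a == a*e == a for every a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Inverse: every a has some b with a*b == b*a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)


def add_mod_5(a, b):
    return (a + b) % 5


def mul_mod_5(a, b):
    return (a * b) % 5
```

Under these assumptions, the integers 0 to 4 form a group under addition modulo 5, whereas under multiplication modulo 5 they do not, because 0 has no inverse; removing 0 restores the group property.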
One of the most interesting groups is the symmetric group on a set, whose elements are the permutations of the elements of that set, and whose binary operation is simply the function composition of two permutations. Historically, early group-theoretical studies focused on the internal structure of symmetric groups because they have a rich set of properties that can be applied to real-world disciplines such as computational chemistry and theoretical physics. We will also leverage symmetric groups in our study on blockchain groups.
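As a small illustration, the symmetric group on three elements can be enumerated directly. In this sketch a permutation is represented as a tuple `p` in which `p[i]` is the image of `i`; this representation and the helper name `compose` are choices made here for illustration.

```python
from itertools import permutations


def compose(p, q):
    """Return the composition p after q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


S3 = list(permutations(range(3)))  # all 3! = 6 permutations of {0, 1, 2}

swap_01 = (1, 0, 2)  # exchanges 0 and 1
swap_12 = (0, 2, 1)  # exchanges 1 and 2

left = compose(swap_01, swap_12)   # (1, 2, 0)
right = compose(swap_12, swap_01)  # (2, 0, 1)
```

The two compositions differ, so the order in which permutations are composed matters: this symmetric group is not commutative.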
3. Definitions
We use to denote the set of all nodes in the blockchain system. We denote the set of all mining pools or clusters by . Of note, represents a dummy set of all the singleton nodes, i.e., those nodes that decide to mine the block individually without joining any pools. In practice, we can assume as most nodes choose to join a pool.
Definition 3.1 (Node-switch map).
We assume that any node can freely switch from one pool to another (which is true for permissionless blockchains) through a node-switch map of an arbitrary node from to , , such that:
Note that we do not require for a node-switch; the map is well defined even if the node stays in the same pool. We will simply say if it is clear from the context that both and are indices for pools. We use to denote a set of node-switches covering all the nodes exactly once, where . Obviously, we have for all ’s. We call a node-switch set (of index ).
Definition 3.2 (Pool-update map).
When the membership of a specific pool is updated, e.g., a new node joins or an existing node leaves, we call such a change a pool-update, which is formally defined as a map between node-switch sets.
We define the set of all possible maps ’s (among ’s) as . It should be clear that it is the map between node-switch sets ’s, not per se, that serves as an element of . As an analogy, in the well-known symmetric group , it is the permutation of a series of numbers, rather than the series itself, that is considered an element of . We then define as a function composition between two ’s among ’s. Obviously, we have . Formally:
Note that is well defined because for any , all elements in appear on some source pools (cf. Def. 3.2, ). If the context is clear, we will simply write to denote that the two elements in operate under .
Now we are ready to show that the set of pool-update maps forms an algebraic group under function composition.
4. Group-Theoretical Internals of Blockchains
4.1. Axiomatic Construction
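It boils down to demonstrating the following axioms for the set of pool-update maps to be a group under function composition:
(i) The operation is associative: composing any three elements gives the same result regardless of how they are parenthesized;
(ii) An identity element exists such that composing it with any element, on either side, returns that element;
(iii) For any element, there exists an inverse counterpart such that their composition, in either order, is the identity element.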
We will show that all aforementioned axioms hold.
4.1.1. Associativity
Let be any node in the blockchain. Let be any node switch in map . Recall that there are a total of node switches in . By definition of , there must exist one and only one node switch from in : , where is the destination pool . Then by definition of function association, we know is in . Now, without loss of generality, let be a node switch in . By definition of , we have that is a node switch in . Note that both and are arbitrary indices of pools.
Similarly, if we know is a node switch in and is a node switch in , respectively, we then know that is a node switch in . Consequently, if we know is a node switch in , then we have, again, that is a node switch in .
We have thus shown that function composition is associative on the set of pool-update maps.
4.1.2. Identity
We construct the identity element from trivial node switches, each of which keeps its node in its current pool. Obviously, any pool update is mapped to its original structure after applying the identity element, under both left-side and right-side function composition. It should be noted, again, that each element of the group is a map over the set of node switches, not the node switches themselves.
4.1.3. Inverse
For an arbitrary , each of the node switches can be written as . Because is the set including all the possible maps between pool-updates, there must exist a unique whose elements can be exactly written in this form: . Then, for , each node switch follows , comprising ; similarly, for , each node switch follows , again, comprising .
Remark. By construction, the group constructed from and is not commutative, or nonabelian in the literature of group theory. We will denote such a blockchain group as , where and . From the above discussion, we know the order of is .
4.2. Algebraic Properties and Applications
This section presents some important properties implied by the nonabelian blockchain group .
4.2.1. Subgroups, lattices, normal subgroups, and kernels
One of the most notable properties exhibited by lies in its order . We thus can rewrite it as follows:
where is a prime, for and for any we have . Note that, by this factorization, we have for and , where reads cannot divide . It follows that
which is exactly the form well studied by Sylow’s Theorem: if a group can be written in this form, we know that there must exist a subgroup of order in , where . Since there are such primes, we know that has at least subgroups. Consequently, we know that has a nontrivial lattice of subgroups. This result itself could be useful for applications such as cryptography, potentially leading to a new interdisciplinary research area: leveraging blockchain’s internal algebraic structure for encryption.
It would be highly useful if we could know how many of these subgroups are normal, each of which essentially corresponds to a kernel of that is widely used in group-theoretical applications. However, without instantiating and , it is not analytically feasible to give the solution, and this is particularly true if or or both are medium or large numbers. Nonetheless, we want to point out that for small-order subgroups, one can leverage Cayley’s Theorem and its corollary: if is the smallest prime that divides , then for any subgroup of , denoted as , if , then is normal, denoted as . Therefore, for small , such as 2, which is very likely included in the prime factors of , we can determine whether a subgroup by checking . If so, we will then have a lot of important applications built upon the kernel .
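The factorization step behind this argument can be sketched numerically. Sylow's first theorem guarantees, for each prime in the factorization of a finite group's order, a subgroup whose order is the full power of that prime. The helper names `prime_factorization` and `sylow_subgroup_orders` are hypothetical, and the sample order 360 is an arbitrary illustration rather than the order of any blockchain group from the text.

```python
def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 2 by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:  # whatever remains is a single prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def sylow_subgroup_orders(group_order):
    """Orders of the Sylow p-subgroups guaranteed for a group of this order."""
    return {p: p ** k for p, k in prime_factorization(group_order).items()}


# 360 = 2^3 * 3^2 * 5, so subgroups of orders 8, 9, and 5 must exist.
example = sylow_subgroup_orders(360)
```

The number of distinct primes in the factorization gives a lower bound on the number of nontrivial subgroups, which is the counting argument used above.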
4.2.2. Coset order and element order
According to Lagrange’s Theorem, the number of cosets of a subgroup in the blockchain group, essentially the number of distinct translates of the subgroup by group elements, equals the order of the group divided by the order of the subgroup. This can be translated into the blockchain network as follows: if we know that the active pool-update maps are contained in a subgroup, then we can quickly determine exactly the number of (much fewer) possibilities that the node switches can lead to.
Next, we show that some elements (i.e., pool-update maps) have interesting cyclic properties. This is particularly useful given that the blockchain group is not cyclic in general. According to Cauchy’s Theorem, if a prime number divides the order of a group, then the group must have an element whose order is that prime. Essentially, this means that the blockchain group contains at least as many such elements as there are distinct primes dividing its order, each of which degenerates to the identity element when composed with itself enough times. More formally, for each prime dividing the order of the group, there exists a non-identity element whose repeated composition with itself, taken that prime number of times, yields the identity element.
Intuitively, this means that some multiplications of pool-update maps would eventually result in the trivial map, that is, no node switch at all.
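The coset count can be checked on a small concrete group. The sketch below uses the symmetric group on three elements as a stand-in, since it is small enough to enumerate; the tuple representation of permutations and the helper names `compose` and `left_cosets` are assumptions made for illustration.

```python
from itertools import permutations


def compose(p, q):
    """Return the composition p after q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def left_cosets(group, subgroup):
    """Collect the distinct left cosets g*H of a subgroup H in a group."""
    return {frozenset(compose(g, h) for h in subgroup) for g in group}


S3 = list(permutations(range(3)))  # group of order 6
H = [(0, 1, 2), (1, 0, 2)]         # subgroup of order 2: identity and one swap

cosets = left_cosets(S3, H)
```

Here the six group elements fall into three cosets of two elements each, matching the quotient of the two orders.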
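Cauchy's Theorem can likewise be observed on a small stand-in group. The sketch below computes the order of every element of the symmetric group on three elements, whose order 6 is divisible by the primes 2 and 3; the helper names `compose` and `element_order` are illustrative assumptions.

```python
from itertools import permutations


def compose(p, q):
    """Return the composition p after q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def element_order(p):
    """Smallest k >= 1 such that composing p with itself k times gives the identity."""
    identity = tuple(range(len(p)))
    power, k = p, 1
    while power != identity:
        power = compose(p, power)
        k += 1
    return k


S3 = list(permutations(range(3)))
orders = {p: element_order(p) for p in S3}
orders_present = set(orders.values())
```

Elements of order 2 (the swaps) and of order 3 (the 3-cycles) both appear, one for each prime dividing the group order, and each returns to the identity after that many compositions.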
4.2.3. Homomorphism to Symmetric Group
We conclude this section with a sketch of the intrinsic relationship between the blockchain group and the well-studied symmetric group. As a starting point, we re-emphasize Cayley’s Theorem, which states that every group is isomorphic to a subgroup of a symmetric group. That is to say, the blockchain group is structurally identical, up to the operation and a one-to-one mapping, to a subgroup of the well-understood symmetric group. More specifically, we know that the blockchain group is part of the subgroup lattice of that symmetric group. Historically, a subgroup of a symmetric group is also called a permutation group. However, it should be noted that working directly on the symmetric group is prohibitively costly: its order is the factorial of the order of the blockchain group. Using Stirling’s approximation, we have:
If we have a 10-node tiny blockchain with two pools, and , the order of the blockchain group is manageable: ; and yet, the corresponding symmetric group has an order of: , which is computationally infeasible. As a side note, SHA-256, the state-of-the-art hash function for many production blockchain systems, processes its input in 512-bit blocks and returns 256-bit outputs; the order of a tiny-scale group thus, as we just showed, already hits such a high security level. Therefore, in the following, we provide analytical insights on the relationship between the blockchain group and the symmetric group.
The key correlation between the blockchain group and the symmetric group lies in the structure of pools in the blockchain. Although we differentiate the elements in the set of pools into , what really makes these elements different is their membership of ’s between node-switch sets ’s. Specifically, if a map only updates the pool index with the node membership unchanged within each pool, then the new blockchain is essentially a permutation of the original one up to the pool topology. Formally, in any , if for any subset and any we have both and . With this constraint, the blockchain group degenerates to a symmetric group at the granularity of pools: . It should be noted that this result is not applicable to a general blockchain group.
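For reference, the standard form of Stirling's approximation is given below. The symbol $k$ is an assumption introduced here to stand for the order of the group whose symmetric group is being considered; it is not notation taken from the paper.

```latex
\[
  k! \;\sim\; \sqrt{2\pi k}\,\left(\frac{k}{e}\right)^{k},
  \qquad\text{equivalently}\qquad
  \ln k! \;=\; k\ln k - k + \tfrac{1}{2}\ln(2\pi k) + O\!\left(\tfrac{1}{k}\right).
\]
```

The dominant term $k \ln k$ shows that the order of the symmetric group grows faster than any exponential in $k$.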
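The gap between the two orders can be gauged numerically. The sketch below compares the exact base-10 logarithm of a factorial with its Stirling estimate. The group order `2 ** 10` used in the example is purely a hypothetical value chosen for illustration, not a figure taken from the text, and the function names are likewise assumptions.

```python
import math


def log10_factorial(k):
    """Base-10 logarithm of k!, via the log-gamma function."""
    return math.lgamma(k + 1) / math.log(10)


def log10_stirling(k):
    """Base-10 logarithm of Stirling's estimate sqrt(2*pi*k) * (k/e)**k."""
    natural_log = 0.5 * math.log(2 * math.pi * k) + k * (math.log(k) - 1)
    return natural_log / math.log(10)


hypothetical_order = 2 ** 10  # assumed group order, for illustration only

exact_magnitude = log10_factorial(hypothetical_order)
stirling_magnitude = log10_stirling(hypothetical_order)
sha256_output_space = 256 * math.log10(2)  # log10 of 2**256
```

Even for this small hypothetical order, the factorial has thousands of decimal digits, far beyond the roughly 77 digits of the SHA-256 output space, and the Stirling estimate agrees with the exact logarithm to within a small fraction of a digit.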
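One possible reading of this constraint can be sketched in code: if an update only renames pool indices, the partition of nodes into pools is untouched, and successive renamings compose like permutations of the pool indices. The dictionary representation of a node-to-pool assignment and the helper names `relabel` and `partition` are assumptions made for illustration, not the paper's formal construction.

```python
def relabel(assignment, sigma):
    """Rename pool indices according to the permutation sigma (old index -> new index)."""
    return {node: sigma[pool] for node, pool in assignment.items()}


def partition(assignment):
    """Group nodes by pool and forget the pool labels."""
    groups = {}
    for node, pool in assignment.items():
        groups.setdefault(pool, set()).add(node)
    return {frozenset(members) for members in groups.values()}


# Hypothetical 5-node blockchain with three pools indexed 0, 1, 2.
assignment = {"n1": 0, "n2": 0, "n3": 1, "n4": 2, "n5": 2}

sigma = {0: 1, 1: 2, 2: 0}  # cyclic renaming of pool indices
tau = {0: 0, 1: 2, 2: 1}    # exchanges pool indices 1 and 2

# Applying sigma and then tau equals applying the composed permutation once.
tau_after_sigma = {pool: tau[sigma[pool]] for pool in sigma}
```

Under this reading, the renamings behave exactly like permutations of the pool indices, which is the sense in which the constrained group acts as a symmetric group at the granularity of pools.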
5. Final Remark
This paper presents the first study on the algebraic structure of blockchains with an emphasis on the internal properties under algebraic groups. We axiomatically construct a blockchain group and derive some interesting properties that can potentially be taken into the design space and parametric analysis of real-world blockchain systems. Specifically, we show that (i) a blockchain group comprises nontrivial subgroups and lattices that can possibly be leveraged for cryptography; (ii) although a blockchain group is noncyclic in general, there exist cyclic elements in it, which can help us reduce space in some scenarios; and (iii) a blockchain group is homomorphic to the well-studied symmetric group if some constraints hold, thus opening the door to applying the wisdom of symmetric groups to blockchain groups.
Our future work lies in the development of algebraic structures among multiple blockchains. For instance, cross-blockchain transactions (Zhao, 2020) might be analogous to group actions, which can possibly be modeled by the orbits with the conjugate entities from distinct blockchains. As another example, it would be worthwhile to explore the consequence of the primality of the number of nodes, where we might apply some number-theoretical techniques.
Acknowledgement
This work is supported in part by the U.S. Department of Energy under contract number DE-SC0020455. This work is also supported by a Google Cloud award and an Amazon research award.
References
Bitcoin Scale (Accessed 2020).
Bitcoin (Accessed 2020).
Eyal (2015). The miner’s dilemma. In Proceedings of the 2015 IEEE Symposium on Security and Privacy (SP).
Herlihy (2017). Blockchains and the future of distributed computing. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC).
Lind et al. (2019). Teechain: a secure payment network with asynchronous blockchain access. In Proceedings of the 27th ACM Symposium on Operating Systems Principles (SOSP), pp. 63–79.
Tsabary and Eyal (2018). The gap game. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS).
Wang and Wang (2019). Monoxide: scale out blockchain with asynchronized consensus zones. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI).
Zhao (2020). Cross-blockchain transactions. In Conference on Innovative Data Systems Research (CIDR).