Blockchain has emerged as a breakthrough that can bootstrap trust among mutually distrustful parties at Internet scale. It is essentially a linear chain of blocks, each appended in chronological order via consensus. Consensus therefore fixes the temporal ordering of blocks and stands as the main pillar of system security. Among existing consensus protocols, proof of work (PoW) dominates current blockchain-based systems: participants (called miners) compete to solve computational puzzles by brute-force calculation (called mining) in order to win the accounting right and obtain the reward. Despite its security and robustness, PoW is often criticized for being uneconomical, which has sparked heated debate on whether other mechanisms can function better than PoW.
Pioneering countermeasures fall into energy-conservation and energy-recycling alternatives. The former reduce energy consumption by applying other selection criteria, such as stakes (proof of stake) and votes (delegated proof of stake). Unfairness is the main drawback of these schemes, because the intrinsic capacity bias may cause the rich to get richer [4, 5]. This suggests that sacrificing expensive resources to achieve a fair and secure consensus is necessary and inevitable, thus inspiring the energy-recycling mechanisms, which repurpose the computing power investment into other useful work, such as machine learning.
In this paper, we propose proof of user similarity (PoUS), where we redirect the computing power to calculate the similarities of users who issue transactions in blockchain, and incorporate the calculation results into the packing rule. Besides inheriting the essence of energy-recycling consensus mechanisms, PoUS is advocated for the following reasons:
First, PoUS reinvests the valuable computing power back to blockchain instead of contributing to other fields. Motivated by keeping the goodies within the family, the inherent demand of blockchain can be deeply explored, which may energize a more prosperous blockchain.
Second, the similarity-by-design PoUS facilitates cohort analysis of users, which can disclose the population distribution and behavior patterns of users spatially and pave the way for investigating user life cycles and value retention. In doing so, the consensus in blockchain, which so far characterizes only the temporal scale, is complemented from the fresh angle of the spatial dimension.
Third, transaction packaging based on user similarity enables a searchable blockchain. Equipped with powerful storage and index technologies, an effective query and retrieval database based on blockchain can be realized.
Despite the above merits of PoUS, we encounter the following three challenges in realizing it: 1) supply-demand contradiction. User similarity calculation consumes a large amount of computing and storage resources, making it impractical for a single miner to carry out the large-scale calculation, whereas a fair and secure consensus mechanism calls for the joint involvement of miners; 2) plagiarism risk. Since no explicit standard exists to measure the calculation result of user similarity, we resort to a voting mechanism to select the most qualified one. However, transmitting the calculation result of each candidate may invite plagiarists who pretend to vote while copying, ruining the fairness of consensus; 3) lying risk. Despite being democratic, the voting mechanism can be easily undermined by untruthful reports, since the real intentions of voters are private information that can be easily hidden. This may lead to interest-oriented liars who grant their votes to unqualified candidates, consequently endangering the blockchain.
Our work intends to resolve the above challenges to make PoUS practically applicable. To the best of our knowledge, our paper is the first work to employ user similarity calculation as the proof of work for consensus in blockchain. As such, a general framework of PoUS is introduced, where the above challenges are addressed as follows:
To resolve the first challenge, we embrace the best-effort schema for design. Concretely, our PoUS allows miners to participate in consensus via calculating user similarity partially. That is to say, miners are only required to dedicate their resources within their capabilities. As long as the majority of miners operate honestly, large-scale user similarity calculation can be achieved.
As for the second challenge, a voting mechanism based on two-party computation (2PC) is presented, which leverages cryptographic primitives to assure correct voting without disclosing any private information of the candidates. By doing this, the plagiarism risk can be repelled since the authentic calculation results of each candidate are masked and, more significantly, the accuracy of voting is not sacrificed.
The third challenge can be well addressed through a Bayesian truth serum-based incentive mechanism. This mechanism can reward a truth-teller more than a liar, thus encouraging interest-driven voters to honestly report their real beliefs. Under this case, the lying risk can be decreased, which facilitates a democratic and secure voting mechanism accordingly.
We implement a prototype of PoUS in Python based on BlockSim (https://github.com/maher243/BlockSim) and compare its performance with PoW. Through extensive simulations, we conclude that PoUS outperforms PoW, achieving an average TPS improvement of 24.01% and an average confirmation latency reduction of 43.64%. Besides, PoUS functions well in reflecting the spatial information of users with negligible computation time and communication cost, distinguishing it as the spatial measurer of blockchain.
The remainder of our paper is organized as follows. We first introduce PoUS from the top level in Section II, which proceeds sequentially from mining through voting to packing. The first stage, i.e., user similarity calculation-based mining, is described in detail in Section III. After that, the second stage, i.e., the plagiarism- and lying-proof voting mechanism based on 2PC and Bayesian truth serum, is presented in Section IV. The third stage, the clustering-based packing scheme, is presented in Section V. We carry out theoretical analysis and develop a prototype in Section VI to give a thorough evaluation of PoUS. Some questions about PoUS are raised and answered in Section VII. Section VIII summarizes the related work and, finally, Section IX concludes our paper.
II The Overview of PoUS
At a high level, PoUS involves three stages to reach consensus: mining, voting and packing. As shown in Fig. 1, these three stages proceed sequentially, each serving a different purpose. To illustrate:
① Mining: We consider a set of users and a set of miners. In the mining stage, every miner conducts user similarity calculation according to its view of the current transaction data and accordingly obtains a user similarity matrix, in which each element represents the similarity between the corresponding row and column users. In particular, the only way for a success-hungry miner to be nominated as the leader (by leader, we mean the miner who currently holds the accounting right) is to compute its matrix as accurately as possible, which requires extremely large storage and computing resources. Hence, such a calculation process can be regarded as the mining period, as in PoW.
② Voting: After the user similarity matrices are computed, we resort to a voting mechanism to pick the most qualified miner as the leader. However, a naive voting mechanism may suffer from the following malfeasances:
plagiarism. Intuitively, the candidates should send their user similarity matrices to the voters, who can then compare them to choose the most competent candidate. However, publishing the matrices in clear text may attract plagiarists who embezzle others’ computing results. To cope with this, we design a 2PC-based voting mechanism empowered by Garbled Circuits (GC) and the Oblivious Transfer (OT) protocol, which allows miners to vote without knowing the plain matrices of the candidates, thereby wiping out plagiarists.
lying. Although the voting ecology seems democratic, it can be easily ruined by untruthful votes, since the real intentions of voters are private information that can be easily hidden. Considering this, we adopt a Bayesian truth serum-based incentive mechanism to elicit truthful votes among miners, under which a voter obtains the highest payoff only if it reports honestly.
When all of the miners cast their true votes (no lying) based on their own calculated user similarity matrices (no plagiarism) according to our voting mechanism, they are required to submit their votes to a vote-counting committee by running a smart contract. After that, the committee can reach consensus on electing the highest-voted miner as the leader.
③ Packing: In the last stage of PoUS, the leader first clusters transactions using the global best user similarity, and then packs them into a block according to their priorities, which consider user similarity, transaction fee and waiting time. In light of this, PoUS can portray user distribution from the spatial angle without compromising system performance.
Fig. 2 illustrates the above consensus process in more detail from the miner’s perspective, where the mining, voting and packing periods are depicted as the blue, green and red boxes, respectively. To begin with, the miner determines whether it holds the packing right for the current round. If so, it generates a new block by packing transactions based on their priorities and broadcasts it to the committee for verification before starting a new round of mining. Otherwise, it mines directly by calculating its user similarity matrix until the mining time is over. The voting process begins with miners propagating the encrypted keys corresponding to their user similarity matrices, as well as the GCs, to each other. When the miner receives the keys and GCs from others, it runs the OT protocol to complete voting. This process continues until the voting time is over, after which the voting results are sent to the vote-counting committee by running a smart contract. If the miner is nominated as the current leader, it is informed by the committee and also receives from it the global best user similarity results. A new round of mining may then start right away.
III User Similarity Calculation-Based Mining
The mining stage based on user similarity calculation is described in detail as follows. Miners start to mine when they find that they are not the leader for the current round or, if they are the current leader, when they have finished generating a new block. As stated above, the main task of mining is to calculate user similarity accurately. To achieve this, miners should access extensive user data and convert the data into user vectors that characterize interests and preferences. After that, miners can conduct similarity measurements between these vectors, which finally yields the user similarity matrix.
1) User data acquisition. The accessibility of massive user data is the cornerstone of accurate user similarity calculation. PoUS has miners mine based on both historical and latest transaction data, where the former denotes the data in the latest blocks of the blockchain and the latter represents transactions in the mempool of each miner. These two kinds of data are chosen to span the users’ interests and preferences both previously and currently, which reflects users’ behaviors more comprehensively. It is worth noting that different miners may perceive different views of user data, which is mainly caused by network delay. However, we argue that such differences among miners will not cause conflicts in mining: as long as the majority of miners in the network receive consistent data, the user similarity matrix recognized by most miners can be reached through the voting process.
2) User vector construction. In this period, miners need to convert the obtained user data into vectors for similarity calculation. Multiple vectorization methods are available, and in this paper, we present a simple “user-type” one as an illustration, where each element in the vector denotes the amount of data in the corresponding category for that user. For instance, in a financial blockchain system with four types of transactions, a user can be characterized by a 4-dimensional tuple whose entries count the pieces of data that the user has of each type. In this way, a user-type matrix can be obtained by combining the vectors of all users. It is worth noting that there is no specific restriction on how to generate user vectors, as long as all the miners act uniformly.
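As a minimal sketch of the “user-type” vectorization, the following Python snippet counts, for each user, the number of transactions of each type. The function name `build_user_type_matrix`, the `(sender, type)` transaction format, and the sample data are assumptions made for illustration and are not part of the PoUS prototype.

```python
from collections import Counter


def build_user_type_matrix(transactions, users, tx_types):
    """Return one row per user; each entry counts that user's transactions of one type."""
    counts = Counter((sender, tx_type) for sender, tx_type in transactions)
    return [[counts[(user, tx_type)] for tx_type in tx_types] for user in users]


# Hypothetical data: each transaction is a (sender, transaction type) pair.
transactions = [
    ("u1", "transfer"), ("u1", "transfer"), ("u1", "invoke"),
    ("u2", "create"), ("u2", "invoke"),
]
users = ["u1", "u2"]
tx_types = ["transfer", "create", "invoke", "other"]  # four types, as in the example above

matrix = build_user_type_matrix(transactions, users, tx_types)
```

Each row of `matrix` is the 4-dimensional tuple of one user; all miners would need to apply the same type ordering for their vectors to be comparable.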
3) Similarity measurement. After constructing user vectors, miners can carry out the similarity measurement by calculating the distance between every pair of them, resulting in the user similarity matrix. All the miners should use the same similarity measurement scheme, no matter which scheme is specifically employed. However, forcing every miner to compute similarities among all users is impractical because their resources are quite limited. Considering this, the best-effort schema from network service is adopted in PoUS, so that miners are allowed to submit partial calculation results. We claim that as long as the majority of miners execute correctly, large-scale user similarity calculation can be achieved. The user similarity matrix of a miner is shown in Fig. 3, where the first row and column represent users and each entry is the similarity between the corresponding pair of users. Note that the matrices of some miners may contain null values, because inadequate computing or storage capability may leave calculation tasks incomplete when the mining time ends.
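The best-effort idea can be sketched as follows: a miner computes as many pairwise similarities as its budget allows and leaves the remaining entries null. This is only an illustration under stated assumptions: the `budget` parameter stands in for the mining time limit, and mapping a Euclidean distance `d` to a similarity `1 / (1 + d)` is one possible choice, not necessarily the one used in PoUS.

```python
import math
from itertools import combinations


def partial_similarity(vectors, budget):
    """Compute at most `budget` pairwise similarities; uncomputed entries stay None."""
    n = len(vectors)
    sim = [[None] * n for _ in range(n)]  # the diagonal is left empty as well
    for done, (i, j) in enumerate(combinations(range(n), 2)):
        if done >= budget:  # resources exhausted: stop, keep the partial result
            break
        distance = math.dist(vectors[i], vectors[j])  # Euclidean distance
        sim[i][j] = sim[j][i] = 1.0 / (1.0 + distance)  # assumed distance-to-similarity map
    return sim


# Hypothetical user vectors (rows of a user-type matrix).
vectors = [[2, 0, 1, 0], [0, 1, 1, 0], [2, 0, 1, 0]]
sim = partial_similarity(vectors, budget=2)
```

With a budget of two pairs, the entries for users (0, 1) and (0, 2) are filled, while the entry for users (1, 2) remains null, mirroring the incomplete matrices described above.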
4) User similarity update. Whenever a new transaction arrives, the vectors of the corresponding users change, and so does the similarity matrix. There are two kinds of methods for updating user similarity: recalculation schemes and incremental updating schemes [12, 13]. No matter which kind of updating method is employed, our PoUS mechanism can function correctly as long as all the miners follow it consistently.
IV Plagiarism- and Lying-Proof Voting Mechanism
In PoUS, there is no measurable standard for selecting the highest-quality user similarity result, so we resort to a voting mechanism. As the miners are allowed to submit partial calculation results, they are also allowed to vote partially, which means that miners are only required to focus on the data they have calculated during mining while neglecting the rest. Take a miner (the voter) voting on another miner (the candidate) as an example. Suppose the voter has only computed the similarities among a subset of users; then it is merely required to compare each valid element in its own matrix with the same-indexed element in the candidate’s matrix. Based on the comparison result, the voter votes 1 or 0 for each similarity calculated by the candidate, where 1 (approval) implies that the difference between the two results is within a threshold, and 0 (disapproval or abstention) indicates that the difference is beyond the threshold or that the voter declines to vote. Note that the voting result may be a sparse matrix; hence, powerful data compression protocols [14, 15], which could greatly reduce storage and transmission costs, can be applied if necessary.
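The partial voting rule can be sketched in plain Python as below. The sketch deliberately works on plaintext matrices to show only the voting logic; in PoUS the comparison is carried out under 2PC so that the candidate’s values are never revealed. The function name `vote_matrix`, the use of `None` for uncomputed entries, and the sample values are assumptions.

```python
def vote_matrix(own, candidate, threshold):
    """Vote 1 where both values exist and differ by less than `threshold`, else 0."""
    n = len(own)
    votes = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            mine, theirs = own[i][j], candidate[i][j]
            if mine is None or theirs is None:
                continue  # abstain on entries that were not computed
            if abs(mine - theirs) < threshold:
                votes[i][j] = 1  # approval
    return votes


# Hypothetical partial similarity matrices of a voter and a candidate.
own = [[None, 0.30, None], [0.30, None, 0.90], [None, 0.90, None]]
candidate = [[None, 0.32, 0.70], [0.32, None, 0.50], [0.70, 0.50, None]]
votes = vote_matrix(own, candidate, threshold=0.05)
```

Only the entry for users (0, 1) is approved here: the entry for users (1, 2) differs too much, and the voter abstains on the entry for users (0, 2) because it never computed it.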
As mentioned above, a naive voting mechanism may provide a breeding ground for plagiarists or liars, who copy the user similarity results published by others or collude to withhold their true votes. These two malicious behaviors must be completely prevented, since they undermine the main pillars of consensus, namely fairness and security. In the following, we introduce the 2PC-based voting mechanism and the Bayesian truth serum-based incentive mechanism, which suppress plagiarists and liars, respectively.
IV-A 2PC-Based Voting Mechanism
The risk of being copied lies in the fact that each candidate needs to broadcast its calculated user similarity results to the voters. However, propagating them transparently makes room for malicious stealers, since everyone can receive and reuse them. Such an issue can be resolved by the secure two-party computation (2PC) [16, 10] framework. Essentially, 2PC allows two parties to jointly evaluate a public function without disclosing information about their respective private inputs. In this way, voters can judge whether the data to be voted on is sufficiently close to their own calculated data (i.e., the difference is within the threshold) and complete the voting procedure without privacy leakage.
At a high level, the candidate, acting as the generator, first runs Algorithm 1 to encrypt its function into a garbled one and then sends it, together with the garbled value of its own input, to the voters, who act as the evaluators. To evaluate and vote, an evaluator runs Algorithm 2, which first executes the 1-out-of-2 Oblivious Transfer (OT) protocol to obtain the garbled value corresponding to its private input. Subsequently, the evaluator evaluates the garbled function on the two garbled inputs, leading to the voting result. We design the function as shown in Fig. 4; it takes two user similarity values as inputs and outputs whether the difference between them is less than the threshold (i.e., 1 or 0). The circuit is built from comparator, multiplexer, and subtractor circuits. To be specific, the comparator first determines which of the two inputs is larger; this result is then fed into the multiplexer circuits together with the two inputs to produce the minuend and subtrahend for the subsequent subtractor circuit. After that, by running the subtractor circuit and comparing the subtraction result with the threshold, we obtain whether the two similarities are sufficiently close.
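The data flow through the comparator, multiplexer, and subtractor can be illustrated with ordinary (ungarbled) Python functions. This sketch only mirrors the logic of the circuit described above; the function names and the use of integers (e.g., similarities scaled to a fixed-point range) are assumptions.

```python
def comparator(a, b):
    """Output 1 if a > b, else 0."""
    return 1 if a > b else 0


def multiplexer(select, x, y):
    """Output x when select is 1, otherwise y."""
    return x if select else y


def subtractor(minuend, subtrahend):
    return minuend - subtrahend


def close_enough(sim_a, sim_b, threshold):
    """Return 1 if |sim_a - sim_b| < threshold, using only the three building blocks."""
    a_is_larger = comparator(sim_a, sim_b)
    minuend = multiplexer(a_is_larger, sim_a, sim_b)     # the larger input
    subtrahend = multiplexer(a_is_larger, sim_b, sim_a)  # the smaller input
    difference = subtractor(minuend, subtrahend)         # non-negative by construction
    return comparator(threshold, difference)             # 1 iff difference < threshold


# Hypothetical fixed-point similarities (e.g., scaled to 0..100) and threshold.
results = [close_enough(70, 65, 10), close_enough(65, 70, 10), close_enough(10, 90, 10)]
```

Routing the larger value to the minuend avoids negative differences, so the final comparison against the threshold needs no absolute-value circuit.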
Algorithm 1 describes the steps of generating a garbled gate, operated by the generator, for a gate with two inputs and one output. For every possible value (i.e., 0 and 1) of each input and of the output, the generator first produces a corresponding random value (Lines 1-6) using a random generation function. After that, the generator encrypts the random values of the output by utilizing the symmetric encryption function, with the random values of the corresponding pair of inputs as the secret keys (Lines 7-10). This yields the four encrypted output values, which are stored in random order after being permuted by a random permutation function (Line 11). At last, the garbled gate is returned (Line 12), which finishes the generating process. Notably, each gate in the function should be encrypted as described in Algorithm 1 so as to finally obtain the garbled circuit. In particular, the garbled circuit needs to be generated only once at the very beginning of consensus, whereas the random value mapped to its input must be created in every round of voting. Again, since other miners only get the random value instead of the original data, the private data can be well protected without any information leakage.
As for the circuit evaluator, it evaluates the garbled circuit to get the comparison result by running Algorithm 2. Specifically, the evaluator first gets the key mapped to its input value via the OT protocol (Line 1). Then, the garbled circuit is evaluated with the keys of both parties as inputs, which outputs the garbled comparison result (Line 2). Finally, the evaluator shares this output with the generator to obtain the plain comparison result through the mapping function (Line 3). This yields the voting result of the evaluator on the candidate (Line 4), based on which the voter can determine the voting matrix accordingly.
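To make the two algorithms more concrete, the following is a minimal single-gate sketch of garbling and evaluation in Python. It is not the paper’s Algorithm 1 or 2: the hash-derived one-time pad, the all-zero tag used to recognize the correct row, and all function names are assumptions, and the OT step is omitted (the evaluator is simply handed its input label).

```python
import hashlib
import secrets

LABEL_LEN = 16
TAG = bytes(LABEL_LEN)  # all-zero tag marking a correctly decrypted row


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _pad(label_a, label_b):
    # 32-byte pad derived from the two input labels (hash-based encryption).
    return hashlib.sha256(label_a + label_b).digest()


def garble_gate(gate):
    """Generator side: random labels per wire value and a shuffled encrypted table."""
    labels = {
        wire: (secrets.token_bytes(LABEL_LEN), secrets.token_bytes(LABEL_LEN))
        for wire in ("a", "b", "out")
    }
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = labels["out"][gate(bit_a, bit_b)]
            table.append(_xor(_pad(labels["a"][bit_a], labels["b"][bit_b]), out_label + TAG))
    secrets.SystemRandom().shuffle(table)  # hide which row matches which inputs
    return labels, table


def evaluate_gate(table, label_a, label_b):
    """Evaluator side: only the row matching the two held labels decrypts to a valid tag."""
    pad = _pad(label_a, label_b)
    for row in table:
        plain = _xor(pad, row)
        if plain[LABEL_LEN:] == TAG:
            return plain[:LABEL_LEN]
    raise ValueError("no row decrypted correctly")


# Garble an AND gate and evaluate it on inputs (1, 1).
labels, table = garble_gate(lambda a, b: a & b)
out_label = evaluate_gate(table, labels["a"][1], labels["b"][1])
output_bit = labels["out"].index(out_label)  # mapping back is done by the generator
```

The evaluator learns only an opaque output label; translating it back to a bit requires the generator’s mapping, which matches the final sharing step described above.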
IV-B Bayesian Truth Serum-Based Incentive Mechanism
The voting process may attract liars, who grant their votes to the candidates that are beneficial to them instead of the ones that are qualified. This selfish behavior ruins the security of consensus, because a leader elected in this way may not have calculated similarity accurately. To suppress unfaithful voters, we propose an incentive mechanism based on Bayesian truth serum to elicit truth-telling behaviors. The intuition behind our mechanism is that a miner can get the optimal revenue only by reporting its true voting beliefs. As a result, rational miners pursuing personal payoff maximization have no motivation to hide their real beliefs.
Generally speaking, each miner is required to vote on every similarity result of each candidate and, at the same time, to specify a prediction of the empirical distribution of the voting results. However, the true belief of a voter is unknown to others, as is the true distribution that the voter predicts. The posterior belief can be obtained from the common prior according to the Bayes formula, and our voting mechanism is based on the following assumption: a voter believes that other miners sharing the same vote would make the same inferences about the distribution as itself, which can be described as follows.
Two voters hold the same posterior belief if and only if they hold the same true vote.
For each similarity result submitted by a candidate, the voting miner is asked for two reports:
Voting report, which indicates whether the voter supports the given similarity result of the candidate (1) or not (0).
Prediction report, which is the voter’s prediction of the probability that the given similarity result of the candidate is approved.
Assume that all miners other than the candidate itself participate in voting in each round (the number of voters in each round can be adaptively adjusted, but here we fix it for simplicity). Hence, we can get the average vote for each similarity result of each candidate. Accordingly, the entries of the global best user similarity are selected based on these average votes. Besides, we define the number of votes as:
To resist untruthful reporting, we score each voter based on its voting and prediction reports. Given the possible real intention and the prediction regarding the probability of approval, a strictly proper scoring rule can be expressed in the binary quadratic manner, which is,
According to the robust Bayesian truth serum mechanism (RBTS) proposed in , for miner , it is required to select a reference miner with and a peer miner with . Then, based on reported by , we can calculate
where . In light of this, the reward that obtains from voting, i.e., , can be described as
The first part of (4) is called the information score, which is affected by the voting report, and the second part is termed the prediction score, which changes with the prediction report. Hence, to investigate the reward maximization of (4), we analyze the best voting and prediction reporting strategies separately, which finally indicates that truth-telling is the optimal strategy. Note that if the rewarding function is strictly proper, the best prediction strategy is to report truthfully. Hence, in the following, we only illustrate how our mechanism motivates miners to submit their voting reports truthfully. With shorthand notation for the voting choice submitted by a miner, its prediction, and its true belief about the voting result, the expected information score can be derived as
When the prediction is , the miner’s expected score can be obtained as
Hence, the expected loss of expected information score is
where we recognize as a parameter in the incentive mechanism. To minimize the expected loss, we should set appropriately so as to satisfy the absolute difference between and , i.e., , to reach the minimum. Now we testify that the scoring rule (3) can guarantee the above requirement, thus validating the effectiveness of our mechanism by corroborating that it is incentive compatible , which is summarized in Theorem IV.1.
Suppose each miner forms a posterior belief about the prediction probability of voting as approval based on its real intention of 1 or 0. And it always holds that for all admissible priors. Then we have,
A miner can maximize its expected score by truthfully reporting the voting choice if .
We will prove that when the true voting report is 1, and the posterior is , can minimize the distance between it and , maximizing the expected score consequently. That is to say, has a shorter distance with compared to when the true voting report is 0. In this way, we have
When , since and , we can get .
Otherwise, if , since , we conclude that .
The same procedure can be conducted when the true voting report is 0, thus we omit it for brevity.
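As an illustration of how such a reward can be computed, the sketch below follows the standard RBTS formulation for binary reports, which may differ in detail from the variant used in PoUS. All names are assumptions of this sketch: `x_i` and `y_i` are the voter’s voting and prediction reports, `y_ref` is the prediction report of the reference miner, and `x_peer` is the voting report of the peer miner.

```python
def quadratic_score(prediction, outcome):
    """Binary quadratic scoring rule, normalized to the range [0, 1]."""
    if outcome == 1:
        return 2 * prediction - prediction ** 2
    return 1 - prediction ** 2


def rbts_reward(x_i, y_i, y_ref, x_peer):
    """Information score plus prediction score for one voter and one similarity result."""
    delta = min(y_ref, 1 - y_ref)
    # Shift the reference prediction up or down according to the voter's own vote.
    shadow = y_ref + delta if x_i == 1 else y_ref - delta
    information_score = quadratic_score(shadow, x_peer)
    prediction_score = quadratic_score(y_i, x_peer)
    return information_score + prediction_score


# Hypothetical reports: the peer approves, and the reference miner predicts 0.6.
reward_if_approving = rbts_reward(x_i=1, y_i=0.8, y_ref=0.6, x_peer=1)
reward_if_rejecting = rbts_reward(x_i=0, y_i=0.8, y_ref=0.6, x_peer=1)
```

In this example the approving vote yields a reward of about 1.96 and the rejecting vote about 1.32, so a voter who expects its peer to approve does better by approving as well.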
IV-C Vote Counting Mechanism
After incentivizing miners to truthfully report their votes on each candidate, PoUS enters the vote-counting phase, which is committee-based. To be specific, PoUS first selects a vote-counting committee. The committee can be selected at the beginning of each round; in addition, to reduce the impact of committee alteration on PoUS, the members of the committee can be adjusted after several rounds. The selection of committee members can follow multiple rules according to the network size. If a large number of miners exist in the system, random sampling functions such as follow-the-satoshi and verifiable random functions (VRF) are preferred, since random probability-based mechanisms can cut down communication costs while guaranteeing distribution. Otherwise, adopting democratic voting schemes is more favorable. Then, all the miners send their votes to the committee by running a smart contract illustrated in Algorithm 3, regulated by the Voting and Voting-Result-Waiting timers. When the Voting-Result-Waiting timer expires, all the members of the committee start to count the votes and run a specific consensus mechanism locally, like Practical Byzantine Fault Tolerance (PBFT), to reach an agreement on who should be the leader in this round and what the global best user similarities are. (Other consensus mechanisms for permissioned blockchains are also feasible, and we omit the specific description of PBFT since it has been well analyzed [23, 24].) At last, the committee is responsible for notifying the leader and sending the best-calculated user similarity results to it. This triggers the packing stage of the leader, which will be presented in the following section. Note that after the block is packed, it must be sent to the committee for verification. If it passes, the block is broadcast to all miners; otherwise, it is discarded.
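The counting step performed by each committee member can be sketched as a simple tally. The ballot format (one 0/1 vote matrix per voter for each candidate), the rule of summing all approvals, and the alphabetical tie-break are assumptions of this sketch; the agreement among committee members (e.g., via PBFT) is not modeled.

```python
def elect_leader(ballots):
    """Return the candidate with the most approvals, plus the per-candidate totals.

    `ballots` maps each candidate to a list of 0/1 vote matrices, one per voter.
    """
    totals = {
        candidate: sum(vote for matrix in matrices for row in matrix for vote in row)
        for candidate, matrices in ballots.items()
    }
    # Iterate candidates in sorted order so ties are broken deterministically.
    leader = max(sorted(totals), key=totals.get)
    return leader, totals


# Hypothetical ballots from two voters for each of two candidates.
ballots = {
    "miner_a": [[[0, 1], [1, 0]], [[0, 1], [1, 0]]],
    "miner_b": [[[0, 1], [1, 0]], [[0, 0], [0, 0]]],
}
leader, totals = elect_leader(ballots)
```

Every honest committee member that receives the same ballots computes the same tally, which is what the subsequent agreement step relies on.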
V Clustering-Based Packing
Essentially, the packing stage can be divided into two phases: 1) the leader first clusters the transactions in its mempool based on the global best user similarity; and 2) it then packs transactions into a block according to their priorities. Note that there is no specific restriction on how to cluster, as long as the clustering meets the requirements of the corresponding scenario. We introduce the transaction priority rule as
where the three terms refer to the waiting time, submitted fee and similarity of a transaction, each weighted by a scaling parameter. (In PoUS, the similarity of a transaction is measured by the inverse of the distance between the transaction and the center of the cluster it belongs to.) Our packing mechanism offers the following three benefits:
Enabling cohort analysis. A higher similarity leads to a higher probability that a transaction is packed into a block. In doing so, blocks are packaged with transactions from similar users, enabling PoUS to realize cohort analysis of users and portray them spatially.
Achieving fairness. To make the transactions with lower similarity have a chance to be packaged, the metrics of delay and fee are added. Such a non-single-factor packaging priority makes our mechanism suitable for more transactions with different levels of user similarity.
Solving the cold start problem. Although the new users have lower similarities due to fewer interactions, their transactions can also be packaged into a block in time by lifting fees or waiting for a longer time, fixing the cold start problem as a result. Furthermore, our mechanism trains new users to become patient or generous, which in turn enhances a sustainable system.
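A minimal sketch of priority-based packing is given below. It assumes that the priority is a weighted sum of waiting time, fee, and similarity; the weight names `alpha`, `beta`, `gamma`, the dictionary layout of a transaction, and the sample values are all hypothetical.

```python
def priority(waiting_time, fee, similarity, alpha=1.0, beta=1.0, gamma=1.0):
    """Assumed linear priority: larger waiting time, fee, or similarity ranks higher."""
    return alpha * waiting_time + beta * fee + gamma * similarity


def pack(mempool, capacity, **weights):
    """Select the `capacity` highest-priority transactions from the mempool."""
    ranked = sorted(
        mempool,
        key=lambda tx: priority(tx["wait"], tx["fee"], tx["sim"], **weights),
        reverse=True,
    )
    return ranked[:capacity]


# Hypothetical mempool: t2 has low similarity (e.g., a new user) but has waited long.
mempool = [
    {"id": "t1", "wait": 1.0, "fee": 0.2, "sim": 0.9},
    {"id": "t2", "wait": 5.0, "fee": 0.1, "sim": 0.1},
    {"id": "t3", "wait": 0.5, "fee": 0.1, "sim": 0.2},
]
block = [tx["id"] for tx in pack(mempool, capacity=2)]
```

Here the long-waiting transaction `t2` is packed despite its low similarity, which illustrates how the delay and fee terms address fairness and the cold start problem.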
In order to represent the number of clusters a block includes, we add a flag field to the block header while keeping all other information in the current block structure, such as the hash value of the previous block and the Merkle tree root of the included transactions. The flag has one bit for each transaction slot, up to the maximum number of transactions each block can hold. A bit set to 1 or 0 indicates whether or not its corresponding transaction stored in the block body is the first one of a cluster. Hence, the first bit of the flag is always 1, and the flag contains at least one 1. In this way, we can count how many clusters the block has by reading the flag in the block header. An example of a block is illustrated in Fig. 5, where the block holds 11 transactions grouped into four clusters with 5, 3, 2 and 1 transaction(s), respectively. Hence, the flag is 10000100101.
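The flag encoding can be reproduced with a few lines of Python. The function names and the optional `capacity` padding are assumptions of this sketch.

```python
def cluster_flag(cluster_sizes, capacity=None):
    """Build the flag: 1 marks the first transaction of each cluster, 0 the others."""
    flag = "".join("1" + "0" * (size - 1) for size in cluster_sizes)
    if capacity is not None:
        flag = flag.ljust(capacity, "0")  # unused slots are padded with 0
    return flag


def count_clusters(flag):
    """The number of clusters equals the number of 1 bits in the flag."""
    return flag.count("1")


flag = cluster_flag([5, 3, 2, 1])  # the four clusters of the example above
clusters = count_clusters(flag)
```

For the example with clusters of 5, 3, 2 and 1 transactions, this yields the flag 10000100101 and a cluster count of 4.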
In this section, we first present the security analysis for PoUS, which demonstrates its safety and liveness for being a robust and secure consensus mechanism. After that, we develop a prototype of PoUS in Python for experimental evaluation, to verify its performance, functionality and cost.
VI-A Security Analysis
For security concerns, we prove that PoUS satisfies the safety and liveness properties, whose definitions are given as follows:
Definition VI.1 (Safety)
PoUS can guarantee safety if the honest nodes agree on the same valid block in each round.
Definition VI.2 (Liveness)
PoUS can guarantee liveness if every block proposed by the leader in each round will eventually be committed or aborted.
The proposed PoUS can achieve safety and liveness when more than two thirds of the nodes in the vote-counting committee are honest, as required by PBFT.
Given more than two thirds of honest nodes in the committee, the committee members can reach consistency on who the leader is and on whether the block proposed by the leader is legal, owing to the adoption of the PBFT consensus mechanism. Based on this, PoUS can realize both safety and liveness, since the nodes outside the committee are only required to accept the result, i.e., the legal block.
VI-B Experimental Results
We build a prototype of PoUS in Python based on BlockSim, into which the whole process including mining, voting, and packing is plugged. We compare the obtained results with those of PoW to show the superiority of PoUS. All the experiments are carried out on a machine with an Intel Core i7-8700 CPU at 3.20 GHz and 8 GB of RAM, and each simulation run is repeated 100 times to obtain the average value for statistical confidence.
We construct blockchain networks of nodes, with the mining power of each node drawn randomly from a fixed range. (In the experiments, we do not differentiate between the identities of “miner” and “user”, and uniformly refer to them as “nodes”. That is, a node can serve the system as a miner, benefit from the system as a user, or hold both roles by switching between them in different cases.) Besides, the transmission delay between each pair of nodes, the number of transactions each node produces, and the corresponding transaction fees all follow normal distributions with given mean and variance. Additionally, we categorize transactions into 6 classes. Each class represents a kind of transaction, such as money transfer, smart contract creation, smart contract invocation, etc. (Different blockchain-based systems may possess different classification standards, which are not discussed in this paper.) In the mining stage, each node first collects transactions from its mempool and the latest block, and then counts the number of transactions of each type generated by each user. Accordingly, these counts are assembled into the user vector, and user similarity is calculated using Euclidean distance with the recalculation update scheme. As for the voting stage, the garbled function is realized as described in Section IV-A. We employ a hash function to facilitate symmetric encryption, with keys distributed at the beginning of system operation. Each user goes through the 2PC-based voting process and is incentivized via the Bayesian truth serum-based mechanism. When the similarity gap is less than the threshold, a vote labeled “1” is cast; otherwise, the vote is labeled “0”.
VI-B3 Performance evaluation
We measure the performance of PoUS via two metrics, 1) transactions per second (TPS) and 2) transaction confirmation latency, under various block sizes and block intervals. After obtaining the results, we compare them with those of PoW. The environmental configuration and experimental parameter settings are shown in Tables I and II, respectively.
|Parameter|Value|Unit|Description|
|---|---|---|---|
|Sim Time|10000|s|Simulation running time|
|Transaction Size|250|Byte|Average size of a transaction|
|Transaction Delay|0.5|s|Delay for transmitting a transaction|
|Network Size|30 to 1000|-|Number of nodes|
|Block Size|0.5 to 16|MB|Size of a block|
|Block Interval|200 to 1000|s|Block generation time gap|
|Block Delay|0.2 to 0.6|s|Delay for transmitting a block|
Note: where a range is given, the metrics are sampled in this range.
Fig. 6 reports the number of confirmed transactions per second (TPS) for PoW and PoUS with varying block size when the block delay and block interval are set to 0.4s and 600s. Subfigures (a), (b), and (c) correspond to different network sizes. From them, we can conclude that: 1) when the block size is 1MB, the TPS of PoW is nearly 7, which is in line with the Bitcoin system in practice. This demonstrates that simulating PoW with BlockSim is credible, which in turn makes it reasonable to evaluate PoUS through BlockSim, since both share the same environmental and experimental settings; 2) the TPS of both PoW and PoUS increases with the block size, because a larger block can hold more transactions; 3) PoUS achieves an average TPS improvement of 15.66% compared with PoW.
Fig. 7 plots the TPS of PoW and PoUS as the block interval varies, with the block size and block delay set to 2MB and 0.4s. We present the results under different network sizes in subfigures (a), (b), and (c), from which we find that: 1) the block interval has a negative effect on TPS for both PoW and PoUS. This is straightforward, since the longer the time gap between successive blocks, the fewer blocks are appended to the blockchain per unit time, which lowers the transaction throughput; 2) PoUS surpasses PoW in transaction throughput by 24.01% on average.
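Both trends reported for Fig. 6 and Fig. 7 are consistent with a simple capacity bound: a block can hold at most (block size / transaction size) transactions, and one block is produced per block interval. The following sketch computes this bound using the 250-byte transaction size from the parameter table. It assumes decimal megabytes and ignores propagation delay and stale blocks, so it is an upper bound rather than a reproduction of the simulated values; the function name is hypothetical.

```python
def tps_upper_bound(block_size_bytes, tx_size_bytes, block_interval_s):
    """Upper bound on throughput: full blocks, one block per interval."""
    txs_per_block = block_size_bytes // tx_size_bytes
    return txs_per_block / block_interval_s


MB = 1_000_000   # assumption: decimal megabytes
TX_SIZE = 250    # bytes, average transaction size from the parameter table

# Varying block size at a 600 s interval (the setting used for Fig. 6).
for size_mb in (0.5, 1, 2, 4):
    bound = tps_upper_bound(int(size_mb * MB), TX_SIZE, 600)
    print(f"block size {size_mb} MB -> at most {bound:.2f} TPS")

# Varying block interval at a 2 MB block size (the setting used for Fig. 7).
for interval in (200, 600, 1000):
    bound = tps_upper_bound(2 * MB, TX_SIZE, interval)
    print(f"interval {interval} s -> at most {bound:.2f} TPS")
```

For a 1 MB block and a 600 s interval the bound is about 6.67 TPS, which agrees with the "nearly 7" figure quoted for PoW.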
We also evaluate the transaction confirmation latency in PoUS and PoW, following the predefined transaction priority rule (8). The comparisons are conducted under a fixed network size. (Extensive simulations with different network sizes were also carried out; their results show very similar trends, so we omit them to avoid redundancy.) Fig. 8 shows how the confirmation latency changes with the block size and the block interval. From subfigure (a), we can point out that: 1) the confirmation latency decreases as the block size increases, since a larger block can take in more transactions; 2) when the block size varies, PoUS reduces the latency by an average of 24.14% compared with PoW. From subfigure (b), we conclude that: 1) the latency increases with the block interval. This is because a larger interval reduces the number of transactions confirmed on the chain per unit time, leading to a higher latency; 2) when the block interval changes, the latency of PoUS is about 43.64% lower than that of PoW on average. Furthermore, PoUS is interval-insensitive, since its confirmation latency grows linearly with the interval while that of PoW shows an approximately exponential trend.
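The qualitative trends in Fig. 8 (latency falls with block size and rises with block interval) can be reproduced with a toy queueing model. The sketch below is not the BlockSim experiment: it assumes a constant arrival rate, a first-in-first-out mempool instead of the priority rule (8), and instantaneous propagation. All names and parameter values are hypothetical.

```python
from collections import deque


def mean_confirmation_latency(arrival_rate, block_capacity, block_interval, sim_time):
    """Average waiting time of confirmed transactions in a FIFO mempool.

    Transactions arrive every 1/arrival_rate seconds; every block_interval
    seconds a block confirms up to block_capacity of the oldest pending ones.
    """
    pending = deque()   # arrival times of unconfirmed transactions
    latencies = []
    i = 0               # index of the next transaction to arrive
    t = block_interval  # time of the next block
    while t <= sim_time:
        # Enqueue every transaction that arrived strictly before this block.
        while i / arrival_rate < t:
            pending.append(i / arrival_rate)
            i += 1
        # Confirm the oldest pending transactions, up to the block capacity.
        for _ in range(min(block_capacity, len(pending))):
            latencies.append(t - pending.popleft())
        t += block_interval
    return sum(latencies) / len(latencies)


# Larger interval, ample capacity: latency grows with the interval.
short_gap = mean_confirmation_latency(1, 1000, 10, 100)
long_gap = mean_confirmation_latency(1, 1000, 20, 100)

# Same interval, smaller block: a backlog builds up and latency grows.
small_block = mean_confirmation_latency(1, 5, 10, 100)
large_block = mean_confirmation_latency(1, 10, 10, 100)
print(short_gap, long_gap, small_block, large_block)
```

The model says nothing about the gap between PoUS and PoW, which depends on the packing rule; it only shows why both curves move in the reported directions.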
To sum up, we summarize the performance of PoUS with the following observations:
Observation 1: PoUS outperforms PoW with an average TPS improvement of 15.66% when varying the block size and 24.01% when varying the block interval.
Observation 2: PoUS surpasses PoW by reducing the confirmation latency by an average of 24.14% when changing the block size and 43.64% when changing the block interval. Moreover, PoUS is interval-insensitive.
VI-B4 Functionality evaluation
We corroborate that PoUS can build up a detailed profile of users by packaging similar transactions into a block, which empowers the consensus mechanism to measure the spatial information of blockchain. To that end, we compare two transaction sets. The first set contains the transactions in clusters determined by the global best user similarity; it serves as the baseline of similar transactions. The second set contains the transactions that the leader actually packages, that is, the transactions currently selected for assembly. If the two sets are consistent, the leader does package transactions with higher similarity, demonstrating that PoUS functions as intended.
We randomly select one leader as the “target leader” during the simulation for experimental purposes (choosing other leaders yields similar results, which we omit for brevity), and obtain the clustering result based on the source users of transactions using the k-means scheme. For a clearer illustration, we employ Principal Component Analysis (PCA) to project the multidimensional vectors into two dimensions. The results are presented in Fig. 9, where the clusters are colored in blue, cyan, and green, and the selected transactions are painted in red. We can conclude that: 1) the red dots are mostly concentrated in the middle of each cluster, which reflects that the transactions selected by PoUS are indeed of high user similarity. Hence, PoUS can reveal the spatial information of users in blockchain as expected; 2) some red dots lie around the periphery of each cluster. This suggests that even though some transactions are disadvantaged in similarity, they can still be selected if they are urgent or carry high fees, reflecting the fairness of our mechanism.
Accordingly, we summarize the functionality of PoUS by presenting the following observation:
Observation 3: PoUS can function well to portray the spatial information of users as we expected, and the transaction priority rule is fair.
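The visual check in Fig. 9 (selected transactions sitting near cluster centers) can also be expressed numerically. The sketch below is an illustrative metric that does not appear in the evaluation: it compares the mean distance of the selected points to the cluster centroid with the mean distance of all cluster points. The function names and the sample points are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def mean_distance_to(center, points):
    """Average Euclidean distance from the points to the given center."""
    return sum(math.dist(center, p) for p in points) / len(points)


def concentration_ratio(cluster_points, selected_points):
    """Ratio below 1: selected points lie closer to the centroid than average."""
    c = centroid(cluster_points)
    return mean_distance_to(c, selected_points) / mean_distance_to(c, cluster_points)


# Hypothetical 2-D projection of one cluster of user vectors.
cluster = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1)]

central = concentration_ratio(cluster, [(1, 1)])      # selection at the center
peripheral = concentration_ratio(cluster, [(0, 0)])   # selection at the edge
print(central, peripheral)
```

A ratio well below 1 for most clusters would correspond to the red dots concentrating in the middle, while occasional values above 1 would correspond to urgent or high-fee transactions picked from the periphery.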
VI-B5 Computation time and communication cost
In this part, we evaluate the cryptographic computation time and communication cost of the 2PC-based voting process, as depicted in Fig. 10. In detail, Fig. 10 (a) shows how the computation time changes with the data size, covering both the OT execution and the circuit evaluation (Eval). From this, we find that: 1) the time consumption mainly comes from the Eval process and grows with the data size; 2) the computation time is negligible, since the encryption time is less than 2s when the data size is 2MB. In addition, we plot the communication cost of Eval as the number of users varies, where the compressed row storage scheme is adopted. As shown in Fig. 10 (b), even as the number of users in the blockchain grows, the communication cost stays below 2.5 KByte, which is completely acceptable.
Observation 4: The computation time and communication cost that PoUS consumes are negligible, making it achievable in practice.
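The compressed row storage scheme mentioned above keeps only the nonzero entries of a matrix, which is why the transmitted data stays small when most entries are zero. The sketch below shows the standard encoding with three arrays (values, column indices, row pointers). It assumes, for illustration only, that the matrix being compressed consists of per-user transaction-count vectors; this section does not state exactly which data the prototype compresses.

```python
def to_csr(matrix):
    """Encode a dense row-major matrix in compressed row storage."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends inside `values`.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def from_csr(values, col_idx, row_ptr, num_cols):
    """Rebuild the dense matrix from its compressed row storage form."""
    matrix = []
    for i in range(len(row_ptr) - 1):
        row = [0] * num_cols
        for k in range(row_ptr[i], row_ptr[i + 1]):
            row[col_idx[k]] = values[k]
        matrix.append(row)
    return matrix


# Hypothetical user vectors over six transaction classes (mostly zeros).
users = [
    [3, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0, 4],
]
values, col_idx, row_ptr = to_csr(users)
print(values, col_idx, row_ptr)
```

Here 18 dense entries shrink to 4 values, 4 column indices, and 4 row pointers; the saving grows as the matrix becomes larger and sparser.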
VII Other Concerns
In this section, we list some concerns that readers may raise about PoUS and present explanations, to show the opportunity for the wide application of PoUS in practice.
Question 1: PoUS requires a smart contract. How can the chicken-and-egg problem be solved, given that the smart contract itself must first be secured by the consensus?
Answer: At the beginning of PoUS, we can use another consensus mechanism for permissionless blockchains, such as PoW or PoS, to first reach an agreement on the smart contract. After the smart contract has been acknowledged by all the nodes, the system switches to PoUS consensus.
Question 2: How can PoUS defend against the double-spending attack or other computation-based attacks?
Answer: It is worth noting that PoUS is immune to computation-based attacks, since the leaders are selected in advance and then package similar transactions into a block. Hence, PoUS reaches deterministic finality rather than the probabilistic finality of PoW, leaving no room for dominant nodes to overtake the main chain.
Question 3: What are the advantages of establishing a searchable blockchain based on user similarity?
Answer: Essentially, PoUS achieves user/transaction classification in the consensus stage, which brings the following two merits. Firstly, PoUS facilitates a searchable blockchain at the transaction level. Combined with efficient storage and index schemes, it can support diverse querying requirements and greatly enhance query efficiency, which in turn enables accurate decision-making. To illustrate, for cold-chain transportation transactions under the COVID-19 pandemic, if a transaction is confirmed to be epidemic-related, other potentially risky transactions can be quickly located based on the clustering results of the source and end users (in this case, user similarity is defined as the frequency of transactions), so as to curb the spread of the disease. Secondly, the similarity-in-design consensus also records the users’ behaviors behind the generated transactions, which in turn facilitates cohort analysis of users. Cohort analysis reveals the characteristics of users from the data and analyzes the differences among user groups, from which the key factors that affect user retention can be found. In this way, we can quantify user value retention and uncover the effect of system improvements more comprehensively.
VIII Related Work
Recently, many attempts have been devoted to developing greener consensus mechanisms as substitutes for PoW. Basically, there are two lines of improvements from the perspectives of energy-conservation and energy-recycling, where our PoUS belongs to the latter. Hence we focus on investigating the energy-recycling studies as follows.
Initially, the meaningless nonce in PoW was replaced with mathematical problems, such as prime-number and matrix-based problems. Subsequently, more complicated problems drawn from real-world needs were presented. Zhang et al. put forward a resource-efficient mining framework for blockchain, called REM, which utilizes trusted hardware, i.e., the Intel Software Guard Extensions (SGX), to reinvest the wasted computations in executing useful workloads outsourced by clients. This promotes the idea of Proof of Useful Work (PoUW). However, the heavy reliance of REM on SGX may violate the decentralized nature of blockchain, as previously noted in the literature. Hence, Lasla et al. proposed to divide time into epochs, each comprising two consecutive mining rounds, where only the selected runner-ups can join the second round to be compensated with the block reward. In doing so, the number of competing blocks can be greatly reduced, which in turn lowers the consumed computing energy. In parallel, Du et al. presented a novel mechanism that exploits PoW mining power to accelerate decentralized machine learning by scheduling tasks among multi-access edge computing servers. Additionally, Qu et al. repurposed the computing resource for federated learning, which is a natural fit for the organizational structure of pooled mining in blockchain. By doing so, they introduced Proof of Federated Learning (PoFL), together with a reverse game-based data trading mechanism and a privacy-preserving model verification mechanism enhanced by homomorphic encryption and 2PC techniques. Besides, Li et al. exploited the computation power of miners for biomedical image segmentation, based on which they designed a segmentation model training scheme that can handle multiple tasks, larger models, and larger training datasets.
IX Conclusion
A novel energy-recycling consensus mechanism named proof of user similarity (PoUS) is proposed in this paper, where the valuable computing resource is reinvested to calculate the similarity of users. PoUS is designed with three stages, namely mining, voting, and packing, which serve similarity calculation, leader selection, and block packaging, respectively. To address the supply-demand contradiction, we adopt a best-effort scheme that allows the miners to compute user similarities partially. Besides, considering the plagiarism and lying risks rooted in the voting process, we present a 2PC-based voting mechanism and a Bayesian truth serum-based incentive mechanism. The former leverages cryptographic primitives to assure correct voting without disclosing any private information about the candidates, while the latter encourages the profit-driven miners to honestly report their true beliefs. As for the packing stage, we design a fair and effective transaction priority rule for selection. We validate PoUS by implementing a prototype, whose results demonstrate that PoUS surpasses PoW with an average TPS improvement of up to 24.01% and an average confirmation latency reduction of up to 43.64%. Besides, PoUS functions well, with negligible computation time and communication cost, in mirroring the spatial information of users, which complements the temporal dimension of blockchain with a spatial one.
This work has been supported by the National Key R&D Program of China (No. 2019YFB2102600), the National Natural Science Foundation of China (No. 62072044), the International Joint Research Project of Faculty of Education, Beijing Normal University, and the Engineering Research Center of Intelligent Technology and Educational Application, Ministry of Education.
References
-  S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” Decentralized Business Review, pp. 21 260–21 268, 2008.
-  S. King and S. Nadal, “Ppcoin: Peer-to-peer crypto-currency with proof-of-stake,” self-published paper, August, vol. 19, no. 1, 2012.
-  “Eos.io technical white paper,” https://github.com/EOSIO/Documentation/blob/master/TechnicalWhitePaper.md, 2018.
-  X. Qu, S. Wang, Q. Hu, and X. Cheng, “Proof of federated learning: A novel energy-recycling consensus algorithm,” IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 8, pp. 2074–2085, 2021.
-  M. Saad, Z. Qin, K. Ren, D. Nyang, and D. Mohaisen, “e-pos: Making proof-of-stake decentralized and fair,” IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 8, pp. 1961–1973, 2021.
-  S. King, “Primecoin: Cryptocurrency with prime number proof-of-work,” July 7th, 2013.
-  A. Miller, A. Juels, E. Shi, B. Parno, and J. Katz, “Permacoin: Repurposing bitcoin work for data preservation,” in 2014 IEEE Symposium on Security and Privacy. IEEE, 2014, pp. 475–490.
-  Y. Du, C. Leung, Z. Wang, and V. C. Leung, “Accelerating blockchain-enabled distributed machine learning by proof of useful work,” in 2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS). IEEE, 2022, pp. 1–10.
-  N. D. Glenn, Cohort analysis. Sage, 2005, vol. 5.
-  Y. Huang, D. Evans, J. Katz, and L. Malka, “Faster secure Two-Party computation using garbled circuits,” in 20th USENIX Security Symposium (USENIX Security 11), 2011.
-  M. O. Rabin, “How to exchange secrets with oblivious transfer.” IACR Cryptol. ePrint Arch., vol. 2005, no. 187, 2005.
-  D. Dueck, Affinity propagation: clustering data by passing messages. Citeseer, 2009.
-  B. Jeong, J. Lee, and H. Cho, “Improving memory-based collaborative filtering via similarity updating and prediction modulation,” Information Sciences, vol. 180, no. 5, pp. 602–612, 2010.
-  Y. Saad, Iterative methods for sparse linear systems. SIAM, 2003.
-  A. Buluç, J. T. Fineman, M. Frigo, J. R. Gilbert, and C. E. Leiserson, “Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks,” in Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, 2009, pp. 233–244.
-  A. C. Yao, “Protocols for secure computations,” in 23rd annual symposium on foundations of computer science (sfcs 1982). IEEE, 1982, pp. 160–164.
-  Y. Luo, X. Jia, S. Fu, and M. Xu, “pride: Privacy-preserving ride matching over road networks for online ride-hailing service,” IEEE Transactions on Information Forensics and Security, vol. 14, no. 7, pp. 1791–1802, 2018.
-  D. Prelec, “A bayesian truth serum for subjective data,” science, vol. 306, no. 5695, pp. 462–466, 2004.
-  R. Selten, “Axiomatic characterization of the quadratic scoring rule,” Experimental Economics, vol. 1, no. 1, pp. 43–61, 1998.
-  J. Witkowski and D. Parkes, “A robust bayesian truth serum for small populations,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 26, no. 1, 2012.
-  A. Kiayias, A. Russell, B. David, and R. Oliynykov, “Ouroboros: A provably secure proof-of-stake blockchain protocol,” in Annual international cryptology conference. Springer, 2017, pp. 357–388.
-  J. Chen and S. Micali, “Algorand: A secure and efficient distributed ledger,” Theoretical Computer Science, vol. 777, pp. 155–183, 2019.
-  H. Huang, X. Peng, J. Zhan, S. Zhang, Y. Lin, Z. Zheng, and S. Guo, “Brokerchain: A cross-shard blockchain protocol for account/balance-based state sharding,” in IEEE INFOCOM, 2022.
-  J. Niu, F. Gai, M. M. Jalalzai, and C. Feng, “On the performance of pipelined hotstuff,” in IEEE INFOCOM 2021-IEEE Conference on Computer Communications. IEEE, 2021, pp. 1–10.
-  Z. Hong, S. Guo, P. Li, and W. Chen, “Pyramid: A layered sharding blockchain system,” in IEEE INFOCOM 2021-IEEE Conference on Computer Communications. IEEE, 2021, pp. 1–10.
-  A. Shoker, “Sustainable blockchain through proof of exercise,” in 2017 IEEE 16th International Symposium on Network Computing and Applications (NCA). IEEE, 2017, pp. 1–9.
-  F. Zhang, I. Eyal, R. Escriva, A. Juels, and R. Van Renesse, “REM: Resource-efficient mining for blockchains,” in 26th USENIX Security Symposium (USENIX Security 17), 2017, pp. 1427–1444.
-  N. Lasla, L. Al-Sahan, M. Abdallah, and M. Younis, “Green-pow: An energy-efficient blockchain proof-of-work consensus algorithm,” Computer Networks, vol. 214, p. 109118, 2022.
-  B. Li, C. Chenli, X. Xu, T. Jung, and Y. Shi, “Exploiting computation power of blockchain for biomedical image segmentation,” in