1 Introduction
Nowadays people have come to realize the importance of information assets. However, information commodities differ substantially from most physical goods because they lose their value immediately after being revealed. Customers can look around a clothing store before making a purchase, while an information store must hide all information from the public; otherwise everyone learns the information and no one will pay for it. But without the chance to see the goods, how can a buyer decide whether to pay for a certain piece of information the way she shops in a clothing store?
Moreover, even if a piece of information is disclosed, an information buyer may still be confused, especially when the information is unverifiable (e.g. subjective opinions). A customer shopping for clothes can make a purchase based on the style of the clothing, while the information buyer may have no clue about the value of the disclosed information, which can be meaningless or even fake.
Furthermore, even if there exists a trusted center that is able to magically evaluate the quality of the unverifiable information without revealing it to the buyer, traders are “forced” to authorize this powerful center to access the information. A commonly trusted information trading center may gradually grow into an information monopolist, as has been observed in the cases of Google, Amazon, etc. Also, dependency on a trusted center severely limits the trade of information assets.
Despite the challenges mentioned above, information trading is prevalent in the contemporary world. A running example is a medical company (the buyer) that wants to buy labels (e.g. hard labels like benign/malignant, or soft labels like 90% benign) for multiple difficult medical images with unknown pathological truth from multiple hospitals (the sellers). To tackle the challenges discussed above, we aim to design an information trade protocol for unverifiable information that satisfies the following three properties:
 Trust-free

a trusted center is not required, and every trader, including the hospitals and the medical company, is rational/strategic rather than required to be honest;
 Truthful

1) each hospital (the seller) is incentivized to provide truthful and high-quality information, i.e. labels for the medical images; 2) the medical company (the buyer) is incentivized to pay for the information with a fixed payment function;
 Secure

except for the final trade price, no hospital can infer additional information about the other hospital’s information. The medical company obtains the information if and only if the trade protocol is successfully executed.
A recent line of research, peer prediction (e.g. [25, 33, 14, 21, 34, 22]), focuses on designing mechanisms that elicit truthful but unverifiable information. Peer prediction mainly measures each seller’s information quality based on her peer’s reported information and pays for the measured quality. In our running example with two hospitals A and B, the quality of their information is measured according to the “similarity” between their labels. The “similarity” measure is carefully designed such that the hospitals are incentivized to report honestly, and a hospital that reports meaningless information (e.g. labels randomly or always labels benign) obtains the lowest payment. However, peer prediction does not ensure the privacy of participants and implicitly requires a trusted center to compute the “similarity” and to carry out the exchange of payment and information.
A cryptographic protocol, secure multiparty computation (MPC [41, 19, 5, 6]; MPC is the standard cryptographic abbreviation of secure multiparty computation, and we use it in the rest of this paper), allows us to address the above security issue. MPC is a multiparty protocol in which the process of computation reveals nothing beyond the output from the perspective of the protocol participants. Thus, combined with peer prediction, MPC allows sellers to compute the quality of the information without violating the privacy of each seller. However, unlike the traditional MPC scenario where a party can be either honest or malicious, in our setting all parties are strategic.
Although MPC achieves privacy-preserving computation of information quality, i.e. the payment for the information, the trust-free fair exchange of payment and information is still not guaranteed. This gives rise to new incentive issues. A recent work [16] proposes a smart-contract-based solution for the trust-free fair exchange.
We borrow this smart-contract-based solution and combine it with two other cutting-edge tools, peer prediction and MPC, to propose a trust-free, truthful, and secure information trade protocol for unverifiable information, Smart InfoDealer (SMind). In our running example, with SMind, which does not require a trusted center, the medical company is able to buy high-quality information securely from high-quality hospitals at a fair price, and a low-quality hospital can neither earn money with poor information nor steal other high-quality hospitals’ information. Thus, in a new world where information becomes one of the most important assets in the economy, we believe our method will help realize a free and secure information trade scenario.
1.1 Road map
Section 2 discusses related work. Section 3 introduces basic game theory and cryptographic concepts as well as the three main building blocks of SMind. Section 4 formally introduces the information trade setting, the protocol design goals and our protocol, SMind. Section 5 shows our main theorem: SMind is trust-free, truthful, and secure. Section 6 shows a natural extension from the 2-seller setting to multiple sellers. Finally, Section 7 concludes by discussing the robustness and implementation of SMind.
2 Related work
Decentralized prediction market
With the boom of research on blockchain, decentralized information trading platforms have been developed on chain [32, 36, 2, 40], mostly addressing the decentralization of prediction markets. However, their settings are very different from ours. A prediction market is essentially a verifiable information trade platform and assumes that the ground truth will be revealed in the future. Also, our work enables instant payment, while participants in a prediction market must wait for the truth to be revealed.
Peer prediction
A recent line of research, peer prediction (e.g. [25, 33, 14, 21, 34, 22]), focuses on designing mechanisms that elicit truthful but unverifiable information. Unlike prediction markets, peer prediction does not assume the existence of ground truth. However, peer prediction does not consider security and implicitly requires a trusted center to compute the payment and to transfer the payment and the information. Our work raises and addresses the security issues of peer prediction.
Outsourced computation
Works on outsourced computation aim to verify the correctness of computation [38, 23, 15]. However, outsourced computation mostly addresses verifiable computation, while our setting considers unverifiable information. Also, these works do not take security issues into account and assume that there is a trusted judge for arbitration, while ours ensures privacy and removes the trusted center from the trading.
Secure Multiparty Computation
The notion of secure multiparty computation was first introduced as an open problem by Yao in the 1980s [41], and many protocols followed, such as [19, 5, 6]. Since the 2000s, building practical systems using general-purpose multiparty computation has become realistic due to algorithmic and computing improvements [24, 13, 29, 39]. MPC has been applied to a variety of areas, e.g. auctions [7, 8], [27, 26], to address security and privacy issues. In the current paper, we apply MPC to a new field, eliciting unverifiable information, to address the security issue. Works on strategic MPC introduce the notion of rational protocol design in MPC [17, 20, 3]. Our protocol can also be considered under the strategic MPC setting. By carefully designing economic rewards and punishments, the parties will be motivated to run MPC truthfully. However, we consider an information trade setting that is very different from previous strategic MPC works.
Blockchain and smart contract
First proposed in Bitcoin [28], blockchain is built upon consensus protocols, with security guaranteed by an honest majority on chain [18, 35]. Ethereum [9] first realized smart contracts on blockchain. A smart contract can be defined as an enforced agreement between distrusting parties [12]. In recent works on the fair exchange of verifiable digital goods [37, 16], a smart contract works as a trusted third party that verifies the incorrectness of the trade process when needed. We borrow this idea and combine it with peer prediction and MPC to propose a truthful, secure, trust-free protocol for the trade of unverifiable information.
3 Preliminaries
3.1 Basic game theory topics: extensive form, subgame, strategy, equilibrium concepts
Readers can refer to [30] for a detailed definition of basic game theory concepts.
A game consists of a list of players, a description of the players’ possible actions, a specification of what the players know at their turn and a specification of the payoffs of players’ actions.
A node $x$ in the extensive-form game defines a subgame if neither $x$ nor any of its successors is in an information set that contains nodes that are not successors of $x$. A subgame is the tree structure initiated by such a node and its successors.
A strategy is a complete plan for a player in the game, which describes the actions that the player would take at each of her possible decision points.
We define a strategy profile $s = (s_1, \dots, s_n)$ as a profile of all agents’ strategies. Agents play $s$ if, for all $i$, agent $i$ plays strategy $s_i$.
A (Bayesian) Nash equilibrium ((B)NE) consists of a strategy profile such that no agent wishes to change her strategy, since no other strategy increases her expected payment, given the strategies of the other agents and the information contained in her prior and her signal.
A strategy profile is a strong Nash equilibrium if it represents a Nash equilibrium in which no coalition, taking the actions of its complements as given, can cooperatively deviate in a way that benefits all of its members.
A strategy profile is a subgame perfect equilibrium (SPE) if it represents a Nash equilibrium of every subgame of the original game.
Definition 3.1 (Strong Subgame Perfect Equilibrium (Strong SPE)).
A strategy profile is a strong subgame perfect equilibrium if it is an SPE and a strong NE.
The backward induction procedure is the process of analyzing a game from the end to the beginning. At each decision node, one strikes from consideration any actions that are dominated, given the terminal nodes that can be reached through the play of the actions identified at successor nodes.
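The procedure can be illustrated on a small perfect-information game tree. The following is a minimal sketch, not part of SMind: the `Leaf`/`Node` classes, the toy game (a seller who moves first, a buyer who moves second) and all payoff numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union


@dataclass
class Leaf:
    payoffs: Tuple[float, ...]  # payoffs[i] is the payoff of player i


@dataclass
class Node:
    player: int  # index of the player who moves at this node
    children: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def backward_induction(tree, path=()):
    """Return (payoffs, plan) for a finite perfect-information game.

    plan maps the path of actions leading to each decision node to the
    action chosen there; ties are broken in favor of the first action.
    """
    if isinstance(tree, Leaf):
        return tree.payoffs, {}
    plan = {}
    best_action, best_payoffs = None, None
    for action, child in tree.children.items():
        payoffs, sub_plan = backward_induction(child, path + (action,))
        plan.update(sub_plan)
        # The mover keeps the action that maximizes her own payoff,
        # given optimal play in the subgame that follows.
        if best_payoffs is None or payoffs[tree.player] > best_payoffs[tree.player]:
            best_action, best_payoffs = action, payoffs
    plan[path] = best_action
    return best_payoffs, plan


# Hypothetical toy game: player 0 (a seller) moves first, player 1 (a buyer) second.
game = Node(player=0, children={
    "honest": Node(player=1, children={"accept": Leaf((2, 2)), "rebut": Leaf((2, -5))}),
    "cheat": Node(player=1, children={"accept": Leaf((3, -3)), "rebut": Leaf((-10, 5))}),
})
payoffs, plan = backward_induction(game)
```

In this toy game the buyer accepts after "honest" and rebuts after "cheat", so the seller, anticipating this, chooses "honest".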
3.2 Cryptographic Building Blocks
3.2.1 Basic cryptographic tools
Definition 3.2 (Encryption Scheme).
An encryption scheme is a tuple of three algorithms:

key generation: given a security parameter, generates a key

encryption: on input a message and a key, outputs a ciphertext

decryption: on input a ciphertext and a key, outputs a message
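To make the three-algorithm interface concrete, here is a toy sketch based on the one-time pad. This is an assumption made purely for illustration: the paper does not specify a concrete scheme, the function names `keygen`, `encrypt` and `decrypt` are hypothetical, and a one-time pad requires a fresh key as long as the message for every encryption.

```python
import secrets


def keygen(n_bytes: int) -> bytes:
    """Key generation: the key length plays the role of the security parameter."""
    return secrets.token_bytes(n_bytes)


def encrypt(key: bytes, message: bytes) -> bytes:
    """Encryption: XOR the message with a key of the same length."""
    if len(key) != len(message):
        raise ValueError("one-time pad needs a key as long as the message")
    return bytes(k ^ m for k, m in zip(key, message))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Decryption: XOR with the same key undoes the encryption."""
    return encrypt(key, ciphertext)


message = b"benign"
key = keygen(len(message))
ciphertext = encrypt(key, message)
recovered = decrypt(key, ciphertext)
```

The only property exercised here is correctness: decrypting a ciphertext with the key that produced it returns the original message.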
Definition 3.3 (Commitment Scheme).
A commitment scheme is a two-party protocol between a sender and a receiver. In the first phase, the sender commits to some value and sends the commitment to the receiver; in the second phase, the sender can open the commitment by revealing the value and some auxiliary information to the receiver, which the receiver uses to verify that the value she received is indeed the value the sender committed to during the first phase.
A commitment scheme is often defined as follows:

Commit: To commit to a message $m$, the sender chooses randomness Op (the details of this step depend on the concrete construction; we only provide a high-level description here) and generates the commitment Com from $m$ and Op. The sender sends Com to the receiver.

Open: To open a commitment, the sender sends $m$ and Op to the receiver. The receiver recomputes the commitment from $m$ and Op. If the result equals Com, the receiver accepts that this message is indeed the message that the sender previously committed to. Otherwise she rejects.
A commitment scheme should satisfy the following two properties:

Hiding: Given the commitment of a message $m$, the receiver should not learn anything about $m$.

Binding: Given the commitment of a message $m$, the sender should not be able to open the commitment to any message other than $m$.
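The commit/open interface can be sketched with a hash function. This is one common folklore construction and an assumption here, not necessarily the scheme used in SMind: the opening is a random 32-byte string and the commitment is the SHA-256 hash of the opening followed by the message. The function names `commit` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(message: bytes):
    """Commit phase: pick a random opening, then hash opening || message."""
    opening = secrets.token_bytes(32)  # fixed length, so the concatenation is unambiguous
    com = hashlib.sha256(opening + message).digest()
    return com, opening


def verify(com: bytes, message: bytes, opening: bytes) -> bool:
    """Open phase: the receiver recomputes the hash and compares it with com."""
    expected = hashlib.sha256(opening + message).digest()
    return hmac.compare_digest(com, expected)


com, opening = commit(b"label: benign")
accepted = verify(com, b"label: benign", opening)
rejected = verify(com, b"label: malignant", opening)
```

Informally, hiding rests on the random opening masking the message inside the hash, and binding rests on the difficulty of finding two inputs with the same hash; both are heuristic arguments for this sketch rather than proofs.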
3.2.2 Cryptographic protocol
Definition 3.4 (Secure Multiparty Computation).
[11] Secure multiparty computation (MPC) allows mutually distrusting parties to jointly compute an agreed-upon function of their private inputs without revealing anything beyond the output.
When executing an MPC protocol, parties communicate with each other (receiving and sending messages) and perform some local computation; they obtain the output of the agreed-upon function after multiple rounds.
Informally, an MPC protocol is secure if a participant cannot infer another party’s input by inspecting the list of messages she sees during the execution (for simplicity, we omit the details of the cryptographic definition here and discuss them rigorously in the appendix). This intuition is formalized by the real world/ideal world paradigm [19].
Definition 3.5 (Real World/Ideal World Paradigm).
The security of a cryptographic protocol is formalized through the notion of comparison.

Real world: the parties execute a prescribed protocol

Ideal world: there is a trusted agent with a complete privacy guarantee; each party sends her input to this trusted agent, and the trusted agent computes the agreed-upon function and then sends the output back to each party
With Definition 3.5, an MPC protocol is secure if the scenario where the parties execute the protocol is as secure as the scenario where the parties have a trusted agent. The formalized notion of this is simulation. We defer the detailed description of simulation to the appendix.
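As a toy illustration of the comparison, consider computing a sum of private inputs. The sketch below uses additive secret sharing, a standard textbook technique for parties that follow the protocol; it is an assumption chosen for illustration and is not the general protocol of [19]. The modulus `P` and the function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # public modulus; inputs are assumed to lie in [0, P)


def share(x: int, n: int) -> list:
    """Split x into n additive shares that sum to x modulo P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares


def real_world_sum(inputs: list) -> int:
    """Each party shares her input; parties only ever publish sums of shares."""
    n = len(inputs)
    dealt = [share(x, n) for x in inputs]  # dealt[i][j]: share sent by party i to party j
    partial = [sum(dealt[i][j] for i in range(n)) % P for j in range(n)]
    return sum(partial) % P


def ideal_world_sum(inputs: list) -> int:
    """A trusted agent receives all inputs and returns only the output."""
    return sum(inputs) % P


inputs = [3, 5, 9]
real_output = real_world_sum(inputs)
ideal_output = ideal_world_sum(inputs)
```

Both worlds produce the same output, and in the real world each party only sees shares that are individually uniformly random, which is the intuition that the simulation-based definition makes precise.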
Lemma 3.6.
[19] There exists a secure multiparty computation protocol.
3.3 Peer prediction
In the setting where information cannot be verified (e.g. subjective opinions, labels of medical images without known pathological truth), peer prediction focuses on designing reward schemes that help incentivize truthful, high quality information. In the peer prediction style information elicitation mechanisms, each information provider’s reward only depends on her report and her peers’ reports.
SMind uses two special peer prediction mechanisms as building blocks, proposed by Shnayder et al. [34] and Kong and Schoenebeck [21, 22]. We believe that other peer prediction mechanisms could be used as building blocks in a similar way. The information-theoretic idea proposed by Kong and Schoenebeck [21] gives a simple interpretation of the two mechanisms. At a high level, each information provider is paid an unbiased estimator of the “mutual information”, called the mutual information gain (MIG), between her reported information and her peer’s reported information, where the “mutual information” is information-monotone: any “data processing” on the two random variables will decrease the “mutual information” between them. Thus, to get the highest payment, the information provider should tell the truth, since truthful information contains the largest amount of information and applying a non-truthful strategy can be seen as a “data processing”. In our running example, a peer-prediction-style mechanism pays each hospital the “mutual information” between her labels and her peer’s labels.
We now introduce the two special multi-task peer prediction mechanisms (they require more than one task). One is CorrelationPP, which elicits discrete signals (e.g. hard label: benign/malignant), and the other is PearsonPP, which elicits forecasts (e.g. soft label: 70% benign, 30% malignant).
CorrelationPP/PearsonPP
Alice and Bob are assigned a priori similar questions (e.g. medical image labeling questions).
 Report

for each question $t$, Alice is asked to report $\hat{a}_t$ and Bob is asked to report $\hat{b}_t$. We denote Alice’s and Bob’s honest reports for all questions by $a$ and $b$ respectively, and their actual reports by $\hat{a}$ and $\hat{b}$.
 Payment

Alice and Bob are rewarded by the average “amount of agreement” between their reports on the same task, and penalized by the average “amount of agreement” between their reports on distinct tasks. When their reports are discrete signals, they are paid
(1)
When their reports are forecasts, given the prior distribution (e.g. the a priori prediction of benign/malignant, like 90% benign/10% malignant), they are paid
(2)
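For discrete signals, the verbal description above can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the paper's payment formula: the "amount of agreement" is taken to be 1 when two reports are equal and 0 otherwise, same-task agreement is averaged over all tasks, and distinct-task agreement is averaged over all ordered pairs of distinct tasks. The function name `correlation_payment` is hypothetical.

```python
from typing import Hashable, Sequence


def correlation_payment(reports_a: Sequence[Hashable],
                        reports_b: Sequence[Hashable]) -> float:
    """Average same-task agreement minus average distinct-task agreement."""
    n_tasks = len(reports_a)
    if n_tasks != len(reports_b) or n_tasks < 2:
        raise ValueError("need two report lists of equal length, at least 2 tasks")
    # Reward: fraction of tasks on which the two reports agree.
    same = sum(reports_a[t] == reports_b[t] for t in range(n_tasks)) / n_tasks
    # Penalty: fraction of ordered pairs of distinct tasks on which they agree.
    distinct = sum(
        reports_a[s] == reports_b[t]
        for s in range(n_tasks) for t in range(n_tasks) if s != t
    ) / (n_tasks * (n_tasks - 1))
    return same - distinct


labels = ["benign", "malignant", "benign", "malignant"]
flipped = ["malignant", "benign", "malignant", "benign"]
constant = ["benign"] * 4

agree_payment = correlation_payment(labels, labels)
permuted_payment = correlation_payment(flipped, flipped)
constant_payment = correlation_payment(constant, constant)
```

With these toy labels, two identical informative reports earn a positive payment, flipping both reports (a permutation) earns the same amount, and always reporting the same label earns zero because the penalty term cancels the reward.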
Assumption 3.7 (A priori similar and random order).
All questions have the same prior. All questions appear in a random order, independently drawn for both Alice and Bob.
Alice and Bob play a truthful strategy profile if their actual reports equal their honest reports. Informally, Alice and Bob play a permutation strategy profile if they both report the same permutation of their honest reports (e.g. hard label case: label benign when it is malignant and malignant when it is benign; soft label case: their honest forecasts are 70% benign, 30% malignant and 65% benign, 35% malignant, but they instead report 30% benign, 70% malignant and 35% benign, 65% malignant). Here we omit the formal definition of a permutation strategy profile as well as the additional assumptions required by CorrelationPP and PearsonPP, since they are not the focus of this paper. Interested readers are referred to Shnayder et al. [34] and Kong and Schoenebeck [21, 22]. We present the main property of CorrelationPP/PearsonPP: the payments of both mechanisms are maximized if both agents report truthfully, and the maximum is always strictly positive, so that the sellers are willing to participate in the game at the beginning.
Lemma 3.8 (CorrelationPP/PearsonPP is truthful).
[34, 21, 22] Under the a priori similar and random order assumption and mild conditions on the prior, in CorrelationPP/PearsonPP truth-telling is a strict equilibrium and each agent’s expected payment is strictly maximized when agents tell the truth, where the maximum is also strictly positive. Moreover, when agents play a non-permutation strategy profile, each agent’s expected payment is strictly less than under truth-telling.
3.4 Smart contract
A smart contract enforces the execution of a contract between untrusted parties. It allows credible and irreversible transactions without a trusted third party. The assurance is based on the consensus protocol of the blockchain.
In Ethereum, the most prominent smart-contract platform, contract code resides on the blockchain and is executed on a decentralized virtual machine, the Ethereum Virtual Machine (EVM). Each instruction in a smart contract is ideally executed by all miners on the chain. Transaction fees provide economic incentives for miners, who pack transactions into blocks and record them on chain, to execute the contract.
It has been shown that, without further assumptions, it is impossible to design protocols that guarantee complete fairness in the exchange procedure without a trusted third party [19, 31, 41]. The reliability of the blockchain is ensured by an honest majority on chain. In our protocol, the smart contract supports the following functionalities:
 Ledger

A contract stores the information the ledger needs. It runs with multiple parties and stores their balances and frozen funds in the contract.
 Freeze Funds

It should be able to freeze funds from accounts onto the chain.
 Unfreeze Funds

It should be able to unfreeze previously frozen funds back to accounts in the contract.
For simplicity, we do not elaborate on the functionalities of the smart contract in our protocol. Instead, we treat it as a public bulletin board with code running on it.
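The three functionalities can be mimicked off-chain by a small in-memory ledger. The sketch below is a plain Python simulation, not contract code: the class name `LedgerContract`, its method signatures, and the choice that unfrozen funds may be credited to a different account than the one that froze them are all assumptions made for illustration.

```python
class LedgerContract:
    """Toy ledger tracking a free balance and frozen funds per account."""

    def __init__(self, balances: dict):
        self.balance = dict(balances)
        self.frozen = {name: 0 for name in balances}

    def freeze(self, account: str, amount: int) -> None:
        """Move funds from an account's free balance into its frozen funds."""
        if amount < 0 or self.balance[account] < amount:
            raise ValueError("insufficient balance to freeze")
        self.balance[account] -= amount
        self.frozen[account] += amount

    def unfreeze(self, source: str, target: str, amount: int) -> None:
        """Release funds frozen by `source` into the free balance of `target`."""
        if amount < 0 or self.frozen[source] < amount:
            raise ValueError("insufficient frozen funds")
        self.frozen[source] -= amount
        self.balance[target] += amount

    def total(self) -> int:
        """Total funds held by the ledger; freeze and unfreeze must conserve it."""
        return sum(self.balance.values()) + sum(self.frozen.values())


ledger = LedgerContract({"buyer": 100, "seller1": 50, "seller2": 50})
ledger.freeze("buyer", 30)                  # the buyer locks a deposit
ledger.freeze("seller1", 20)                # a seller locks a deposit
ledger.unfreeze("buyer", "buyer", 30)       # deposit returned to its owner
ledger.unfreeze("seller1", "buyer", 20)     # deposit forfeited to the buyer
```

The `total` method makes the ledger invariant explicit: freezing and unfreezing only move funds between accounts and never create or destroy them.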
4 SMind: A Trust-free, Truthful, and Secure Information Trade Protocol
In this section, we introduce Smart InfoDealer (SMind), a trust-free, truthful, and secure protocol that elicits unverifiable information. We first focus on the information trade setting with one buyer and two sellers, and extend the setting to multiple sellers later.
We recall our example of medical image labeling here and recommend the readers use this running example as a background when they read this section: a medical company (the buyer) wants to buy labels (hard label: benign/malignant, soft label: 90% benign) for multiple difficult medical images with unknown pathological truth from two hospitals (the sellers).
We start by formally introducing the information trade setting and the definition of an information trade protocol (Section 4.1). Then we present three protocol design goals: trust-free, truthful, and secure (see informal definitions in Section 1). Figure 2 shows an overview of SMind. Finally, we present the pseudocode of SMind (Section 4.3) and show that it is trust-free, truthful, and secure (Section 5).
4.1 Model and setting
Information trade setting
A buyer wants to buy two sellers’ information for multiple a priori similar events/questions (e.g. labeling medical images). The opinion format may be a discrete signal (e.g. hard label: benign/malignant) from the set of possible outcomes of the event, or a forecast (e.g. soft label: 70% benign, 30% malignant), i.e. a distribution over the possible outcomes of the event. We denote the honest private opinions of the two sellers by $a$ and $b$ respectively and their actual reported opinions by $\hat{a}$ and $\hat{b}$ respectively.
We call the buyer and sellers traders. Each trader has a privacy cost. Each seller’s privacy cost represents her cost when she knows her private information has been revealed to anyone besides the buyer who pays. The buyer’s privacy cost represents her cost when the information she bought is revealed to anyone (e.g. the public) besides the information owner.
Definition 4.1 (Information Trade Protocol (ITP)).
Given an information trade setting, an information trade protocol is a protocol that allows the buyer to buy the two sellers’ opinions with a fixed payment function.
4.2 Protocol design goals: trust-free, truthful, and secure
We first give the formal definitions of trust-free and truthful ITPs.
Definition 4.2 (Trust-free ITP).
An ITP is trust-free if its execution requires neither the assumption that any trader is honest nor the existence of a trusted center.
In a trust-free ITP, the traders are allowed to be rational/strategic instead of being required to be honest. To encourage rational traders to behave honestly, a trust-free ITP should additionally be truthful, which we define now.
Traders play the truthful strategy in an ITP if they follow the ITP honestly. At a high level, truthful sellers provide truthful information and truthful buyers pay for the information with a fixed payment function. If there exists an equilibrium concept such that the ITP has truth-telling as the only equilibrium satisfying that concept, it is reasonable to say that the traders are encouraged to follow the ITP honestly, i.e. the ITP is truthful. We pick strong SPE (Definition 3.1) as the equilibrium concept.
Definition 4.3 (Truthful ITP).
An ITP is truthful if it has truth-telling as the only strong SPE.
We give an informal definition of security here and give a formal real world/ideal world style definition (see Definition 3.5) in the appendix.
Definition 4.4 (Secure ITP (informal)).
An ITP is secure if the information is revealed only to its owner and its buyer when traders follow the protocol. Apart from the output of the payment function, it is computationally infeasible for anyone else to obtain additional information, except with negligible probability.
4.3 SMind: description, assumptions, properties
We give the pseudocode of SMind here (Tables 1 and 2). In the pseudocode, we use $\mathrm{Op}_m$ to denote the opening of the committed message $m$, but this does not mean that $\mathrm{Op}_m$ depends on $m$; instead, it is chosen before generating the commitment of $m$.
We present several reasonable assumptions for our main theorem.
Assumption 4.5.
Initially, sellers do not know each other’s identity.
This assumption guarantees that the sellers cannot privately communicate with each other before the Compute Payment Function stage. We require this assumption to guarantee the truthful property of the peer prediction building block of SMind. Without this assumption, 1) the sellers could play an order collusion (e.g. answer yes/no for the questions with even/odd index) to get higher payments; 2) although permutation (e.g. label benign when it is malignant, label malignant when it is benign) cannot bring the sellers strictly higher payments (Lemma 3.8), a permutation strategy profile is much less risky when the above assumption does not hold.
The above assumption still allows the sellers to privately communicate with each other in the Compute Payment Function stage, since before this stage the protocol has already asked the sellers to securely commit the information needed for a possible future Rebuttal stage.
Assumption 4.6.
Traders cannot transfer money to each other after the protocol with the aid of a trusted judge outside the protocol.
It may seem possible for the buyer to collude with one seller to cheat for all deposits and divide them evenly after the protocol. However, this implicitly requires a trusted judge to enforce the division; otherwise the buyer will take all the money and refuse to give her accomplice a share.
To encourage the sellers to run MPC rather than calculate the payment in a non-private way (e.g. seller 1 sends her private information to seller 2 and seller 2 finishes all computations) in the Compute Payment Function stage, we need the following assumption.
Assumption 4.7.
Both sellers have privacy costs that are greater than the cost of running MPC.
However, this assumption is not necessary if we do not care which computation method the sellers use, since all other parts (e.g. payment submission) of SMind are still truthful and secure without it.
5 Proof of Main Theorem
We recommend that readers use Figure 2 as a reference when reading the proof. The trust-free property of SMind follows from its description. We show the truthful property and the secure property independently.
To show that SMind is truthful, i.e. has truth-telling as the unique strong SPE, we use the backward induction procedure (Section 3.1) and start from the last stage. We first show that when deposits are large enough, the rational buyer will raise a rebuttal as the protocol states. We then prove that the sellers’ optimal strategy is to compute the payment function for their committed answers and report accordingly. Then we show that it is optimal for the sellers to package and hash their honest answers, due to the truthful property of the peer-prediction-style payment functions. Finally, for the first stage, when questions are assigned, we show that the rational buyer will follow the protocol honestly.
The security of SMind is based on the security assumptions of its cryptographic building blocks, including encryption, hash, commitment scheme and MPC. We only provide the intuition of the proof here. For the formal proof of security, readers can refer to Appendix B.
5.1 Truthfulness proof: game theoretic analysis
We now show that with proper deposits, SMind has truth-telling as the unique strong SPE. We first list the possible costs in SMind: contract cost: ConCost/trader; additional rebuttal cost: RebCost; privacy cost (one per trader); MPC cost: MPCCost/seller; (lower bound of the) cost of attacking the cryptographic building blocks: AttackCost. Note that AttackCost is too large for a trader to ever attempt a real attack.
We identify the SPE via a backward induction procedure, starting from the last step, transactions or rebuttal.
5.1.1 Transactions or Rebuttal
We first show that when the deposits are sufficiently large, the buyer’s optimal strategy is to follow the protocol in the Open and Check Goods stage.
Definition 5.1 (Incorrect good).
The information good is incorrect if either of the following situations is true:
 infokeys

the revealed infokeys fail to open the encrypted info
 questions

the question set in the opened info is inconsistent with the committed question set
 payment computation

the payment computed from the opened info is inconsistent with the value the two sellers submitted.
Lemma 5.2 (Optimal strategy in transaction or rebuttal).
There exist proper deposits such that, after the buyer opens and checks the information, it is optimal for the buyer to raise a rebuttal when the good is incorrect and not to raise a rebuttal when the good is correct.
Proof.
We first claim that the smart contract is able to open and check the goods with previously committed information, unless the buyer breaks the binding property of the commitment scheme.
Claim 5.3.
In the Rebuttal stage, 1) if the good is incorrect, the buyer will win; 2) if the good is correct, the buyer will lose unless she spends AttackCost to break the binding property of the commitment scheme.
Before the Rebuttal stage, the encrypted information, the key, and the questions are all committed to the public (Figure 2). If the good is incorrect, the buyer can share her view with the public by submitting the true encrypted information, so that the public can also see that the good is incorrect. If the good is correct, the buyer will lose unless she breaks the binding property of the commitment scheme and submits fake encrypted information. Thus, the above claim is valid.
We now show the buyer’s optimal strategy via the following utility table, Table 3.
No rebuttal will always i) transfer the payment from the buyer to the sellers, ii) take the contract cost from the buyer.
If the good is incorrect, rebuttal will i) return the buyer her own deposit minus the contract cost, ii) bring the buyer all sellers’ deposits minus the contract costs, iii) cost the buyer the rebuttal cost and her privacy cost.
If the good is correct and the buyer does not attack the commitment scheme, the rebuttal will cost the buyer i) her deposit and ii) her privacy cost; otherwise the buyer will obtain the rebuttal benefits but lose the large attack cost.
The above table implies that proper deposits exist for the claim, since the attack cost is very large. ∎
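The case analysis above can be checked numerically. The sketch below encodes only the monetary flows listed in the text; it is not the paper's utility table. The assumptions are: the value of the opened information to the buyer is the same for every action at this decision point and is therefore omitted, each of the two sellers has the same deposit, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    payment: float             # amount transferred to the sellers without rebuttal
    con_cost: float            # contract cost per trader
    reb_cost: float            # additional rebuttal cost
    buyer_privacy_cost: float  # buyer's privacy cost
    attack_cost: float         # cost of breaking the commitment scheme
    buyer_deposit: float
    seller_deposit: float      # assumed equal for all sellers
    n_sellers: int = 2


def rebuttal_gain(c: Costs) -> float:
    """Buyer's net change when she wins a rebuttal."""
    own_deposit_loss = -c.con_cost                       # own deposit back, minus contract cost
    sellers_deposits = c.n_sellers * (c.seller_deposit - c.con_cost)
    return own_deposit_loss + sellers_deposits - c.reb_cost - c.buyer_privacy_cost


def buyer_utility(c: Costs, good_correct: bool, rebut: bool, attack: bool = False) -> float:
    if not rebut:
        return -c.payment - c.con_cost
    if not good_correct:
        return rebuttal_gain(c)
    if attack:
        return rebuttal_gain(c) - c.attack_cost
    return -c.buyer_deposit - c.buyer_privacy_cost


def best_action(c: Costs, good_correct: bool) -> str:
    options = {
        "no rebuttal": buyer_utility(c, good_correct, rebut=False),
        "rebuttal": buyer_utility(c, good_correct, rebut=True),
    }
    if good_correct:
        options["rebuttal with attack"] = buyer_utility(c, True, rebut=True, attack=True)
    return max(options, key=options.get)


costs = Costs(payment=10, con_cost=1, reb_cost=2, buyer_privacy_cost=3,
              attack_cost=1e9, buyer_deposit=50, seller_deposit=50)
action_if_incorrect = best_action(costs, good_correct=False)
action_if_correct = best_action(costs, good_correct=True)
```

With these hypothetical numbers the buyer rebuts exactly when the good is incorrect; shrinking the deposits (for example, a buyer deposit smaller than the payment) changes the outcome, which is why the lemma requires sufficiently large deposits.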
5.1.2 Compute Payment Function
We move backward to the Compute Payment Function stage and show that there exist proper deposits such that it is optimal for the sellers to honestly report $\mathrm{MIG}(\hat{a}, \hat{b})$, given that $\hat{a}, \hat{b}$ are the answers the sellers committed in the previous Submit Answers stage.
Lemma 5.4 (Optimal strategy in Compute Payment Function stage: report $\mathrm{MIG}(\hat{a}, \hat{b})$).
Given that the buyer plays rationally in the Transaction or Rebuttal stage, there exist proper deposits such that it is optimal for both sellers to report $\mathrm{MIG}(\hat{a}, \hat{b})$ and to reveal keys honestly, given that $\hat{a}, \hat{b}$ are the answers the sellers committed in the previous Submit Answers stage.
Moreover, if both sellers’ privacy costs are greater than the MPC cost, it is optimal for the sellers to run MPC to calculate the payment function.
Proof.
We first note that although the sellers can communicate with each other in this stage, they have no choice other than to compute the value of their committed data, report the value and reveal the keys honestly; otherwise they will lose either 1) the large attack cost (much larger than the highest payment they can obtain in SMind) for breaking the commitment scheme, or 2) all their deposits in rebuttal, given that the buyer plays rationally in the Transaction or Rebuttal stage.
Moreover, if both sellers’ privacy costs are greater than the MPC cost, then it is optimal for the sellers to run MPC to calculate the payment function; otherwise, although they may save the MPC cost, at least one of them will lose her privacy cost. ∎
5.1.3 Submit Answers
We move back to the Submit Answers stage and show that it is optimal for the sellers to package their honest answers $a, b$ as their answers, and to encrypt them and report the hash honestly.
Lemma 5.5 (Optimal strategy in Submit Answers stage: package $a, b$ as answers).
In the Submit Answers stage, given that all traders will play rationally in the following stages, it is optimal for both sellers to 1) package their honest answers $a, b$ as their answers; 2) encrypt all info (including the question set and answers) honestly; 3) commit the encrypted info honestly. It is also optimal for the buyer to check the sellers’ commitments honestly.
Proof.
We start from the last step of this stage, in which the buyer checks the sellers’ commitments. At this stage, the buyer cannot infer the private information (as shown in the security proof). Thus, the rational buyer will check the commitment of the encrypted information honestly, i.e., the buyer will agree when it is correct (otherwise, the buyer loses the chance to buy the information) and disagree when it is wrong (otherwise, the buyer will lose in the Rebuttal stage).
We move backward to the sellers’ parts: the information package and encrypted information commitment part. Given that all traders will behave rationally later (Lemma 5.2, 5.4), at this stage, the sellers must pick optimal to maximize .
Lemma 3.8 shows that when the buyer assigns questions in a random order, is maximized when and strictly maximized when there exists a permutation, such that is a permutation . Note that Assumption 4.5 guarantees that the sellers cannot communicate privately when they answer the questions, and that the sellers prefer to answer truthfully if the permutation strategy profile cannot bring them strictly higher payments. Moreover, the hiding property of the commitment scheme guarantees that the sellers cannot infer each other’s order from the public commitments of and .
Thus, it is optimal for the sellers to pick to maximize their payments . Finally, we show that the sellers should package, encrypt and commit honestly. If the sellers do not package, encrypt and commit honestly, then this hurts either the buyer or the sellers, as SMind is almost a zero-sum game between the buyer and the sellers as a group. Rational sellers will not hurt themselves, and when the deviation hurts the buyer, the rational buyer will disagree with the sellers, so that the sellers lose the chance to sell their information. Therefore, the rational sellers should package, encrypt and commit honestly. ∎
5.1.4 Assign Questions
We move back to the initial stage, Assign Questions, and show that it is optimal for the buyer to follow the protocol honestly here.
Lemma 5.6 (Optimal strategy in Assign Questions stage: truthful strategy).
In the Assign Questions stage, given that all traders will play rationally in the following stages, there exist proper deposits such that it is optimal for the buyer to follow the protocol honestly in this stage.
Proof.
We start from the last step of the Assign Questions stage, the sellers check the correctness of committed questions and . If they ignore the inconsistency between their questions and the commitments, then they will lose their deposits in rebuttal. If they wrongly disagree with the correct commitments, the contract will be rescinded, then they will waste ConCost and lose the chance to sell their information. Therefore, with sufficiently large deposits, the rational sellers will check the committed questions honestly.
When the buyer behaves dishonestly in the Assign Questions stage, there are two possibilities: 1) the buyer does not commit properly, i.e., she applies an incorrect commitment scheme, or applies the correct commitment scheme but commits the wrong questions; 2) the buyer does not assign the question set properly, for instance, not in a random order.
Either case hurts the buyer or the sellers since, treating the sellers as a unit, SMind is almost a zero-sum game. A rational buyer will not hurt herself, and if her deviation hurts the sellers, then the rational sellers will disagree with the commitment. Thus it is optimal for the buyer to follow the protocol honestly in the Assign Questions stage. ∎
After the above analysis, we are ready to finish our truthfulness proof. First, truth-telling is an SPE, but it is not the unique SPE: a profile in which the sellers report the same but wrong value in the Compute Payment Function stage can also constitute a “bad” SPE. However, these “bad” SPEs are not strong NE, since the sellers can jointly deviate to the truthful strategy profile to benefit both of them. Thus, truth-telling is the unique strong SPE here.
5.2 Security proof: cryptographic analysis
In this section, we prove that honest-but-curious participants who follow the protocol cannot learn additional information from the other traders.
 Security against buyer

Security against the buyer means that the buyer should not learn additional information about the sellers’ data before the sellers reveal the keys. Before the keys are revealed, the buyer only has the encryption of the private information and the commitment of the keys. Hence, by the security of the encryption scheme and the hiding property of the commitment scheme, SMind is secure against the buyer.
 Security against seller

Security against a seller means that a seller should not learn additional information about the other seller’s data (answers) during the whole protocol. By the security of MPC, a seller cannot learn any additional information about the other seller’s input by inspecting the communication transcript (the messages sent and received by a single seller during the MPC). By the hiding property of the commitment scheme, a seller cannot infer any additional information about the ciphertext of the other seller’s input (so even after she gets the decryption key, she still cannot obtain the other seller’s input).
 Security against public

Security against the public means that the public should not learn any information except the value of PayFunc submitted to the smart contract. The public can only see the commitment of the encrypted data and, by the hiding property of the commitment scheme, does not learn the exact value of the encrypted data. Consequently, even after the keys are revealed, the public cannot infer additional information.
6 Extension to multiple sellers
This section introduces a natural extension of SMind to the setting where there are multiple sellers. There are only two main differences. One is the payment function for each seller, and the other is the output of MPC protocol.
We use to denote the set of all sellers’ answers excluding seller . In the information trade setting with multiple sellers, the buyer pays each seller for
Then, in the Compute Payment Function stage, all sellers run MPC protocol to output a payment vector
such that . If the sellers cannot reach an agreement on the payment vector, then their deposits are taken. Once they reach an agreement, they proceed to the Transaction or Rebuttal stage, as in the two-seller version. By going through all the proofs carefully, one can check that this multiple-seller version of SMind is still trust-free, truthful, and secure.
Compared with the two-seller version, the multiple-seller version of SMind is more desirable in applications, due to the diversity in the payments. If there are three sellers, two high-quality and one low-quality, then the low-quality seller will be paid poorly since she has poor correlation with both other sellers, while each high-quality seller will be paid fairly since she has high correlation with the other high-quality seller. Moreover, the low-quality seller cannot obtain the high-quality sellers’ information, due to the security of SMind.
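The three-seller intuition above can be illustrated with a minimal Python sketch. It pays each seller the average of a pairwise score against every other seller. The score used here (observed agreement minus the agreement expected by chance from the marginals) is only a stand-in assumption and is not the payment function defined in this paper; the names `pair_score` and `payments` and the toy binary labels are hypothetical.

```python
def pair_score(a, b):
    """Observed agreement on the same questions minus chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    # Agreement two independent reporters with these marginals would reach.
    chance = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return observed - chance


def payments(answers):
    """Pay each seller her mean pairwise score against all other sellers."""
    pay = {}
    for i, a_i in answers.items():
        others = [a_j for j, a_j in answers.items() if j != i]
        pay[i] = sum(pair_score(a_i, a_j) for a_j in others) / len(others)
    return pay


# Toy data (assumed): two sellers who mostly agree and one who does not.
answers = {
    "high_1": [1, 0, 1, 1, 0, 0, 1, 0],
    "high_2": [1, 0, 1, 1, 0, 0, 1, 1],
    "low":    [0, 1, 1, 0, 0, 1, 1, 0],
}
pay = payments(answers)
for seller, value in pay.items():
    print(seller, round(value, 4))
```

In the protocol itself the payment vector would be produced inside the MPC session rather than in the clear; the sketch only shows why a seller who correlates poorly with every peer ends up with the lowest entry.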
7 Conclusion and Future Work
In an unverifiable information trade scenario, we propose a trust-free, truthful, and secure information trade protocol, SMind, by borrowing three cutting-edge tools: peer prediction, secure multiparty computation, and smart contracts.
A limitation of SMind is the lack of robustness. For simplicity of the game-theoretic analysis, we let the sellers play a coordination game in the Compute Payment Function stage. However, if one of the sellers is irrational, this will lead to bad results for all sellers. One future direction is to refine the protocol design to make it robust to such behavior.
Another direct line of future work is to implement our protocol as a smart contract to allow unverifiable information trade in different scenarios, for instance, data trade in machine learning.
References
 Adler et al. [2018] John Adler, Ryan Berryhill, Andreas Veneris, Zissis Poulos, Neil Veira, and Anastasia Kastania. 2018. Astraea: A decentralized blockchain oracle. arXiv preprint arXiv:1808.00528 (2018).
 Asharov et al. [2011] Gilad Asharov, Ran Canetti, and Carmit Hazay. 2011. Towards a game theoretic view of secure computation. In Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 426–445.
 Asharov and Lindell [2011] Gilad Asharov and Yehuda Lindell. 2011. A Full Proof of the BGW Protocol for Perfectly-Secure Multiparty Computation. In Electronic Colloquium on Computational Complexity (ECCC), Vol. 18. 10–1007.

 Beaver et al. [1990] Donald Beaver, Silvio Micali, and Phillip Rogaway. 1990. The round complexity of secure protocols. In Proceedings of the twenty-second annual ACM symposium on Theory of computing. ACM, 503–513.
 Ben-Or et al. [1988] Michael Ben-Or, Shafi Goldwasser, and Avi Wigderson. 1988. Completeness theorems for non-cryptographic fault-tolerant distributed computation. In Proceedings of the twentieth annual ACM symposium on Theory of computing. ACM, 1–10.
 Bogetoft et al. [2009] Peter Bogetoft, Dan Lund Christensen, Ivan Damgård, Martin Geisler, Thomas Jakobsen, Mikkel Krøigaard, Janus Dam Nielsen, Jesper Buus Nielsen, Kurt Nielsen, Jakob Pagter, et al. 2009. Secure multiparty computation goes live. In International Conference on Financial Cryptography and Data Security. Springer, 325–343.
 Bogetoft et al. [2006] Peter Bogetoft, Ivan Damgård, Thomas Jakobsen, Kurt Nielsen, Jakob Pagter, and Tomas Toft. 2006. A practical implementation of secure auctions based on multiparty integer computation. In International Conference on Financial Cryptography and Data Security. Springer, 142–147.
 Buterin et al. [2014] Vitalik Buterin et al. 2014. A next-generation smart contract and decentralized application platform. white paper (2014).
 Camenisch et al. [2018] Jan Camenisch, Manu Drijvers, Tommaso Gagliardoni, Anja Lehmann, and Gregory Neven. 2018. The Wonderful World of Global Random Oracles. In Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 280–312.
 Canetti [2001] Ran Canetti. 2001. Universally composable security: A new paradigm for cryptographic protocols. In Foundations of Computer Science, 2001. Proceedings. 42nd IEEE Symposium on. IEEE, 136–145.
 Clack et al. [2016] Christopher D Clack, Vikram A Bakshi, and Lee Braine. 2016. Smart Contract Templates: essential requirements and design options. arXiv preprint arXiv:1612.04496 (2016).
 Damgård et al. [2012] Ivan Damgård, Valerio Pastro, Nigel Smart, and Sarah Zakarias. 2012. Multiparty computation from somewhat homomorphic encryption. In Advances in Cryptology–CRYPTO 2012. Springer, 643–662.
 Dasgupta and Ghosh [2013] Anirban Dasgupta and Arpita Ghosh. 2013. Crowdsourced judgement elicitation with endogenous proficiency. In Proceedings of the 22nd international conference on World Wide Web. International World Wide Web Conferences Steering Committee, 319–330.
 Dong et al. [2017] Changyu Dong, Yilei Wang, Amjad Aldweesh, Patrick McCorry, and Aad van Moorsel. 2017. Betrayal, distrust, and rationality: Smart counter-collusion contracts for verifiable cloud computing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 211–227.
 Dziembowski et al. [2018] Stefan Dziembowski, Lisa Eckey, and Sebastian Faust. 2018. FairSwap: How to fairly exchange digital goods. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 967–984.
 Garay et al. [2013] Juan Garay, Jonathan Katz, Ueli Maurer, Bjorn Tackmann, and Vassilis Zikas. 2013. Rational protocol design: Cryptography against incentive-driven adversaries. In Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on. IEEE, 648–657.
 Garay et al. [2015] Juan Garay, Aggelos Kiayias, and Nikos Leonardos. 2015. The bitcoin backbone protocol: Analysis and applications. In Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 281–310.
 Goldreich et al. [1987] Oded Goldreich, Silvio Micali, and Avi Wigderson. 1987. How to play any mental game. In Proceedings of the nineteenth annual ACM symposium on Theory of computing. ACM, 218–229.
 Izmalkov et al. [2005] Sergei Izmalkov, Silvio Micali, and Matt Lepinski. 2005. Rational secure computation and ideal mechanism design. In Foundations of Computer Science, 2005. FOCS 2005. 46th Annual IEEE Symposium on. IEEE, 585–594.
 Kong and Schoenebeck [2016] Y. Kong and G. Schoenebeck. 2016. An Information Theoretic Framework For Designing Information Elicitation Mechanisms That Reward Truth-telling. ArXiv e-prints (May 2016). arXiv:cs.GT/1605.01021
 Kong and Schoenebeck [2018] Yuqing Kong and Grant Schoenebeck. 2018. Water from Two Rocks: Maximizing the Mutual Information. In Proceedings of the 2018 ACM Conference on Economics and Computation. ACM, 177–194.
 Lamport et al. [1982] Leslie Lamport, Robert Shostak, and Marshall Pease. 1982. The Byzantine generals problem. ACM Transactions on Programming Languages and Systems (TOPLAS) 4, 3 (1982), 382–401.
 Malkhi et al. [2004] Dahlia Malkhi, Noam Nisan, Benny Pinkas, Yaron Sella, et al. 2004. Fairplay-Secure Two-Party Computation System. In USENIX Security Symposium, Vol. 4. San Diego, CA, USA, 9.
 Miller et al. [2005] Nolan Miller, Paul Resnick, and Richard Zeckhauser. 2005. Eliciting informative feedback: The peer-prediction method. Management Science 51, 9 (2005), 1359–1373.
 Mohassel and Rindal [2018] Payman Mohassel and Peter Rindal. 2018. ABY 3: a mixed protocol framework for machine learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 35–52.
 Mohassel and Zhang [2017] Payman Mohassel and Yupeng Zhang. 2017. SecureML: A system for scalable privacy-preserving machine learning. In 2017 38th IEEE Symposium on Security and Privacy (SP). IEEE, 19–38.
 Nakamoto [2008] Satoshi Nakamoto. 2008. Bitcoin: A peer-to-peer electronic cash system. (2008).
 Nielsen et al. [2012] Jesper Buus Nielsen, Peter Sebastian Nordholt, Claudio Orlandi, and Sai Sheshank Burra. 2012. A new approach to practical active-secure two-party computation. In Advances in Cryptology–CRYPTO 2012. Springer, 681–700.
 Osborne et al. [2004] Martin J Osborne et al. 2004. An introduction to game theory. Vol. 3. Oxford university press New York.
 Pagnia and Gärtner [1999] Henning Pagnia and Felix C Gärtner. 1999. On the impossibility of fair exchange without a trusted third party. Technical Report. Technical Report TUDBS199902, Darmstadt University of Technology ….
 Peterson et al. [[n. d.]] Jack Peterson, Joseph Krug, Micah Zoltu, Austin K Williams, and Stephanie Alexander. [n. d.]. Augur: a Decentralized Oracle and Prediction Market Platform.
 Prelec [2004] Dražen Prelec. 2004. A Bayesian truth serum for subjective data. science 306, 5695 (2004), 462–466.
 Shnayder et al. [2016] Victor Shnayder, Arpit Agarwal, Rafael Frongillo, and David C Parkes. 2016. Informed truthfulness in multi-task peer prediction. In Proceedings of the 2016 ACM Conference on Economics and Computation. ACM, 179–196.
 Sompolinsky and Zohar [2015] Yonatan Sompolinsky and Aviv Zohar. 2015. Secure high-rate transaction processing in bitcoin. In International Conference on Financial Cryptography and Data Security. Springer, 507–527.
 Team [2017] Gnosis Team. 2017. GnosisWhitepaper. URL: https://gnosis.pm/assets/pdf/gnosiswhitepaper.pdf (2017).
 Teutsch and Reitwießner [2017] Jason Teutsch and Christian Reitwießner. 2017. A scalable verification solution for blockchains. url: https://people. cs. uchicago. edu/teutsch/papers/truebit pdf (2017).
 Walfish and Blumberg [2015] Michael Walfish and Andrew J Blumberg. 2015. Verifying computations without reexecuting them. Commun. ACM 58, 2 (2015), 74–84.
 Wang et al. [2017] Xiao Wang, Samuel Ranellucci, and Jonathan Katz. 2017. Authenticated garbling and efficient maliciously secure two-party computation. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 21–37.
 Yan and Wong [[n. d.]] Jeff Yan and Brian Wong. [n. d.]. Deaux: A Performant Decentralized Prediction Market. ([n. d.]).
 Yao [1986] Andrew Chi-Chih Yao. 1986. How to generate and exchange secrets. In Foundations of Computer Science, 1986., 27th Annual Symposium on. IEEE, 162–167.
Appendix A Cryptographic building blocks
A.1 Programmable Global Random Oracle
In our proof, we use a restricted programmable and observable global random oracle, a model of a perfect hash function returning uniformly random values. All parties in the protocol can query the global random oracle. Following the works of [10, 16], we are able to use an encryption scheme and a commitment scheme with this random oracle in our protocol. Figure 4 shows the ideal functionalities of the random oracle.
The programmability of the global random oracle gives the simulator strong power in the proof. It enables the simulator to send the buyer a garbage encryption or a garbage commitment without knowing the real encrypted or committed message, and then, once the real message is revealed, to program the random oracle so that the earlier garbage encryption or commitment decrypts or opens to the real message (see details in [16]). We will employ this programmability property several times in the security proof below. Note that the simulator succeeds except with negligible probability.
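The "commit to garbage first, program later" trick described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class `ProgrammableRandomOracle` and its methods are hypothetical names, the oracle is a lazily sampled table, the commitment is assumed to have the form H(message || opening), and the restrictions and observability of the model in [10, 16] are not captured.

```python
import os


class ProgrammableRandomOracle:
    """Lazily sampled random function that a simulator may program."""

    def __init__(self, out_len=32):
        self.out_len = out_len
        self.table = {}

    def query(self, x: bytes) -> bytes:
        # Sample a fresh uniform output the first time x is queried.
        if x not in self.table:
            self.table[x] = os.urandom(self.out_len)
        return self.table[x]

    def program(self, x: bytes, y: bytes) -> bool:
        # Programming fails if the point has already been defined.
        if x in self.table:
            return False
        self.table[x] = y
        return True


oracle = ProgrammableRandomOracle()

# Simulator: publish a garbage commitment before knowing the message.
garbage_commitment = os.urandom(32)

# Later the real message is revealed; choose an opening and program the oracle.
message = b"benign"
opening = os.urandom(32)
programmed = oracle.program(message + opening, garbage_commitment)

# A verifier recomputing H(message || opening) now sees the old commitment.
accepted = oracle.query(message + opening) == garbage_commitment
print(programmed, accepted)
```

Programming fails only if the point `message + opening` was queried before; with a long, freshly sampled opening this happens with negligible probability, which matches the caveat that the simulator succeeds except with negligible probability.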
A.2 Encryption Scheme
We first define the security of an encryption scheme (IND-CPA security), and then give a secure encryption algorithm.
Intuitively speaking, IND-CPA secure symmetric encryption guarantees that the encryptions of two strings are indistinguishable; moreover, an adversary (e.g. an eavesdropper on the communication channel between the buyer and the sellers) cannot distinguish the ciphertext of chosen by her in any case.
Definition A.1.
Indistinguishability under chosen-plaintext attack (IND-CPA)
Let be a symmetric encryption scheme, and let be a polynomial-time adversary who has access to an oracle. On input , responds with . We consider the following two experiments:
Then is an IND-CPA secure encryption scheme if
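The game behind Definition A.1 can be sketched in Python as follows. The sketch uses a common single-hidden-bit formulation, which may differ in presentation from the two experiments of the definition. The cipher (SHA-256 in counter mode as a stand-in for the random oracle) and the names `keystream`, `encrypt`, `replay_adversary` and `ind_cpa_wins` are assumptions made for illustration, not the paper's encryption algorithm, and the code is not meant for production use.

```python
import hashlib
import os


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Expand (key, nonce) with SHA-256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, msg: bytes, randomized: bool = True) -> bytes:
    # A fresh nonce makes encryption probabilistic; a fixed one makes it deterministic.
    nonce = os.urandom(16) if randomized else bytes(16)
    pad = keystream(key, nonce, len(msg))
    return nonce + bytes(m ^ p for m, p in zip(msg, pad))


def decrypt(key: bytes, ct: bytes) -> bytes:
    nonce, body = ct[:16], ct[16:]
    pad = keystream(key, nonce, len(body))
    return bytes(c ^ p for c, p in zip(body, pad))


def replay_adversary(oracle, m0, m1, challenge):
    """Ask the oracle for an encryption of m0 and compare with the challenge."""
    return 0 if oracle(m0) == challenge else 1


def ind_cpa_wins(randomized: bool, trials: int = 200) -> int:
    """Count how often the replay adversary guesses the hidden bit."""
    wins = 0
    m0, m1 = b"benign!!", b"malign!!"
    for _ in range(trials):
        key = os.urandom(32)
        oracle = lambda m: encrypt(key, m, randomized)
        b = os.urandom(1)[0] & 1                # challenger's hidden bit
        challenge = oracle(m1 if b else m0)
        wins += replay_adversary(oracle, m0, m1, challenge) == b
    return wins


print(ind_cpa_wins(randomized=False), ind_cpa_wins(randomized=True))
```

With deterministic encryption the replay adversary wins every trial, so such a scheme cannot be IND-CPA secure; with a fresh nonce per encryption this particular attack degrades to guessing.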
A.3 Commitment Scheme
Recall the definition of a commitment scheme in Section 3.2. Here we provide the commitment scheme used in our proof, constructed with the global random oracle.
Algorithm 3 shows that if is long enough, the chance for to distinguish and is negligible. Thus the hiding property is guaranteed. Algorithm 4 shows how to open the commitment. It is also hard to break the binding property, since
outputs uniformly distributed random numbers.
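A hash-based commitment of this kind can be sketched in Python as below. The sketch assumes the commitment is the oracle's output on the message concatenated with a random opening, and it substitutes SHA-256 for the global random oracle; the function names `commit` and `open_commitment`, the 32-byte opening length, and the concatenation order are illustrative assumptions rather than the exact content of Algorithms 3 and 4.

```python
import hashlib
import hmac
import os


def commit(message: bytes, opening_len: int = 32):
    """Return (commitment, opening) with commitment = H(message || opening)."""
    opening = os.urandom(opening_len)        # long random opening gives hiding
    commitment = hashlib.sha256(message + opening).digest()
    return commitment, opening


def open_commitment(commitment: bytes, message: bytes, opening: bytes) -> bool:
    """Recompute the hash and compare it with the published commitment."""
    expected = hashlib.sha256(message + opening).digest()
    return hmac.compare_digest(commitment, expected)


# Usage: a seller commits to her encrypted answers and opens them later.
c, r = commit(b"encrypted answers")
print(open_commitment(c, b"encrypted answers", r))   # True
print(open_commitment(c, b"different answers", r))   # False
```

Opening to a different message would require finding a second input with the same hash output, which is why binding rests on the oracle's outputs being uniformly distributed.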
A.4 Security definition: simulation for semi-honest adversaries
Simulation is a way of comparing what happens in the “real world” to what happens in an “ideal world”. The “ideal world” is usually secure by definition. If, for every adversary in the real world, there exists a simulator in the ideal world that can achieve almost the same attack as the adversary in the real world, then the protocol is said to be secure.
We now formally define security in the presence of semi-honest (“honest-but-curious”) adversaries, i.e. the definition of private [4]. At a high level, a protocol is private if the view of up to corrupted parties in a real protocol execution can be generated by a simulator.
For a protocol , the view of each party , is defined as her inputs, internal coin tosses and the messages she receives during an execution of the protocol . For each subset of parties , is defined as the union of the views of all parties in .
Definition A.2 (privacy of party protocols [4]).
Let be a probabilistic ary functionality that maps inputs to outputs, i.e., and let be a protocol. is private for if for every of cardinality at most , is private for in the sense that there exists a probabilistic polynomial-time algorithm such that for every input it holds that:

Privacy: is computationally indistinguishable with ’s views in protocol given input , i.e., ;

Correctness: when is a deterministic function, equals ’s output , when is a probabilistic function, ’s distribution equals ’s distribution.
is full-private for when is private for .
We introduce a common technique in security proofs, sequential composition, which will simplify the proof substantially. Informally, it says that if the subprotocols of a protocol are secure and satisfy a “sequential and isolated” condition, then the protocol is secure as well.
Lemma A.3 (Sequential Composition[11]).
Let denote ideal functionality, and let denote the real-world protocol such that 1) the subprotocols are called sequentially and 2) no messages are sent while are executed. (Here denotes a protocol which calls ideal functionality during its execution.) If is a secure hybrid-world protocol computing and each securely computes , then is a secure real-world protocol computing .
Appendix B Formal security proof
We present our formal security proof here. Note that SMind motivates all traders to behave honestly. Thus, we only define and prove security against semi-honest, i.e., “honest-but-curious”, adversaries. We first give the definition of an ideal ITP, which is secure by definition, and then show that there exist simulators that can simulate SMind using only views from the ideal ITP. This implies that SMind is secure, as SMind reveals no more information than the ideal ITP.
For simplicity of the proof, we employ the sequential composition technique here and first consider a middle protocol . We define protocol as the protocol that is the same as SMind except for the Compute Payment Function stage. In ’s Compute Payment Function stage, the sellers submit their infos to a trusted center, which finishes the payment computation privately and reveals the output publicly. Thus, replaces the MPC session in SMind by a trusted center. We will show that is computationally indistinguishable from the ideal world. Then, according to Lemma A.3 and Lemma 3.6, SMind is also computationally indistinguishable from the ideal world.
Definition B.1 (Ideal ITP).
An ideal ITP’s functionality:
 Sign Contract

Upon receiving deposits and the consistent payment functions from the traders, reveals the contract publicly.
 Assign Questions

Upon receiving question set from buyer, generates two random orders and sends to each seller privately.
 Submit Answers

Waits for the sellers to finish the questions and collects each seller’s info privately.
 Compute Payment Function

Upon receiving info from two sellers, computes PayFunc, and reveals the output to public.
 Make Transactions

Sends infos to the buyer and makes transactions based on the contract.
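The five stages of Definition B.1 can be mirrored by a small trusted-party object, sketched below in Python. Everything specific here is a simplifying assumption: the class name `IdealITP`, the seller identifiers, the representation of a seller's info as a dictionary from question to answer, and the toy payment function `toy_pay`, which merely counts agreements and is not PayFunc. Monetary transfers are omitted.

```python
import random


class IdealITP:
    """Trusted party mirroring the five stages of the ideal functionality."""

    def __init__(self, pay_func):
        self.pay_func = pay_func   # agreed payment function (hypothetical callable)
        self.public = {}           # everything revealed publicly
        self.infos = {}            # sellers' infos, held privately
        self.buyer_view = None     # what the buyer has received so far

    def sign_contract(self, deposits):
        self.public["contract"] = {"deposits": dict(deposits)}

    def assign_questions(self, questions, sellers=("seller_1", "seller_2")):
        # Each seller privately receives the questions in her own random order.
        return {s: random.sample(questions, len(questions)) for s in sellers}

    def submit_answers(self, seller, info):
        self.infos[seller] = info

    def compute_payment(self):
        # Only the output of the payment function becomes public.
        self.public["payment"] = self.pay_func(self.infos)
        return self.public["payment"]

    def make_transactions(self):
        # Only now does the buyer receive the infos.
        self.buyer_view = dict(self.infos)
        return self.buyer_view


def toy_pay(infos):
    a, b = infos["seller_1"], infos["seller_2"]
    return sum(a[q] == b[q] for q in a)


itp = IdealITP(toy_pay)
itp.sign_contract({"buyer": 10, "seller_1": 5, "seller_2": 5})
orders = itp.assign_questions(["q1", "q2", "q3"])
itp.submit_answers("seller_1", {"q1": 1, "q2": 0, "q3": 1})
itp.submit_answers("seller_2", {"q1": 1, "q2": 1, "q3": 1})
view_before = itp.buyer_view
payment = itp.compute_payment()
goods = itp.make_transactions()
print(payment, sorted(itp.public))
```

The point of the sketch is the information flow: before the last stage the buyer's view is empty and the public sees only the contract and the payment output, which is the baseline a real protocol's views are compared against.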
Definition B.2 (Security ITP (formal)).
An ITP is secure if the ITP fulfills the ideal ITP’s functionality, and in every stage, ITP is full-private for ideal ITP.
We now show that the middle protocol is secure by enumerating all possible subsets and showing that is private for ideal ITP in every stage.
Notations
We use the subscript of protocol to denote the stages is in. For instance, represents the protocol execution from Sign Contract stage to Compute Payment Function stage. When the real message is , we sometimes use to denote the simulated . For instance, we use to denote the simulation of . Although has , is usually a random string that is generated independent of , i.e., without the knowledge of .
In the simulation, we assume that the simulator in the ideal world can simulate the smart contract internally. The simulation of the smart contract is very similar to that of [16], and we omit its detailed description in our proof.
B.1 Security against the smart contract/public: private
Before the Compute Payment Function stage, the public’s view in is
However, in the ideal world, a smart contract neither sends messages to nor receives messages from the ideal functionality.
At a high level, the simulator can replace by garbage, generate keys and openings, and then simulate the above view accordingly. In detail, the simulator can simulate the view by generating random and then outputting
to simulate respectively, where denotes the simulated commitment for the question set. In fact, is a random string that is generated without the knowledge of . Since our commitment scheme is constructed based on the global random oracle, the output of the simulator is computationally indistinguishable from the real view, except with negligible probability, according to the property of the global random oracle.
In the Compute Payment Function stage, the public additionally views the output of the payment function while the simulator is also given this output in the ideal world.
After the Open and Check Goods stage, the public in additionally has
The simulator can reveal the previously used
without being distinguished since it is consistent with the simulator’s previous output.
B.2 Security against the buyer: private
Before the Open and Check Goods stage, the buyer’s view in is
We start to construct the simulator to simulate the buyer’s view with only the input message and security parameter.
first generates two random orders and four random values
as openings. Note here we do not need to know the message to generate its opening since we just use to denote the opening for ’s commitment and in fact, the randomness is independent of .
Then generates four random strings , . The four random strings are generated independently and uniformly at random without any knowledge of infos or infokeys. We will use the programmability of our global random oracle to program these four random strings such that they correspond to the real infokeys and infos revealed later (Section A.1). Since both our commitment scheme and our encryption scheme are constructed based on the global random oracle, the output of is computationally indistinguishable from , except with negligible probability, according to the property of the global random oracle.
After the Open and Check Goods stage, the buyer’s view in additionally has
In the ideal world, the simulator will know infos in this stage. To simulate the view in , we construct infokeys and employ the programmability of the global random oracle (Section A.1) such that the previously generated garbage can be decrypted to real infos and the previously committed garbage can be opened as , with openings . Then the simulator outputs . By construction, these outputs are indistinguishable from the buyer’s real view in , since both are consistent with the previous views in their respective worlds.
B.3 Security against one seller: private
Before the Compute Payment Function stage, each seller ’s view is
In this stage, is given . The simulator will simulate the view by using the given inputs and additionally generating several independent random strings to simulate other information (we still add wide hat to the real views to denote the simulated ones), which is computationally indistinguishable with the real world view only except a negligible probability, according to the property of the global random oracle.
In the Compute Payment Function stage, the seller will additionally view the output of the payment function, which is also viewed by the simulator in the ideal world. Thus, the seller’s view can still be simulated here.
After the Open and Check Goods stage, the seller needs to submit