Dragoon: Private Decentralized HITs Made Practical

03/23/2020 ∙ by Yuan Lu, et al. ∙ New Jersey Institute of Technology

With the rapid rise in popularity of blockchain, decentralized human intelligence tasks (HITs) have been proposed to crowdsource human knowledge without relying on vulnerable third-party platforms. However, the inherent limits of blockchain cause decentralized HITs to face a few "new" challenges. For example, the confidentiality of solicited data turns out to be the sine qua non, though it was an arguably dispensable property in the centralized setting. To ensure the "new" requirement of data privacy, existing decentralized HITs use generic zero-knowledge proof frameworks (e.g. SNARK), but scarcely perform well in practice, due to the inherently expensive cost of generality. We present a practical decentralized protocol for HITs, which also achieves fairness between requesters and workers. At the core of our contributions, we avoid the powerful yet highly costly generic zk-proof tools and propose a special-purpose scheme to prove the quality of encrypted data. By various non-trivial statement reformations, proving the quality of encrypted data is reduced to efficient verifiable encryption, thus making decentralized HITs practical. Along the way, we rigorously define the ideal functionality of decentralized HITs and then prove security in the ideal-real paradigm. We further instantiate our protocol to implement a system called Dragoon, an instance of which is deployed atop Ethereum to facilitate an image annotation task used by ImageNet. Our evaluations demonstrate its practicality: the on-chain handling cost of Dragoon is even less than the handling fee of Amazon's Mechanical Turk for the same ImageNet HIT.




I Introduction

Crowdsourcing empowers open collaborations over the Internet. A remarkable case is to gather knowledge by human intelligence tasks (HITs). In a HIT, a requester specifies a few questions and lets some workers answer, such that the requester solicits answers while the workers get paid. Since HITs were first introduced in Amazon’s MTurk [MTurk], they have been widely adopted, say to solicit training datasets for machine learning [feifei, ng, video]. Notably, ImageNet [Imagenet], an impactful deep learning benchmark, was created by thousands of HITs, and laid stepping stones for the deep learning paradigm.

Nevertheless, both academia and industry [ABI13, sdhc, mturkbotpanic, PWC15, mturkgolden, SZ15, imagenet-details, mturkbot, turkopticon, MCN16] realize that the broader adoption of HITs is severely impeded in practice by the serious security concerns of free-riding and false-reporting: (i) on the one hand, HITs suffer from low-quality answers, as misbehaving workers or even bots try to reap rewards without making real efforts [sdhc, mturkbotpanic]; (ii) on the other hand, many real-world practices allow the requester to reject low-quality answers [PWC15, mturkgolden, SZ15], but this lets many requesters in the wild arbitrarily reject answers in order to collect data without paying [turkopticon].

The issues of free-riding and false-reporting become the major obstacles to achieving broader adoption of HITs that are joined by mutually distrustful users [turkopticon], and therefore raise a basic requirement of fairness in HITs, namely, the requester pays a worker, iff the worker puts forth a qualified answer. Many studies [sdhc, mturkbotpanic, PWC15, SZ15, mturkgolden, ABI13, imagenet-details] characterize the purpose and then design proper incentives and payment policies for the needed fairness.

Notwithstanding, most traditional solutions to fairness [sdhc, mturkbotpanic, PWC15, SZ15, mturkgolden, ABI13, imagenet-details] fully trust a de facto centralized third-party platform to enforce the payment policies for the basic fairness requirement in HITs. Unfortunately, putting trust in a single party turns out to be fragile and elusive in practice, as reflected by the tremendous compromises, outages and misfeasance of real-world crowdsourcing platforms [MCN16, turkopticon, WazeDown]. For instance, one of the most popular crowdsourcing platforms, MTurk, is biased and allows corrupted requesters to reap data without paying [MCN16, turkopticon]. Worse still, all well-known weaknesses of overtrusted third parties, such as single points of failure [WazeDown] and tremendous privacy leakage [Apple], remain serious vulnerabilities in the special case of crowdsourcing. Moreover, third-party platforms impose expensive handling fees; for example, MTurk charges a handling fee of up to 45% of overall incentives [mturkfee].

New challenges in decentralization. Recognizing those drawbacks of centralized crowdsourcing, recent attempts [zebralancer, duan2019aggregating] initiated decentralized crowdsourcing through the newly emerged blockchain technology. (Throughout the paper, we let "blockchain" refer to a permissionless blockchain, e.g. Ethereum mainnet, that is open to any Internet node.) Their aim is to “simulate” a virtual platform that can be trusted to enforce the carefully designed payment policies, without suffering from the vulnerabilities of fully centralized systems.

However, as shown in [KZZ16, KMS16], decentralization atop open blockchain brought about a few “new” security challenges that can render the incentives of HITs completely ineffective [zebralancer].

Privacy as a basic requirement. In particular, due to the transparency of blockchain [KZZ16, KMS16], once some answers are submitted, any malicious worker can simply copy and re-submit them to earn rewards without making any real effort, which immediately allows free-riding and breaks the basic fairness of HITs. Namely, the transparent blockchain provides all workers a new choice: running a simple automated script to “copy-and-paste” other answers in the blockchain, which was infeasible in previous centralized systems. More seriously, with this new path to free-riding in mind, rational workers would wait to copy instead of making any real effort. Thus a sort of “tragedy of the commons” occurs, and no one will respond with independent answers [hardin1968tragedy, stewart2017crowdsourcing, huberman2009crowdsourcing, david2001tragedy]. In other words, straightforwardly decentralized crowdsourcing arguably loses all basic utility and fails to gather anything meaningful!

So privacy becomes indispensable in decentralized crowdsourcing systems, rather than a bonus property.

State-of-the-art & open problem. To overcome blockchain’s inherent limits, prior art [zebralancer] proposes the general outsource-then-prove framework for private decentralized HITs. It enables the requester to prove the quality of answers that are encrypted to her, without revealing the actual answers. Such a proof becomes the crux to ensure privacy, and deters both false-reporting and free-riding. Now the blockchain needs to verify proofs, so a feasibility challenge arises, considering that on-chain computational resources are too limited to support any large proof or costly verification.

For the above reasons, prior work relies on some generic zero-knowledge proof (zk-proof) framework that is succinct in proof size and efficient to verify, in particular SNARK [qap, BCG13, pinocchio], to reduce the on-chain verification cost. (Although the rise of Intel SGX seems an enticing alternative to SNARK for going beyond many limits of blockchain by remote attestations [Ekiden], recent Foreshadow attacks [foreshadow] allow the adversary to forge “attestations” by stealing the attestation key hardcoded in any SGX Enclave, which seriously challenges the already heavy assumption of “trusted” hardware and makes trusting SGX in practice even more elusive.)

Nonetheless, generic zk-proofs such as SNARK inevitably trade performance for generality, so prior private decentralized HITs suffer from an unbearable off-chain proving cost and a still significant on-chain verifying expense:

  • Infeasible proving (off-chain). The proving of generic zk-proofs (e.g., SNARK) seems inherently complex, due to the burdensome NP-reduction for generality. In particular, prior study [zebralancer-full] reported 56 GB memory and 2 hours are needed to prove whether an encrypted answer coincides with the majority of all encrypted submissions at a very small scale, e.g., at most eleven answers. Such a performance prevents the previous protocol from being usable by any normal requesters using regular PCs.

  • Costly verification (on-chain). Existing blockchains (e.g. Ethereum) can feasibly verify only a few types of generic zk-proofs such as SNARK, whose verification needs to compute a dozen expensive pairings over elliptic curves [qap, BCG13, pinocchio]. So the on-chain verification of these zk-proofs is not only computationally costly, but also financially expensive. Currently in Ethereum, 12 pairings already cost roughly 500k gas [EIP1108], and verifying a SNARK proof costs even more (about half a US dollar).
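For a rough sense of the on-chain numbers above, the following sketch computes the gas charged by Ethereum's pairing-check precompile under the EIP-1108 schedule (a 45,000 base plus 34,000 per pairing) and converts gas into US dollars. The function names are hypothetical, and the gas price and ETH price are inputs the reader must supply; none are taken from the paper.

```python
PAIRING_BASE_GAS = 45_000      # EIP-1108 base cost of the pairing check
PAIRING_PER_PAIR_GAS = 34_000  # EIP-1108 cost per pairing


def pairing_gas(num_pairings: int) -> int:
    """Gas charged by the pairing-check precompile for `num_pairings` pairs."""
    return PAIRING_BASE_GAS + PAIRING_PER_PAIR_GAS * num_pairings


def gas_to_usd(gas: int, gas_price_gwei: float, eth_price_usd: float) -> float:
    """Convert a gas amount to USD (1 gwei = 1e-9 ETH)."""
    return gas * gas_price_gwei * 1e-9 * eth_price_usd


print(pairing_gas(12))  # 453000
```

Twelve pairings come to 453,000 gas under this schedule, which is in the same range as the 500k gas figure quoted above; the dollar cost then depends entirely on the gas price and exchange rate at the time.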

Given the insufficiencies of the state-of-the-art, the following critical problem remains open:

How to design a practical private decentralized HITs protocol for crowdsourcing human knowledge?

Our contributions. To answer the above unresolved problem, we present a practical private decentralized HITs protocol for the major tasks of crowdsourcing human knowledge. In sum, our core technical contributions are three-fold:

  • To achieve the practical protocol for private decentralized HITs, we carefully explore various non-trivial optimizations to avoid the cumbersome generic-purpose zero-knowledge framework, and reduce the protocol to a special-purpose verifiable encryption. As such, we attain concrete improvements by orders of magnitude, regarding both the proving and verification:

    • For proving, our approach is two orders of magnitude better than generic zk-proof. (Generic zk-proof refers to zk-SNARK in our context, since the only generic zk-proof that can be feasibly supported by existing blockchains is zk-SNARK.) In particular, for the same HIT, the proving cost of our protocol is only 50 MB memory and 10 ms running time.

    • For verifying, our result improves upon the generic solution by nearly an order of magnitude. The on-chain cost of verifying a proof for answer quality is reduced to 180k gas in Ethereum (much smaller than verifying SNARK proofs and typically a few US cents).

  • We further implement our protocol to instantiate a practical private decentralized crowdsourcing system, Dragoon, which can be deployed atop many real-world blockchains such as Ethereum.

    We use Dragoon to launch a concrete HIT adopted by ImageNet [imagenet-details] to solicit large-scale image annotations atop Ethereum. To handle the task, Dragoon attains an on-chain (handling) cost of $2 at the time of writing. In comparison, for the same task, the handling fee of MTurk is at least $4 currently [mturkfee, mturkprice].

    Our result provides an insight that the on-chain handling fee (characterizing the users’ financial expense) in the decentralized setting can approximate or even fall below the handling fee charged by centralized platforms. This hints that de facto users can financially benefit from decentralized crowdsourcing, which does not contradict the common belief [Sedgwick] that decentralization is more expensive regarding the overall system’s cost.

  • Along the way, we present, for the first time, the ideal functionality of decentralized HITs. This rigorous security model clearly defines what a secure HIT shall be, and allows us to prove security against subtle adversaries in the blockchain under the simulation-based paradigm.

    In contrast, existing decentralized HITs [zebralancer, zebralancer-full] have quite different property-based definitions of “security”, which at least leaves no well-defined benchmark for comparison. Even worse, many of them are flawed, as they fail to capture all aspects of subtle adversaries in blockchain; say, they allow a corrupted requester to reap data without paying, if given the standard ability of adversarially re-ordering message deliveries. By comparison, our simulation-based security model precisely defines the security against subtle attacks in the blockchain.

Challenges & our techniques. The major challenge of making private decentralized HITs practical is that the blockchain must learn the quality of some encrypted answers, namely, obtain some properties of what a few ciphertexts are encrypting. The state-of-the-art [zebralancer, zebralancer-full] proposed to reduce the problem to generic zk-proofs, by observing that the requester can decrypt the answers and then prove the quality of answers to the blockchain. But such a generic approach inherently causes impractical expenses, because of the underlying heavyweight NP-reduction for generality.

To conquer the above challenge, we take a different path that deviates from generic zk-proof frameworks to explore a concretely efficient solution. At the core of our private decentralized HITs protocol, we present a special-purpose non-interactive proof scheme to efficiently attest the quality of encrypted answers. Such an approach gets rid of heavyweight general-purpose zk-proof frameworks and thus avoids the inefficiency caused by generality.

Fig. 1: The path to realizing efficient proofs for encrypted answers’ quality.

The ideas behind our efficient proving scheme are a variety of special-purpose optimizations to squeeze performance by removing needless generality, such that we reduce the problem of proving encrypted answers’ quality from generic-purpose zk-proof to particular verifiable encryption. As shown in Fig 1, our core ideas are highlighted as:

  • Abstracting real-world HITs. The first step is to abstract an incentive widely adopted by real-world HITs, namely, the only one incorporated by Amazon’s MTurk [mturkgolden]. In the incentive, some golden standard challenges (i.e., questions with known answers) [sdhc] are mixed with other questions, so the quality of a worker is determined by her performance on the golden standards. (This concrete incentive turns out to be powerful: it can capture most HITs in Amazon’s MTurk, according to the official tutorial [mturkgolden], and was also adopted by the impactful ImageNet [imagenet-details] to create a large-scale deep learning benchmark.)

    We carefully formulate the problem of proving the quality of encrypted answers for the above concrete incentive. Proving the quality of a worker is thus reducible to a well-defined two-party problem, in which the verifier needs to output the performance of the worker on a set of golden standard questions, given only a set of ciphertexts answering these golden standard challenges.

    Nevertheless, solving this two-party problem is still challenging, as it needs to compute a property of what a set of ciphertexts are encrypting. The generic version of the issue falls into multi-input functional encryption [goldwasser2014multi, boneh2015semantically], which is well known for its hardness, and has no (nearly) practical solution so far. We thus conduct the following optimizations to further reduce the problem.

  • Statement reformation. The major obstacle of removing the generic-purpose cryptographic frameworks is the arithmetic relations (i.e., some relationship unrepresentable in the algebraic domain). So we dedicatedly reform the statement of proving the quality of encrypted answers, to remove all arithmetic relations.

    We reform the statement mainly in two ways. First, we prove an upper bound on each worker’s quality instead of proving the exact number, which is a relaxation in general, but does not sacrifice any utility in our context where the reward is an increasing function of quality. Second, we realize that given the system’s public knowledge, a tiny and constant portion of each worker’s answer (i.e., the part answering gold standards) is already leaked, since this little portion becomes simulatable from the public knowledge; thus we explicitly relax our goal to leak this “already-leaked” information.

    To sum up, the above reformations allow us to remove the needless generality of proving answer quality, so that we can reduce the problem to standard verifiable encryption without giving up security or utility.

  • Concretely efficient proving scheme. Following the above optimizations, the problem eventually is reduced to verifiable encryption, which becomes representable in concrete algebraic relations. Along the way, we present a certain variant of verifiable encryption that is concretely tailored for the scenario of HITs where the plaintexts are short, and thus squeeze most performance out of it.

    This completes our special-purpose design to boost private decentralized HITs, practically.
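The upper-bound relaxation described under "Statement reformation" can be illustrated with a small sketch. It rests on one plausible reading, which is an assumption here and not confirmed by this section: the requester discloses the golden-standard positions a worker answered wrongly, each with a proof of correct decryption, and the verifier subtracts the verified mismatches from the number of golden standards. The names `quality_upper_bound` and `is_valid_proof` are hypothetical, and `is_valid_proof` is only a stand-in for verifying a decryption proof.

```python
def quality_upper_bound(golden_answers, proven_mismatches, is_valid_proof):
    """Upper-bound a worker's quality from verified mismatches.

    golden_answers: dict, golden-standard index -> known answer
    proven_mismatches: dict, index -> (revealed_answer, proof)
    is_valid_proof: callable(index, revealed_answer, proof) -> bool,
        a placeholder for checking a decryption proof (assumption)
    """
    mismatches = 0
    for index, (revealed, proof) in proven_mismatches.items():
        if index not in golden_answers:
            continue  # only golden-standard positions count
        if revealed == golden_answers[index]:
            continue  # a correct answer is not a mismatch
        if is_valid_proof(index, revealed, proof):
            mismatches += 1
    return len(golden_answers) - mismatches
```

Under this reading, a requester who omits a mismatch can only raise the bound, and since the reward is increasing in quality, doing so never reduces what the worker is paid.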

II Other Related Work

Besides existing private decentralized HITs [zebralancer-full, zebralancer] discussed earlier, here we briefly review some pertinent generic cryptographic frameworks and discuss their insufficiencies in the concrete context of private decentralized crowdsourcing.

Privacy-preserving blockchain. A variety of studies [KMS16, ZNP15, solidus] consider general frameworks for privacy-preserving blockchains and smart contracts. These approaches are powerful in their generality, yet are expensive for concrete use-cases in practice. For example, Hawk [KMS16] leverages generic zk-proofs to keep the blockchain private, but incurs expensive proving costs. As such, it is unclear how to leverage these generic frameworks to design a concretely efficient protocol for the special purpose of crowdsourcing [zebralancer].

Fair MPC using blockchain. Decentralized crowdsourcing is a special-purpose fair MPC using blockchain. Kiayias, Zhou and Zikas [KZZ16] consider the generic version of fair MPC in the presence of blockchain, but it is unclear how to adopt their generic protocol in practice without expensive computational costs. Recently, increasing interest has focused on special-purpose variants of fair MPC aided by blockchain. For example, [bentov2017instantaneous, kumaresan2015use, david2018kaleidoscope] consider poker games. But these special-purpose solutions are over-tuned for distinct scenarios, and it is unclear how to use them for private decentralized crowdsourcing.

Multi-input functional encryption. The core problem of private decentralized crowdsourcing is to let the blockchain learn the quality of encrypted answers, which is straightforwardly reducible to multi-input functional encryption (MIFE) [goldwasser2014multi]. But MIFE relies on indistinguishability obfuscation [goldwasser2014multi] or multi-linear maps [boneh2015semantically], which we currently do not know how to instantiate under standard cryptographic assumptions.

III Preliminaries

Here we briefly review some relevant cryptographic notions. Following convention, we let $\leftarrow_\$$ denote uniform sampling and $\approx_c$ denote computational indistinguishability.

Cryptocurrency ledger. The cryptocurrency maintained atop the blockchain instantiates a global bookkeeping ledger to deal with “coin” transfers transparently. It can be called by an ideal functionality (i.e., a standard model of a so-called smart contract [KMS16, KZZ16]) as a subroutine to assist conditional payments. Formally, the cryptocurrency can be seen as an ideal functionality interacting with a set of parties and the adversary; it stores the balance of each party, and handles the following oracle queries [KMS16, perun]:

  • . On input from an ideal functionality (i.e. a smart contract), check whether and proceed as follows: if the check holds, let and , send to every entity; otherwise, reply with .

  • . On input from an ideal functionality (i.e. a smart contract), check whether and proceed as follows: if that is the case, let and , send to every entity.
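The message formats and symbols of the two ledger queries did not survive in the text above, so the following is only a toy model under stated assumptions: the first query is taken to lock coins from a party into a contract-held escrow when the party's balance suffices, and the second to release escrowed coins to a party. The names `Ledger`, `freeze`, `pay`, `escrow`, and the reply strings are all hypothetical.

```python
class Ledger:
    """Toy cryptocurrency ledger with freeze/pay queries (illustrative only)."""

    def __init__(self, balances):
        self.balances = dict(balances)  # party id -> coins
        self.escrow = 0                 # coins held on behalf of the contract
        self.log = []                   # messages "sent to every entity"

    def freeze(self, party, amount):
        """Move `amount` from `party` into escrow if the balance suffices."""
        if amount < 0 or self.balances.get(party, 0) < amount:
            return "nofund"  # assumed failure reply
        self.balances[party] -= amount
        self.escrow += amount
        self.log.append(("frozen", party, amount))
        return "frozen"

    def pay(self, party, amount):
        """Move `amount` from escrow to `party` if the escrow suffices."""
        if amount < 0 or self.escrow < amount:
            return None  # the text specifies no reply in this case
        self.escrow -= amount
        self.balances[party] = self.balances.get(party, 0) + amount
        self.log.append(("paid", party, amount))
        return "paid"
```

The public `log` mirrors the transparency of the ledger: every successful transfer is visible to all entities.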

Commitment scheme. The commitment scheme is a two-phase protocol between a sender and a receiver. In the commit phase, the sender “hides” a string behind a commitment string using a blinding value, and transmits the commitment to the receiver. In the reveal phase, the receiver gets the string and the blinding value as the opening of the commitment, and runs the opening algorithm to output 0 (reject) or 1 (accept). We require computational hiding and computational binding. The former requires that the commitments of any two strings are computationally indistinguishable. The latter means the receiver would not accept an opening that reveals a different string, except with negligible probability.
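As an illustration of the two phases, here is a minimal hash-based commitment. This construction is a common choice and an assumption here; the text does not fix an instantiation. The function names are hypothetical, and hiding and binding rest on modeling SHA-256 as a random oracle.

```python
import hashlib
import hmac
import secrets


def commit(message: bytes):
    """Commit phase: return (commitment, blinding) for `message`."""
    blinding = secrets.token_bytes(32)  # fixed length keeps parsing unambiguous
    commitment = hashlib.sha256(blinding + message).digest()
    return commitment, blinding


def open_commitment(commitment: bytes, message: bytes, blinding: bytes) -> int:
    """Reveal phase: return 1 (accept) if the opening matches, else 0 (reject)."""
    expected = hashlib.sha256(blinding + message).digest()
    return 1 if hmac.compare_digest(commitment, expected) else 0
```

A requester could, for example, publish `commit(...)` of the golden standards when posting a task and open it after the task is done.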

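Decisional Diffie-Hellman (DDH). The DDH problem is to tell whether $z = ab$ or $z \leftarrow_\$ \mathbb{Z}_q$, given $(g, g^a, g^b, g^z)$, where $a, b \leftarrow_\$ \mathbb{Z}_q$ and $g$ is a generator of a cyclic group $\mathbb{G}$ of order $q$. The DDH assumption states that $(g, g^a, g^b, g^{ab}) \approx_c (g, g^a, g^b, g^z)$. We assume the DDH assumption holds throughout the paper.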

Verifiable encryption. We consider a verifiable public key encryption scheme (VPKE) consisting of a tuple of algorithms for key generation, encryption, decryption, proving and verifying.

In short, key generation sets up a pair of encryption-decryption algorithms, which work under the public key and the private key respectively. We require any such pair to be a public key encryption scheme satisfying semantic security. For presentation simplicity, we also use a single symbol to denote the public-secret key pair. Moreover, for any ciphertext, the proving algorithm explicitly takes as input the private key and the ciphertext, and outputs a message with a proof; the verifying algorithm explicitly takes as input the public key, the ciphertext, the message and the proof, and outputs 1/0 to accept/reject the statement that the ciphertext decrypts to the message. Besides semantic security, the scheme also satisfies the following extra properties:

  • Completeness. , for and ;

  • Soundness. Given any public key and any ciphertext, no P.P.T. adversary can produce a proof that fools the verifying algorithm into accepting that the ciphertext decrypts to a message other than its actual decryption, except with negligible probability;

  • Zero-knowledge. The proof can be simulated by a P.P.T. simulator on input only public knowledge (the public key, the ciphertext and the message), which ensures the protocol leaks nothing more than the truth of the statement that the ciphertext decrypts to the message.
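To make the interface concrete, the sketch below instantiates verifiable decryption with exponential ElGamal and a Fiat-Shamir Chaum-Pedersen proof of discrete-log equality. This is a textbook construction chosen for illustration and is not claimed to be the paper's exact scheme. The group parameters are toy-sized and insecure, subgroup membership checks are omitted, and all function names are hypothetical. Plaintexts are assumed to be short, so decryption recovers the message by a small brute-force search.

```python
import hashlib
import secrets

# Toy group: P = 2Q + 1 with P, Q prime; G generates the order-Q subgroup.
# These sizes are for illustration only and offer no security.
P, Q, G = 467, 233, 4


def _challenge(*values) -> int:
    """Fiat-Shamir challenge derived from the proof transcript."""
    data = ",".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def keygen():
    k = secrets.randbelow(Q - 1) + 1
    return pow(G, k, P), k  # (public key h, private key k)


def encrypt(h, m):
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P


def prove_decrypt(k, h, c, max_message=16):
    """Decrypt c and prove that log_G(h) == log_c1(s), where s = c1^k."""
    c1, c2 = c
    s = pow(c1, k, P)
    gm = (c2 * pow(s, P - 2, P)) % P  # G^m = c2 / s
    m = next(x for x in range(max_message) if pow(G, x, P) == gm)
    w = secrets.randbelow(Q - 1) + 1
    a, b = pow(G, w, P), pow(c1, w, P)
    e = _challenge(G, h, c1, c2, s, a, b)
    z = (w + e * k) % Q
    return m, (s, a, b, z)


def verify_decrypt(h, c, m, proof) -> int:
    """Return 1 if the proof shows that c decrypts to m under h, else 0."""
    c1, c2 = c
    s, a, b, z = proof
    e = _challenge(G, h, c1, c2, s, a, b)
    ok = (
        pow(G, z, P) == (a * pow(h, e, P)) % P
        and pow(c1, z, P) == (b * pow(s, e, P)) % P
        and c2 == (pow(G, m, P) * s) % P
    )
    return 1 if ok else 0
```

The verifier needs only a handful of modular exponentiations and no pairings, which is the kind of cost profile that makes a special-purpose proof attractive on-chain.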

Random oracle. We treat the cryptographic hash function as a global and programmable random oracle [RO] throughout the paper.

Simulation-based paradigm. To formalize and prove security, a real world and an ideal world can be defined and compared: (i) in the real world, there is an actual protocol among the parties, some of which can be corrupted by an adversary; (ii) in the ideal world, an “imaginary” trusted ideal functionality replaces the protocol and interacts with honest parties and a simulator. We say that the protocol securely realizes the ideal functionality if, for every P.P.T. adversary in the real world, there exists a P.P.T. simulator in the ideal world such that the two worlds cannot be distinguished; that is, no P.P.T. distinguisher can attain non-negligible advantage in distinguishing “the joint distribution over the outputs of honest parties and the adversary in the real world” from “the joint distribution over the outputs of honest parties and the simulator in the ideal world”.

Moreover, we consider a static adversary who is only allowed to corrupt some parties before the protocol starts. Protocols proven secure in the real/ideal paradigm can be composed sequentially, due to the transitivity of security reductions [goldreich2009foundations].

The advantage of simulation-based paradigm is that all desired behaviors of the protocol can be precisely described by the ideal functionality. Remarkably, the approach has been widely adopted to analyze decentralized protocols [KMS16, KZZ16, bentov2017instantaneous] to capture subtle adversaries in the decentralized setting.

IV Formalization of Decentralized Human Intelligence Tasks

This section rigorously defines our security model, by giving the ideal functionality of Human Intelligence Tasks (HITs) that captures the security/utility requirements of the state-of-the-art HITs in reality [feifei, ng, video, Imagenet, ABI13, sdhc, mturkbotpanic, PWC15, mturkgolden, SZ15, imagenet-details, mturkbot, turkopticon, MCN16]. Our security modeling sets forth a clear security goal: the HITs in the real world shall be as “secure” as the HITs in an admissible ideal world.

Reviewing the HITs in reality. Let us briefly review the HITs adopted in reality [feifei, ng, video, Imagenet, ABI13, sdhc, mturkbotpanic, PWC15, mturkgolden, SZ15, imagenet-details, mturkbot, turkopticon, MCN16], before presenting our abstraction of their ideal functionality.

Parties & process flow. There are two explicit roles in a HIT, i.e., the requester and some workers. (There is also an implicit registration authority (RA), which is required by real-world crowdsourcing platforms, e.g. MTurk, to prevent the adversary from forging a large number of identities, a.k.a. Sybil attacks. In practice, RAs can be instantiated by (i) the platform itself (e.g., MTurk), or (ii) a certificate authority that provides an authentication service. Our solution can inherit these established RAs, and we therefore omit such implicit RAs, assuming all identities are granted. If the participants are interested in anonymity, an anonymous-yet-accountable authentication scheme [zebralancer, anonpass] can be used; however, those are orthogonal techniques outside the scope of this paper.) The requester, who has a unique identifier, can post a task to collect a certain number of answers. In the task, the requester also promises a concrete reward policy. Each worker, who also has a unique identifier, submits an answer and expects to receive the reward.

Task design. A HIT consists of a fixed-length sequence of multiple choice questions. The answer to each question must lie in a particular range that is pre-specified when the task is published.

The above HIT design is based on batched choice questions, which follows real-world practices [feifei, ng, video, Imagenet, ABI13, sdhc, mturkbotpanic, PWC15, mturkgolden, SZ15, imagenet-details, mturkbot, turkopticon, MCN16] to remove ambiguity, thus letting workers precisely understand the task. For example, Fei-fei Li et al. [feifei, RL10, imagenet-details] used the technique to create the deep learning benchmark ImageNet, and Andrew Ng et al. [ng] suggested it for language annotations.

Answer quality. The quality of an answer is induced by a quality function that takes the answer submitted by a worker together with some secret parameters of the requester. Its output is said to be the quality of that worker.

The above abstraction captures the quality-based incentive mechanism adopted by real-world HITs in Amazon’s MTurk [SZ15, imagenet-details, mturkgolden, mturkbot]. For example, a task consists of a number of questions, a few of which are golden-standard questions that are “secretly” mixed in. The quality of a worker can be computed from her accuracy on the golden-standard questions.

Formally, the parameter of the quality function is a pair $(G, G_s)$, where $G$ represents the randomly chosen indexes of the golden-standard questions, and $G_s = \{s_j\}_{j \in G}$ represents the known answers of the golden-standard questions. Following the real-world practices [SZ15, imagenet-details, mturkgolden, mturkbot], the quality of an answer $a_i = (a_{i,j})_j$ is:

$$\mathsf{Quality}(a_i; G, G_s) = \sum_{j \in G} [a_{i,j} = s_j]$$

where $[\cdot]$ is the Iverson bracket, which converts any logic proposition to 1 if the proposition is true and 0 otherwise.
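As a small sketch of this quality function, the code below counts how many golden-standard questions a worker answered correctly. The argument names are illustrative, and the known answers are assumed to be given as a mapping from golden-standard index to answer.

```python
def quality(answers, golden_indexes, golden_answers):
    """Count the golden-standard questions answered correctly.

    answers: list of the worker's answers, one per question
    golden_indexes: indexes of the golden-standard questions
    golden_answers: dict, golden-standard index -> known answer
    """
    return sum(
        1 if answers[j] == golden_answers[j] else 0  # Iverson bracket
        for j in golden_indexes
    )


worker_answers = [0, 2, 1, 1, 3]
print(quality(worker_answers, [1, 3, 4], {1: 2, 3: 0, 4: 3}))  # 2
```

In the example, the worker matches the golden standards at positions 1 and 4 but not at position 3, so the quality is 2.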

Defining the decentralized HITs’ functionality. Now we are ready to present our security notion of HITs in the presence of cryptocurrency. We formalize the ideal functionality of HITs in the ledger-hybrid model as shown in Fig 2. Intuitively, it abstracts a special-purpose multi-party secure computation, in which: (i) a requester recruits workers to crowdsource some knowledge, and (ii) each worker gets a pre-defined payment from the requester, if submitting an answer meeting the minimal quality standard.

In greater detail, the ideal functionality of HITs immediately implies the following security properties:

The ideal functionality of HIT

Given accesses to oracle , the functionality interacts with a requester , a set of workers and adversary .

Phase 1: Publish Task
  • Upon receiving from , leak to , until the beginning of next clock period, proceed with the following delayed executions:
    • send to , if return :
      • store , B , , , and as internal states;
      • initialize , and goto next phase;

Phase 2: Collect Answers
  • Upon receiving from , leak the message to , till receiving from , continue with the delayed executions down below:
    • if , do nothing;
    • else, , send to , leak to , go to phase 3 if .

Phase 3: Evaluate Answers
  • Upon entering this phase, leak all received messages to , until the beginning of next clock period, proceed to run the following delayed executions for each :
    • if receiving from , proceed as: check whether , if that is the case, send to , and leak to all entities including ;
    • if receiving from , proceed as: if , leak to all entities, otherwise send to .
    • else, no message from was received, proceed as: if , send to .

Fig. 2: The (stateful) ideal functionality of coin-aided HIT . The blue text shows is proceeding synchronously as the adversary can delay message deliveries up to next clock period [KMS16, KZZ16]; the brown text means that has to proceed asynchronously as if the adversary can arbitrarily delay messages.

  • Fairness. Our ideal functionality captures a strong notion of fairness: the worker gets paid if and only if s/he puts forth a qualified answer (instead of copying and pasting from somewhere else). In greater detail, the requester specifies a sequence of multi-choice questions, each of which has a fixed range of options, and which contain some gold-standard challenges. (We explicitly consider the number of options per question and the number of gold-standard challenges to be small constants in the HITs ideal functionality. Such modeling follows real-world practices [feifei, ng, video, Imagenet, ABI13, sdhc, mturkbotpanic, PWC15, mturkgolden, SZ15, imagenet-details, mturkbot, turkopticon, MCN16]: in practice, each multi-choice question in a HIT has few options, and a HIT contains few gold-standard challenges.) Each worker has to (i) meet a pre-specified quality standard and (ii) submit answers within the range of options, in order to receive the pre-defined payment.

  • Auditability of gold-standards. The choice of golden standards is up to the requester, so it becomes a realistic worry that a malicious requester uses some bogus answers for the golden standard questions. The ideal functionality aims to abstract the best prior art [MCN16, turkopticon] regarding this issue so far, which means the golden standards become publicly auditable once the HIT is done. This abstraction “simulates” the ad-hoc reputation systems maintained by the MTurk workers to grade the reputations of the MTurk requesters in reality [MCN16, turkopticon].

  • Confidentiality. This means no worker can learn any information about other workers’ answers that would give it an advantage during the course of protocol execution. Without the property, workers can copy and paste to free-ride, so it is a minimal requirement to ensure the usefulness of decentralized HITs. Our ideal functionality naturally captures the property.
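The fairness condition in the first item above can be read as a simple payout rule: a worker receives the pre-defined reward exactly when every answer lies within the allowed options and the quality meets the threshold. The sketch below uses hypothetical names (`payment_due`, `num_options`, `threshold`, `reward`), assumes options are numbered from 0, and abstracts away encryption, the ledger and timing.

```python
def payment_due(answers, num_options, golden, threshold, reward):
    """Return the reward if the answer qualifies, else 0.

    answers: list of the worker's answers, one per question
    num_options: number of options per question (numbered from 0)
    golden: dict, golden-standard index -> known answer
    """
    in_range = all(0 <= a < num_options for a in answers)
    quality = sum(1 for j, s in golden.items() if answers[j] == s)
    return reward if in_range and quality >= threshold else 0
```

A worker who copies nothing and answers two of three golden standards correctly would be paid under a threshold of 2 and unpaid under a threshold of 3.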

Adversary. We consider a probabilistic polynomial-time adversary in the real world. It can corrupt the requester and/or some workers statically, before the real-world protocol begins. The uncorrupted parties are said to be honest. Following the standard blockchain model [KMS16, KZZ16], we also abstract the ability of the real-world adversary to control the communication (between the blockchain and honest parties) as follows: (i) it follows the synchrony assumption [GKL15, KMS16], namely, there is a global clock [GKL15, KMS16], and the adversary can delay any message sent to the blockchain up to an a-priori known time (w.l.o.g., up to the next clock period); (ii) the adversary can manipulate the order of so-far-undelivered messages sent to the blockchain, which is known as the “rushing” adversary.

Expressivity of the ideal functionality . Our ideal functionality of HITs is rather expressive, as it not only captures the elegant state-of-the-art of collecting image/language/video annotations [feifei, RL10, imagenet-details, ng, mturkbot, video, SZ15], but also reflects the common scenario of crowdsourcing human knowledge. Consider the following example: Alice is running a small startup and aims to provide a service that visualizes the availability of street parking. Unfortunately, at any moment Alice only knows the availability of street parking at very few spots, since she cannot afford the cost of monitoring every corner of the city. This little a-priori knowledge is Alice’s “golden standards”, and it is too little to bootstrap a useful service. So Alice can crowdsource more street parking information from a few workers, using her few golden standards to control the quality of the solicited data.

In light of the above discussion, it is fair to say that our abstraction is expressive enough to capture most real-world practices of crowdsourcing human knowledge (e.g. HITs in MTurk).

V HITs Protocol and Security Analysis

This section elaborates our practical protocol for decentralized HITs. We begin with an important building block for proving the quality of encrypted answers. Then we showcase the smart contract functionality that interacts with the workers and the requester. Later, the detailed protocol is given in the presence of . We finally prove that our protocol securely realizes the ideal functionality of HITs.

V-A Proof of quality of encrypted answer ()

The core building block of our novel decentralized protocol allows the requester to efficiently prove the quality of encrypted answers. We formally define this concrete purpose to set forth the notion of , and then present an efficient reduction from it to verifiable encryption ().

Defining . The problem we are addressing here is to prove that: an encrypted answer can be decrypted to obtain some s.t. the quality of is , without leaking anything other than , and the parameters of quality function.

To capture the problem, the state-of-the-art [zebralancer, KMS16] adopts the standard notion of zk-proof in order to support generic quality measurements. Different from existing solutions, we particularly tailor the notion of zk-proof to obtain a fine-tuned notion of for the widely adopted quality function defined in §IV. Namely, we consider where is the index of gold-standards and is the ground truth of golden standards, and aim to remove the unnecessary generality in the concrete setting.

Precisely, given the quality function and any established public key encryption scheme , we can define as a tuple of the following algorithms :

  1. . Given the encrypted answer , the quality , and the golden standards , it outputs a proof attesting is the quality of ; the algorithm explicitly takes the secret decryption key as input;

  2. . It outputs 0 (reject) or 1 (accept), according to whether is a valid proof attesting is the actual quality of ; the algorithm explicitly takes the public encryption key as input;

Moreover, shall satisfy the following properties:

  • Completeness. is complete, if for any , , , and s.t. , there is ;

  • “Upper-bound” soundness. is upper-bound sound, if for any , , , and , for P.P.T. , there is , where is a negligible function in ; so it is computationally infeasible to produce a valid proof, if is not the upper bound of the quality of what is encrypting;

  • “Special” zero-knowledge. Conditioned on and the range of elements in being small constants, for any , , , and , there exists a P.P.T. simulator that can simulate the communication scripts of protocol on input only , , , , and .

Rationale behind the finely-tuned abstraction. The notion of is defined to remove needless generality in the special case of HITs. Compared to the state-of-the-art notion [zebralancer], is more promising to be efficiently constructed, as it brings the following definitional advantages:

  • We adopt upper-bound soundness to prove the upper bound of quality instead of proving the exact quality of each worker. This tuning stems from a basic fact: the reward of a worker is an increasing function of quality, so the upper bound of the worker’s quality exactly reflects the well-deserved reward of the worker.

  • Another major difference is the relaxed special zero-knowledge, which means: is zero-knowledge, when and are small, so anything simulatable by the gold standards can be leaked. Nevertheless, these conditions are prevalent in the special context of HITs [feifei, ng, video, Imagenet, ABI13, sdhc, mturkbotpanic, PWC15, mturkgolden, SZ15, imagenet-details, mturkbot, turkopticon, MCN16], because represents the few golden standard questions, and the range of represents the options of each multiple-choice question in HITs, both of which are small in reality.

In sum, even though is seemingly over-tuned, it essentially coincides with the generic zk-proof of the quality of encrypted answers in the context of HITs.

Construction and security analysis. Here is an efficiency-driven way to construct with the quality function defined in §IV. We reduce the problem to the standard notion of verifiable encryption. More precisely, given an established verifiable encryption scheme , for can be constructed as illustrated in Fig 3.

Public: , , , ,

for each in :if :output

for each in :if :output 0if :output 0output

Fig. 3: The construction of for the quality defined in §IV.
Lemma 1.

Given any verifiable encryption , the algorithm in Fig 3 satisfies the definition of regarding the quality function defined in §IV.


Proof. (sketch) The completeness is immediate from the definition of the quality function, the correctness of encryption and the completeness of . To prove the upper-bound soundness, assume for contradiction that an adversary breaks it; then the adversary can immediately be leveraged to break the soundness of , which leads to a contradiction. The special zero-knowledge is also clear: since and the range of are small constants, we can construct a P.P.T. simulator that invokes at most a polynomial number of subroutines [hazay2010note] to obtain proofs, thus allowing to internally craft a simulated proof. ∎
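The reduction to verifiable encryption can be illustrated with a minimal sketch. It assumes, as the reference to §IV and the ImageNet example suggest, that quality is the number of gold-standard questions answered correctly. The names `prove_quality`, `verify_quality`, `mock_encrypt`, `mock_prove_dec` and `mock_verify_dec` are hypothetical, and the mock scheme is a transparent, insecure stand-in for a real verifiable encryption scheme; only the bookkeeping around it is the point here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface of a verifiable encryption scheme (an assumption):
#   prove_dec(ct)            -> (plaintext, proof)   requester side, uses the secret key
#   verify_dec(ct, m, proof) -> bool                 public, uses the public key
Revealed = List[Tuple[int, int, object]]


def prove_quality(cts: Sequence[object], gold: Dict[int, int],
                  prove_dec: Callable) -> Tuple[int, Revealed]:
    """Reveal, with a decryption proof, every gold-standard answer that is wrong."""
    revealed: Revealed = []
    for i, truth in gold.items():
        answer, proof = prove_dec(cts[i])
        if answer != truth:
            revealed.append((i, answer, proof))
    # Each revealed wrong answer lowers the claimed upper bound on quality by one.
    return len(gold) - len(revealed), revealed


def verify_quality(cts: Sequence[object], gold: Dict[int, int], claimed: int,
                   revealed: Revealed, verify_dec: Callable) -> bool:
    """Accept only if every revealed answer is provably decrypted and truly wrong."""
    seen = set()
    for i, answer, proof in revealed:
        if i not in gold or i in seen:      # only distinct gold-standard positions count
            return False
        seen.add(i)
        if not verify_dec(cts[i], answer, proof):
            return False
        if answer == gold[i]:               # a correct answer cannot lower the bound
            return False
    return claimed == len(gold) - len(revealed)


# --- Insecure mock scheme, for illustration only ---
def mock_encrypt(m: int) -> Tuple[str, int]:
    return ("ct", m)

def mock_prove_dec(ct: Tuple[str, int]) -> Tuple[int, str]:
    return ct[1], "proof"

def mock_verify_dec(ct: Tuple[str, int], m: int, proof: object) -> bool:
    return ct[1] == m and proof == "proof"
```

Note that the verifier never sees the answers to non-gold questions, and it learns a gold-standard answer only when that answer is wrong, which matches the intuition behind the "special" zero-knowledge relaxation.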

The HITs contract functionality
Given accesses to , interacts with , , and .

Phase 1: Publish Task
  Upon receiving from , leak the message and to , until the beginning of next clock, proceed with the delayed executions down below:
    send to , if returns :
      store , B , , , , and initialize , send to all entities, and goto phase 2-a

Phase 2-a: Collect Answers (Commit phase)
  Upon receiving from , leak the message and to , then proceed with the following delayed executions until the beginning of next clock, with consulting to re-order all received messages:
    for each received message (sent from ):
      if and :
        let
    if , send to all entities, and goto the phase

Phase 2-b: Collect Answers (Reveal phase)
  Upon entering this phase, leak all received messages and their senders to , till the next clock period, proceed as:
    for each :
      if receiving the message from such that :
      else
    send to all, and goto the next phase

Phase 3: Evaluate Answers
  Upon entering this phase, leak all received messages and their senders to , till the next clock period, proceed as:
    if receiving from , such that :
      for each :
        if receiving from :
          send to , if or
        else if receiving from :
          send to , if or
        else if , send to
    otherwise, for each , send to

Fig. 4: The ideal functionality of the (stateful) HITs contract.

V-B HIT contract and HIT protocol

Now we are ready to present our concretely efficient decentralized protocol for HITs. Our design centers around a smart contract , which is formally described in Fig 4. The contract is the crux to make the best use of the rather limited abilities of the blockchain, so that our protocol securely realizes the ideal functionality . Given contract , our HITs protocol can be defined among the requester, the workers and the contract, as formally illustrated in Fig 5. Informally, our HITs protocol proceeds as follows:

  1. Publish task. The requester announces her public key , and publishes a task of multi-choice questions to crowdsource answers for the task. Each question in is specified to have some options in . The task mixes in some golden standard questions, whose indexes and ground truth are committed to . Also, the requester places B as a deposit to cover her budget, which promises that a worker would get a reward of , if submitting an answer that meets a specified quality standard .

  2. Commit answers. Once the task is published, the workers can commit their answers (encrypted to the requester) in the task. To prevent copy-and-paste attacks, duplicated commitments are rejected. The contract moves to the next phase once distinct workers commit.

  3. Reveal answers. After workers commit their answers, these workers can start to reveal their answers in the form of ciphertexts encrypted to the requester. Note that the submission of answers explicitly contains two subphases, namely, committing and revealing, which is the crux to prevent the network adversary from taking advantage by adversarially scheduling the order of submissions.

  4. Evaluate answers. Eventually, the requester is supposed to instruct the blockchain to correctly pay for these encrypted answers, which ensures the critical fairness. To this end, the protocol leverages our novel notion of , so that the requester can efficiently prove to the contract that a certain answer should be rejected, if the worker does not meet the pre-specified quality standard . If an answer is out of the specified , the requester is allowed to use verifiable encryption to reveal that and reject payment.

Remark. captures the essence of smart contracts [Woo14] in reality, as it: (i) reflects the transparency of a Turing-complete smart contract, which is a stateful program handling pre-specified tasks publicly; (ii) captures a contract that can access the cryptocurrency ledger to honestly deal with conditional payments; (iii) models the network adversary who is consulted to schedule the delivery order of so-far-undelivered messages.

The protocol of HITs is among the requester , the workers and

Phase 1: Publish Task
  Requester :
    Upon receiving the parameters , , , , , B , of a HIT to publish:
      send to

Phase 2: Collect Answers
  Worker :
    Upon receiving from :
      get the answer , where
      send to
    Upon receiving from :
      if , send to

Phase 3: Evaluate Answers
  Requester :
    Upon receiving from :
      send to
      for each :
        decrypt each item in to get
        if s.t. : send to
        else if : send to

Fig. 5: The formal description of the decentralized HITs protocol .

V-C Instantiating cryptographic building blocks

For the sake of completeness, we hereafter give the constructions of cryptographic building blocks. Let be a cyclic group of prime order , where is a random generator of .

(Short ) verifiable encryption () is based on exponential ElGamal. The private key , the public key , the encryption is , and the decryption is , where is to brute-force the short plaintext to obtain ; if decryption fails to output , then is returned. In addition, to efficiently augment the above to be verifiable, we adopt a variant of Schnorr protocol [Sch89] with Fiat-Shamir transform in random oracle model. In detail,

  • . Run to obtain (or if ). Let . Compute , , , and . If , output ; else, output .

  • . Parse . If , compute and verify , output 1 if the verification passes and 0 otherwise; else , compute and verify , output 1 iff the verification passes and 0 otherwise.
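To make the verifiable decryption concrete, the following is a minimal sketch of exponential ElGamal with a Fiat-Shamir proof of correct decryption. It uses a Chaum-Pedersen style equality-of-discrete-logs proof, a standard Schnorr variant; the exact transcript format of the paper's construction may differ. The toy group parameters `P`, `Q`, `G`, the use of SHA-256 as the random oracle, and all function names are assumptions made for illustration. The group is far too small for any real use (the prototype works over BN-128).

```python
import hashlib
import secrets

# Toy Schnorr group (assumption, insecure): P = 2*Q + 1 with P, Q prime,
# and G generates the subgroup of prime order Q.
P, Q, G = 1019, 509, 4


def keygen():
    k = secrets.randbelow(Q - 1) + 1          # private key in [1, Q-1]
    return k, pow(G, k, P)                    # public key h = g^k


def encrypt(h, m):
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P   # (g^r, g^m * h^r)


def _challenge(*elems):
    # Fiat-Shamir challenge; SHA-256 stands in for the random oracle.
    data = "|".join(str(e) for e in elems).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove_dec(k, h, ct, options):
    """Decrypt ct and prove that g^m was derived with the key behind h."""
    c1, c2 = ct
    gm = (c2 * pow(c1, Q - k, P)) % P         # c2 / c1^k = g^m
    # Brute-force the short plaintext; None models a failed (out-of-range) decryption.
    m = next((a for a in options if pow(G, a, P) == gm), None)
    w = secrets.randbelow(Q)
    A, B = pow(G, w, P), pow(c1, w, P)
    e = _challenge(G, h, c1, c2, gm, A, B)
    z = (w + e * k) % Q
    return m, gm, (A, B, z)


def verify_dec(h, ct, gm, proof):
    """Check that log_g(h) equals log_c1(c2 / gm), i.e. gm is the true g^m."""
    c1, c2 = ct
    A, B, z = proof
    e = _challenge(G, h, c1, c2, gm, A, B)
    d = (c2 * pow(gm, P - 2, P)) % P          # c2 / gm, which should equal c1^k
    return (pow(G, z, P) == (A * pow(h, e, P)) % P
            and pow(c1, z, P) == (B * pow(d, e, P)) % P)
```

When the plaintext lies in the option range, a verifier recomputes `gm` from the claimed plaintext as `pow(G, m, P)`; when it does not, the prover reveals `gm` itself, mirroring the two output cases in the bullets above.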

Proof of quality of encrypted answer () is built by invoking the above construction in a black-box manner, due to our reduction from to in §V-A.

Commitment scheme is due to the efficient folklore construction [RO, fairswap]: (i) ; (ii) , where is Iverson bracket from a proposition to 1 (true) or 0 (false).
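The hash-based commitment and the two-subphase answer collection can be sketched as follows. This is a toy model under stated assumptions: SHA3-256 from Python's standard library stands in for the hash (the prototype uses keccak256, which differs in padding), and the class `AnswerCollector`, its methods and the `quota` parameter are hypothetical names, not the contract's actual interface.

```python
import hashlib


def commit(msg: bytes, key: bytes) -> bytes:
    # Commitment = H(msg || key); SHA3-256 is a stand-in hash (assumption).
    return hashlib.sha3_256(msg + key).digest()


def open_commitment(comm: bytes, msg: bytes, key: bytes) -> bool:
    # Plays the role of the Iverson bracket: True iff the opening matches.
    return commit(msg, key) == comm


class AnswerCollector:
    """Toy model of the commit and reveal subphases for `quota` workers."""

    def __init__(self, quota: int):
        self.quota = quota
        self.commitments = {}   # worker -> commitment
        self.revealed = {}      # worker -> ciphertext

    def commit_answer(self, worker: str, comm: bytes) -> bool:
        if len(self.commitments) >= self.quota:
            return False                                  # commit phase is over
        if worker in self.commitments or comm in self.commitments.values():
            return False                                  # reject duplicates
        self.commitments[worker] = comm
        return True

    def reveal_answer(self, worker: str, ciphertext: bytes, key: bytes) -> bool:
        if len(self.commitments) < self.quota:
            return False                                  # commit phase still open
        comm = self.commitments.get(worker)
        if comm is None or not open_commitment(comm, ciphertext, key):
            return False
        self.revealed[worker] = ciphertext
        return True
```

Because a ciphertext becomes visible only after all commitments are fixed, a party who observes someone else's submission in flight cannot copy it into its own commitment.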

V-D Security analysis

Theorem 1.

Conditioned on the hardness of DDH problem and static corruptions, the stand-alone instance of securely realizes in -hybrid, random oracle model.

Proof. (sketch) Let denote the set of corrupted parties controlled by the adversary , and let denote the set of the remaining honest parties. For any P.P.T. adversary in the real world, we can sketch a P.P.T. simulator in the ideal world to interact with the ideal functionality and corrupted parties, such that can emulate the actions of honest parties and the contract . The simulator proceeds as follows:

  • Publish Task (Phase 1). If , considering that the corrupted sends the message to in the real world, can trivially simulate that with interacting with . If , when the honest sends the message to , is informed and thus allows to simulate the phase of publish task (in the real world) for that .

  • Collect Answers (Phase 2). In the real world, the P.P.T. adversary might: (i) corrupt a set of parties, possibly including the requester and a subset of the workers, and (ii) be consulted to reorder the so-far-undelivered messages sent to (till the next clock).

    The basic strategy to emulate is as follows: invokes the adversary to obtain how is re-ordering the messages (sent from workers); let represent the set of workers whose messages are scheduled as the first to deliver; then delays all messages that are not sent from the workers in . Then, internally simulates the ciphertexts sent via messages to open commitments. If , the ciphertexts can be simulated as they are indistinguishable from the uniform distribution over the ciphertext space; if , is informed about all answer submissions sent from the workers, and thus can internally simulate the submissions of the workers in the real world.

    Moreover, if corrupts a worker whose message is scheduled in the first to deliver but does not send any message to open the commitment, the simulator can simulate that, since it can let the corrupted worker send a message containing with . In addition, it is trivial to see that can internally simulate the parties as well as , when the adversary corrupts a worker to submit a duplicated commitment.

  • Evaluate Answers (Phase 3). The simulation becomes clear when considering the security requirements of the commitment scheme, , and . If the requester , the simulator invokes to obtain all and/or messages sent to , and then simulates the interactions. If the requester , whenever sends and/or messages to , is informed and hence is allowed to simulate the interactions between and in the real world.

VI Dragoon: Implementation & Evaluation

To demonstrate the feasibility of our protocol, we implement it to build Dragoon, and then use the system to launch a typical image annotation task for ImageNet [imagenet-details, RL10] atop Ethereum.

System overview. Dragoon consists of an on-chain part and an off-chain part: the on-chain smart contract is deployed on the Ethereum Ropsten network; the requester client and worker clients are implemented in Python 3.6. The off-chain clients run on a PC with Ubuntu 14.04 LTS, an Intel Xeon E3-1220V2 CPU and 16 GB of main memory.

ImageNet’s HIT task. We demonstrate our system through an ImageNet task [imagenet-details, RL10], which is specified as: each task is made of 106 binary questions, 100 out of which are non-gold-standard questions, while the remaining 6 questions are requester’s gold-standard challenges; 4 workers are allowed to participate; if a worker cannot correctly answer at least four golden standard questions, his submission will be rejected without being paid, otherwise he deserves to get the payment.
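The payment rule of this task can be written down as a small sketch. The task parameters (106 binary questions, 6 gold-standard challenges, a threshold of four correct gold answers) come from the description above; the function names and the gold-standard positions used in any example are hypothetical, and quality is assumed to be the number of correctly answered gold-standard questions.

```python
from typing import Dict, Sequence

NUM_QUESTIONS = 106   # 100 ordinary questions plus 6 gold-standard challenges
NUM_GOLD = 6
THRESHOLD = 4         # minimum number of correct gold-standard answers
OPTIONS = (0, 1)      # binary questions


def quality(answers: Sequence[int], gold: Dict[int, int]) -> int:
    """Count the gold-standard questions answered correctly."""
    return sum(1 for i, truth in gold.items() if answers[i] == truth)


def deserves_payment(answers: Sequence[int], gold: Dict[int, int],
                     threshold: int = THRESHOLD) -> bool:
    """A worker is paid only for an in-range submission of sufficient quality."""
    if len(answers) != NUM_QUESTIONS:
        return False
    if any(a not in OPTIONS for a in answers):
        return False          # out-of-range answers are rejected outright
    return quality(answers, gold) >= threshold
```

In the protocol itself the contract never evaluates this rule on plaintext answers; the requester instead proves to the contract that an encrypted submission falls short of it.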

Cryptographic modules. The hash function is instantiated by keccak256. We choose the cyclic group to be the subgroup of the BN-128 elliptic curve, over which all concrete public key primitives are instantiated.

Code availability. The code of our prototype is available at https://github.com/njit-bc/dragoon. An experimental instance is deployed on the Ethereum Ropsten network (https://ropsten.etherscan.io/address/0x5481b096c78c8e09c1bfbf694e934637f7d66698).

Fig. 6: The schematic diagram of Dragoon at a high level.

Implementation details. Many non-trivial on- and off-chain optimizations are particularly made for practicability.

Off-chain ends. The requester end wraps: (i) an Ethereum node to interact with the blockchain, e.g. publish tasks, download workers’ submissions, etc.; (ii) the prover of verifiable encryption to generate the proofs needed to instruct the contract to reward workers; (iii) a Swarm API to publish the detailed questions of each crowdsourcing task. Swarm [Swarm] is an off-chain storage network, where the questions of a HIT are stored; in addition, to ensure the integrity of HIT questions, the digest of the questions is committed in the contract, which significantly reduces the on-chain cost without violating security.

The worker client wraps an Ethereum node to interact with the blockchain to read tasks and submit answers, and also incorporates a Swarm client to download task questions.

On-chain optimizations. We carefully perform a few non-trivial system-level optimizations to lighten the task contract: (i) we implement all public key schemes over the subgroup of BN-128 [bncurve], since we can use some precompiled contracts in Ethereum to do algebraic operations there cheaply [EIP1108]; (ii) it is expensive to store ciphertexts in the contract as internal variables, so we make the contract store their 256-bit hashes instead and let the actual ciphertexts be included in the chain as emitted event logs [Woo14].

Evaluations. We conduct extensive experiments to measure the concrete performance, and discuss the feasibility of the system from both the on-chain side and the off-chain side.

Off-chain costs. First, Dragoon enables the requester to manage only one private-public key pair throughout all her tasks, because all protocol scripts are simulatable without the secret key and therefore leak nothing relevant. More importantly, the off-chain cost of generating the relevant cryptographic proofs is significantly reduced by removing unnecessary generality.

| Scheme      | Statement to Prove | Time  | Peak Memory | Proof Size |
| Ours        |                    | 3 ms  | 53 MB       | 96 B       |
| Ours        |                    | 10 ms | 53 MB       | 288 B      |
| Generic ZKP |                    | 37 s  | 3.9 GB      | 288 B      |
| Generic ZKP |                    | 112 s | 10.3 GB     | 288 B      |

* Throughout our evaluations, generic zk-proofs are instantiated by zk-SNARK, which is, to our knowledge, the only generic zk-proof feasibly supported by existing blockchains.

TABLE I: Off-chain Proving Cost of and due to our concrete constructions and generic zk-proofs respectively.

Table I shows that the requester would suffer a heavy off-chain burden when generating generic zk-proofs. In contrast, our concrete constructions remove this bottleneck and make decentralized HITs practical. First, the requester can generate a proof to reject a worker’s submission within only a few milliseconds, whereas doing so costs nearly 2 minutes with a generic zk-proof. Second, the concretely efficient constructions also save memory. For example, with a generic zk-proof, rejecting a submission requires a peak memory usage of about 10 GB, which is reduced to only 53 MB by the concrete constructions.

On-chain costs. We measure the critical on-chain performance from many angles including the cost of verifying zk-proofs and the on-chain gas usage of the whole protocol.

First, we compare the verifying cost of concrete constructions and generic zk-proofs for and (six golden standards) in Table II. The concrete proof is fast, even compared to generic zk-proof (SNARK) known for efficient verification. For example, in the case of ImageNet task, only 6ms is needed to verify each concrete proof.

| Scheme      | Statement to Verify | Verifying Time |
| Ours        |                     | 1 ms           |
| Ours        |                     | 2 ms           |
| Generic ZKP |                     | 11 ms          |
| Generic ZKP |                     | 17 ms          |

The evaluations for generic ZKP (SNARK) are performed due to constructions from 2048-bit RSA-OAEP over instead of ElGamal o