A Misreport- and Collusion-Proof Crowdsourcing Mechanism without Quality Verification

03/26/2020 ∙ by Kun Li, et al. ∙ Shandong University, Beijing Normal University, Indiana University

Quality control plays a critical role in crowdsourcing. State-of-the-art approaches are not suitable for large-scale crowdsourcing applications, because verifying task quality or selecting professional workers one by one is a long haul for the requestor. In this paper, we propose a misreport- and collusion-proof crowdsourcing mechanism that guides workers to truthfully report the quality of their submitted tasks without collusion, so that workers have to act the way the requestor would like. In detail, the mechanism proposed by the requestor leaves no room for workers to profit from quality misreport or collusion, and thus the quality can be controlled without any verification. Extensive simulation results verify the effectiveness of the proposed mechanism. Finally, the importance and originality of our work lie in that it reveals some interesting and even counterintuitive findings: 1) a high-quality worker may pretend to be a low-quality one; 2) a rise in the task quality of high-quality workers may not increase the utility of the requestor; 3) the utility of the requestor may not improve as the number of workers increases. These findings can inform forward-looking and strategic planning solutions for crowdsourcing.


1 Introduction

The openness of the Internet enables crowdsourcing to gather geographically dispersed human resources for accomplishing complex tasks that are easy for human beings but difficult for machines. However, precisely because of this openness, anyone can apply to participate in crowdsourcing, which may bring in unprofessional crowdsourcees (workers) with low-quality contributions. Thus, there is a pressing need for quality control in crowdsourcing. State-of-the-art quality control in crowdsourcing can be categorized into two kinds: task-based [1, 2, 3, 4] and worker-based [5, 6, 7, 8, 9]. The former is a direct approach that evaluates or monitors task quality with various methods and tools. The latter is an indirect approach that selects qualified workers to guarantee the quality of submissions.

Existing quality control is not applicable to large-scale crowdsourcing applications, such as urban traffic monitoring and air-quality sensing, because verifying task quality or selecting professional workers one by one is a long haul for the crowdsourcer (requestor). In this scenario, it is convenient to require workers to report the quality of their tasks and to have the requestor pay them accordingly. Such a naive approach obviously leads to two issues: 1) misreport. A worker may dishonestly report his low-quality task as a high-quality one. 2) collusion. High-quality workers (a high-quality worker is one who submits tasks of high quality; accordingly, a low-quality worker is one whose submitted tasks are of low quality) can save cost by recruiting low-quality ones to work for them, through which the low-quality workers can also gain extra income.

The above two issues make it seem ridiculous for the requestor to pay based on the quality reported by the workers themselves. However, because it requires no quality verification, this approach is attractive for large-scale crowdsourcing applications. Hence, to make this seemingly ridiculous approach feasible, we propose a misreport- and collusion-proof crowdsourcing mechanism without quality verification in this paper. The aim of our scheme is to guide workers to truthfully report the quality of submitted tasks without collusion by leveraging pricing and task allocation. In other words, driven by market power, workers have to try their best to serve the requestor honestly, so that the quality of tasks is naturally guaranteed.

However, it is challenging to realize our aim, since a worker's capability to complete the task is private information hidden from the requestor. This information asymmetry makes the requestor's optimal strategy uncertain, which in turn obscures the optimal strategy of any worker, so that Pareto optimality cannot be achieved in crowdsourcing. To tackle this challenge, we resort to the mechanism design game theory [10], which empowers the requestor to dominate the game with workers through designing a mechanism (game rule), so that workers have to act the way the requestor would like. In detail, the mechanism proposed by the requestor leaves no room for workers to profit from quality misreport or collusion, and thus the quality can be controlled without any verification.

To the best of our knowledge, this is the first work that simultaneously guards against quality misreport and collusion among workers in crowdsourcing. The contributions in our paper are summarized as follows:

  • A crowdsourcing framework without quality verification is proposed, where the mechanism design game theory is leveraged to guide workers to behave honestly.

  • A special crowdsourcing mechanism for the two-worker model is designed, which includes three kinds of constraints: the participation-incentive constraint, the incentive compatibility constraint and the collusion-proof constraint. In addition, the property of the cost function for any worker is deduced.

  • A general crowdsourcing mechanism for multiple workers is extended from the basic two-worker model, which has a particularly-designed collusion-proof constraint for the optimal collusion scheme in the multiple-worker scenario.

  • Extensive simulation results verify the effectiveness of the proposed misreport- and collusion-proof crowdsourcing mechanism.

Finally, the importance and originality of our work lie in that it reveals some interesting and even counterintuitive findings, providing fresh insights into understanding the complexity of crowdsourcing. These findings can boost forward looking and strategic planning solutions, which are summarized as follows:

  • A high-quality worker may pretend to be a low-quality one. The reason behind this counterintuitive fact is that when the number of low-quality workers is large, or when the task quality of high-quality workers rises, the profit that low-quality workers can gain from misreport increases. Thus, the requestor has to use a high payment to drive low-quality workers to behave honestly, which in turn creates the motivation for high-quality workers to lie.

  • The rise of task quality from high-quality workers may not result in the increased utility of the requestor. When the task quality of low-quality workers remains unchanged, an increase in the task quality of high-quality workers implies that the profits of collusion and misreport go up, leading to more incentives for malicious behaviors. In this case, the requestor has to pay more to reduce the possibility of misreport and collusion, which may reduce her utility. (In this paper, we use “she” and “he” to indicate the requestor and the worker, respectively.)

  • The utility of the requestor may not get improved with the increasing number of workers. Specifically, more workers mean more chances of collusion and misreport, leading to high cost for the requestor to prevent the occurrence of these problems, which may lower her utility.

The rest of this paper proceeds as follows. In Section 2, we summarize the related work on quality control in crowdsourcing. An overview of our proposed quality control framework is presented in Section 3, which is elaborated in Section 4 for the basic two-worker model and further extended in Section 5 for the general multiple-worker scenario. In Section 6, we conduct substantial simulation experiments to evaluate the performance of our proposed mechanism and reveal several interesting findings. Finally, we conclude the whole paper in Section 7.

2 Related work

In recent years, as crowdsourcing has become widely used in various fields, quality control has become a focus of research in the crowdsourcing area. Existing research on quality control can be generally classified into two types: task-based and worker-based.

Task-based quality control in crowdsourcing assesses the quality of the tasks submitted by workers through various indicators. One of the most common methods is to use gold standard data [1] to measure the quality of tasks. By comparing with standard data, it is easy to distinguish unqualified tasks from qualified ones. In photo crowdsourcing, Wu et al. [2] found it challenging to use limited resources to obtain photos covering as much of the target area as possible, so they adopted a scheme named image quality assessment (IQA) to address this challenge, which was deployed on mobile devices to filter out low-quality photos before sending metadata to the server, thereby achieving quality control. When investigating crowdsourcing-based spammers, Xu et al. [3] revealed that quality control was conducted by strictly setting the standard for checking whether the work done by a spammer qualified for payment, which enforced a low payment rate of 22.6% so as to stimulate spammers to improve the quality of their spam posts. However, in most crowdsourcing scenarios, it is impractical or unnecessary to prepare gold standard data in advance, so researchers proposed some machine-learning based methods to assess task quality. In [4], Yuan et al. tried to improve data quality in the process of crowdsourcing-based labeled data collection by extending a classic probabilistic model named GLAD [11], which systematically encoded different types of side information, such as worker, item and context information.

Worker-based quality control is often achieved by selecting an appropriate set of workers with the help of well-designed models or algorithms based on the workers’ ability or reputation. In [5], in order to select high-quality workers, Qiu et al. designed a dynamic contract for each worker based on the worker’s performance and objective, so as to elicit high-quality submissions and prevent malicious behaviors. Hu et al. [6] used an economics-based philosophy to improve workers’ quality in crowdsourcing. Taking advantage of their proposed incentive algorithms based on the sequential zero-determinant strategy, the requestor could use market power to stimulate workers to submit high-quality results. Han et al. [7] presented a new crowdsourcing system to provide annotations for web information extraction, which could collect a wide set of behavioral features and predict the annotation quality of each worker for annotating web page structure. They collected a set of workers’ behavioral features and discovered the relationship between the crowdsourcing quality and these features. In vehicle-based crowdsourcing [8], the relationship between spatiotemporal coverage and the vehicle trajectory was studied to design a new strategy for worker recruitment, which guaranteed the crowdsourcing quality by employing the best set of participants meeting the application requirements. In [9], Tarable et al. proposed a “maximum a-posteriori” decision rule to help the requestor make decisions in reputation-based task allocation for crowdsourcing, which worked well even in the case of inaccurate reputation information.

In summary, all the above quality-control schemes rely on additional assessment indicators or algorithms to select tasks or workers one by one, which is tedious and inefficient and makes them inapplicable to large-scale crowdsourcing scenarios. Therefore, we propose a simple but effective quality control model for crowdsourcing, which also eliminates collusion at the same time.

3 Overview of our framework

According to the analysis in Section 1, achieving quality control without verification will lead to the problems of quality misreport and collusion. Both problems originate from the fact that, in the absence of quality verification, the worker’s capability to complete tasks is his private information. Without this information, the optimal strategy of the requestor is uncertain, and so is that of any worker. To optimize the incomplete-information game between the requestor and workers, we take advantage of the mechanism design game theory [10].

The mechanism design game theory is an efficient vehicle for solving games with private information. It introduces two kinds of players, namely “the agent” and “the principal”. The agent has private information called “type”, while the principal has none and is unable to access the private information of the agents. To achieve Pareto optimality, the mechanism design game theory allows the principal to dominate the game with the agents through designing a mechanism (game rule), which leads the agents to act the way the principal would like. Specifically, the principal asks the agents to report their types (called a direct mechanism) or strategies made based on the private information (called an indirect mechanism). (According to the revelation principle, for every Bayesian Nash equilibrium, there exists a Bayesian game with the same equilibrium outcome but in which players truthfully report types, i.e., both the direct mechanism and the indirect mechanism produce the same results.) In this way, the principal only needs to consider the information provided by the agents to develop the optimal strategy, which significantly reduces the complexity. Then the principal and the agents act in light of the mechanism the principal designed and gain the corresponding payoffs. Whether an agent reports his true type, or a strategy based on his true private information, depends on whether the mechanism formulated by the principal satisfies the incentive compatibility (IC) constraint. The IC constraint guarantees that the agents are motivated to behave in a manner consistent with the principal’s optimal strategy.
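The IC condition described above can be illustrated with a minimal sketch. It assumes a direct mechanism summarized by a hypothetical function `utility(true_type, reported_type)` that returns an agent's expected payoff; the type labels and payoff numbers are invented for illustration and are not taken from the paper.

```python
from typing import Callable, Hashable, Sequence


def is_incentive_compatible(
    types: Sequence[Hashable],
    utility: Callable[[Hashable, Hashable], float],
) -> bool:
    """Return True if no type gains by reporting a different type."""
    return all(
        utility(true, true) >= utility(true, report)
        for true in types
        for report in types
    )


# Hypothetical expected payoffs, indexed as (true type, reported type).
honest_pays = {("high", "high"): 3.0, ("high", "low"): 1.0,
               ("low", "low"): 2.0, ("low", "high"): 0.5}
# Same table, except that a low type now profits from claiming "high".
lying_pays = {**honest_pays, ("low", "high"): 4.0}

print(is_incentive_compatible(["high", "low"], lambda t, r: honest_pays[(t, r)]))
print(is_incentive_compatible(["high", "low"], lambda t, r: lying_pays[(t, r)]))
```

The first table satisfies the condition and the second does not, because a low type would earn more by reporting "high".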

In our scenario, since workers have private information, they are agents whose types are their working capabilities. Correspondingly, the requestor is the principal who can design a mechanism with the IC constraint to force any worker to report the task quality based on his working capability truthfully, thus solving the problem of quality misreport. Additionally, the collusion-proof constraint is also included in the proposed mechanism to address the issue that multiple malicious workers collude to deceive the requestor. In this paper, we assume that malicious workers are rational and intelligent.

Fig. 1: Overview of our framework.

The overview of our framework is shown in Fig.1, which includes the following steps:

  1. The requestor publishes the tasks and the mechanism online.

  2. If a candidate worker accepts the tasks and the mechanism, he replies to the requestor with his task quality. Otherwise, he simply ignores the messages from the requestor.

  3. Once receiving feedback from workers, the requestor notifies them of the task allocation and pricing results.

  4. If the malicious workers decide to collude, they will negotiate the collusion process with each other. However, due to our collusion-proof mechanism, their optimal strategy is not to collude.

  5. After finishing the tasks, each worker submits them to the requestor according to the requirements of the mechanism.

  6. Accordingly, the requestor pays each worker in light of the quantity and quality of the submitted tasks.
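The steps above can be sketched as a single round of interaction. This is only an illustration under stated assumptions: the `Worker` class, the `run_round` and `flat_mechanism` functions, and the payment rule (task count times reported quality plus extra reward) are hypothetical names and choices introduced here, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Plan = Dict[str, Tuple[int, float]]  # worker name -> (task count, extra reward)


@dataclass
class Worker:
    name: str
    quality: float        # private type, revealed only through the report
    accepts: bool = True  # whether the worker accepts the published mechanism


def run_round(workers: List[Worker],
              mechanism: Callable[[Dict[str, float]], Plan]) -> Dict[str, float]:
    # Steps 1-2: the mechanism is public; accepting workers report their quality.
    reports = {w.name: w.quality for w in workers if w.accepts}
    # Step 3: the requestor announces task allocation and pricing.
    plan = mechanism(reports)
    # Step 4 is skipped: under a collusion-proof mechanism, not colluding is optimal.
    # Steps 5-6: workers submit; payment depends on quantity and reported quality.
    # ASSUMPTION: payment = tasks * reported quality + extra reward.
    return {name: tasks * reports[name] + reward
            for name, (tasks, reward) in plan.items()}


def flat_mechanism(reports: Dict[str, float]) -> Plan:
    # Hypothetical rule: two tasks and a unit reward for every participant.
    return {name: (2, 1.0) for name in reports}


payments = run_round([Worker("a", 3.0), Worker("b", 1.0, accepts=False)],
                     flat_mechanism)
print(payments)
```

Worker "b" declines the mechanism and therefore receives neither tasks nor payment.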

We will detail our mechanism for different scenarios in the following two sections.

4 Mechanism for the two-worker model

In this paper, we conduct the analysis from simple to complex. We first consider the scenario where there is one requestor and two workers in crowdsourcing. In our model, affected by task difficulty, equipment characteristics, personal capability and environment [12], any worker completes tasks with one of two quality levels, representing high and low task quality respectively. Since it is difficult or costly for workers to change task quality by adjusting their own capabilities or other factors such as the environment in the crowdsourcing process, the quality of a worker is fixed. That is, a high-quality worker cannot complete low-quality tasks, and a low-quality worker is also unable to submit tasks with high quality. Each worker submits high-quality tasks with a given probability and low-quality tasks with the complementary probability. We also assume a joint probability distribution over the task qualities submitted by the two workers.

To take advantage of market power to deter misreport and collusion, the requestor regulates the behavior of workers through pricing and task allocation, which are determined according to the combination of task qualities claimed by the two workers. There are three cases: both workers claim high quality, one claims high quality while the other claims low quality, and both claim low quality. In each case, every worker obtains a payment from the requestor and incurs a working cost. The payment depends on the number of tasks allocated to each worker, the unit payoff for each task, which is a function of task quality, and the extra reward used to incentivize workers to participate in crowdsourcing honestly. The working cost is given by a cost function of the task quality and the number of tasks, which is common knowledge.

Based on her prior beliefs, the requestor proposes a crowdsourcing mechanism to the workers, which maps any combination of reported task qualities to a set of task numbers and extra rewards. The aim of the mechanism is to maximize the expected utility of the requestor. That is,

(1)

In (1), the requestor's payoff in each case is what she obtains from the submissions of the two workers in that case.

Additionally, to realize misreport- and collusion-proof crowdsourcing, the mechanism should include the following constraints.

4.1 Participation-incentive and IC constraints

In order to improve the completion rate and quality of tasks, the mechanism should make workers willing to participate in the crowdsourcing and report the task quality truthfully. That is, the participation-incentive and IC constraints should be satisfied by the mechanism. In the following, we first formulate the participation-incentive constraint and then the IC one, using a simplified form of the payoff function. (Our methodology can be applied to other forms of the payoff function; the simplification here is only meant to make our design easy to understand.)

The main idea of the participation-incentive constraint is to guarantee that a worker suffers no loss when he participates in crowdsourcing. Specifically, if a worker is of high quality, the other worker can be either high-quality or low-quality, corresponding to the case where both workers are high-quality and the case where the two workers differ in quality. Hence, the expected utility of a high-quality worker consists of two parts: his utility in each of these two cases, weighted by the probability of that case. In order to motivate workers to participate in crowdsourcing, it is necessary to guarantee that their expected utility is no less than 0, that is,

(2)

Similarly, for a low-quality worker, the participation-incentive constraint is

(3)

Next, we introduce the IC constraint. As mentioned above, the flexibility in the task number and extra reward determined by the requestor can be used to deter any worker from lying. If a worker does not report his task quality truthfully, the mechanism is designed so that he cannot obtain the utility he expects. That is to say, the mechanism makes the utility of an honest worker no less than that of a worker who misrepresents task quality. We take a high-quality worker as an example. When he reports truthfully, his expected utility is a linear combination of his utility in the case where both workers are high-quality and his utility in the case where the two workers differ in quality. When he lies about task quality, the former case appears to the requestor as the mixed-quality case, while the latter appears as the case where both workers are low-quality. Hence, the IC constraint of a high-quality worker can be written as

(4)

Similarly, for a low-quality worker, the IC constraint is

(5)
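The participation-incentive and IC conditions in (2) to (5) can be checked numerically with a short sketch. Several modelling choices below are assumptions made for illustration rather than the paper's exact formulation: the case labels `"HH"`, `"HL"` and `"LL"`, a payment of the form tasks times claimed quality plus extra reward, a cost evaluated at the worker's true quality, a single probability `p_other_high` that the other worker is high-quality, and the other worker always reporting truthfully.

```python
from typing import Callable, Dict, Tuple

Mechanism = Dict[str, Tuple[int, float]]  # case -> (tasks per worker, extra reward)
Cost = Callable[[float, int], float]      # (quality, tasks) -> working cost


def expected_utility(true_q: float, claimed_q: float, p_other_high: float,
                     case_other_high: str, case_other_low: str,
                     mech: Mechanism, cost: Cost) -> float:
    def utility(case: str) -> float:
        tasks, reward = mech[case]
        # ASSUMPTION: paid at the claimed quality, cost incurred at the true quality.
        return tasks * claimed_q + reward - cost(true_q, tasks)

    return (p_other_high * utility(case_other_high)
            + (1 - p_other_high) * utility(case_other_low))


def check_constraints(q_high: float, q_low: float, p_other_high: float,
                      mech: Mechanism, cost: Cost) -> Dict[str, bool]:
    eu = expected_utility
    high_honest = eu(q_high, q_high, p_other_high, "HH", "HL", mech, cost)
    # A high-quality worker claiming "low" shifts each case down by one.
    high_lying = eu(q_high, q_low, p_other_high, "HL", "LL", mech, cost)
    low_honest = eu(q_low, q_low, p_other_high, "HL", "LL", mech, cost)
    # A low-quality worker claiming "high" shifts each case up by one.
    low_lying = eu(q_low, q_high, p_other_high, "HH", "HL", mech, cost)
    return {
        "participation_high": high_honest >= 0,
        "participation_low": low_honest >= 0,
        "ic_high": high_honest >= high_lying,
        "ic_low": low_honest >= low_lying,
    }


# Hypothetical naive mechanism: same allocation and no extra reward in every case.
naive = {"HH": (2, 0.0), "HL": (2, 0.0), "LL": (2, 0.0)}
result = check_constraints(2.0, 1.0, 0.5, naive, lambda q, n: 0.5 * q * n)
print(result)
```

With this naive pay-by-report rule, both participation conditions hold but the low-quality worker's IC condition fails, which is the misreport problem the constraints are meant to rule out.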

It is worth noting that the IC constraint for the high-quality worker presented in (4) is indispensable, even though it seems very unlikely that a high-quality worker would pretend to be low-quality. In fact, as a rational and utility-driven player, once the high-quality worker finds it more profitable to appear low-quality, he will definitely lie about his quality level to maximize his utility. This is highly likely to happen when the IC constraint for the high-quality worker is not guaranteed, because in this case the requestor concentrates on preventing the low-quality worker from lying, that is, on making honest low-quality workers gain more payoff than dishonest ones. When the probability of high-quality submission is small, in order to prevent the low-quality worker from lying, the mechanism designed by the requestor will inevitably assign a greater number of tasks and a larger extra reward to the low-quality worker, which in turn results in an unexpected situation where the high-quality worker gets more profit by lying. For example, for given parameter values, the optimal task numbers and extra rewards can be expressed through linear regression. (Because every set of parameter values corresponds to an optimal set of mechanism variables maximizing the utility of the requestor under the corresponding constraints, we directly fix the parameter values here for simplicity. In addition, the optimal set here is solved with the participation-incentive constraint, the IC constraint for the low-quality worker and the collusion-proof constraint, which we discuss later, but without the IC constraint for the high-quality worker, in order to show why it is necessary.) Substituting these expressions into (4), we find a condition under which the high-quality worker will choose to lie, which is obviously undesirable in practice. Thus, to eliminate this problem, we have to include the IC constraint for the high-quality worker as shown in (4).

4.2 Collusion-proof constraint

Generally, a common purpose of the two workers is to maximize their total expected utility, i.e.,

(6)

When the two workers are both high-quality, they obviously have no need to behave maliciously. While if both are low-quality, the IC constraint makes no room for them to pretend to be high-quality so that the requestor has to pay based on the low quality no matter who completes the tasks. In other words, there is no chance to collude when they are both low-quality workers. Hence, the collusion will only occur when the task qualities of the two workers are different, where the low-quality worker completes a part of or even the whole task which is supposed to be finished by the high-quality worker, so that the low-quality worker can earn more reward while the high-quality one is able to reduce the working cost. Such a malicious behavior cannot be prevented by the IC constraint, pressing a need to design the collusion-proof constraint for the mechanism , which is detailed as follows.

Without collusion, the total utility of the workers is the sum of their individual utilities in the mixed-quality case. If the high-quality worker colludes with the low-quality one by assigning some of his tasks to the low-quality worker, their total utility changes accordingly. In order to prevent collusion, our mechanism should ensure that their total utility without collusion is never less than that with collusion. In other words, the following inequality should be satisfied

(7)

For ease of calculation and analysis, we use a simplified form of the cost. Since it is the cost required for a worker to complete a given number of tasks, the cost function should be non-negative and non-decreasing in the number of tasks.

Proposition 1.

The mechanism entails that if the cost function is monotonically increasing in the number of tasks, it is also concave.

Proof.

According to (7),

(8)

If the cost function is monotonically increasing, then

(9)

Obviously, the cost function is concave. ∎

Under the assumption that the cost function is monotonically increasing, the larger the number of collusive tasks, the smaller the workers’ total cost. Therefore, the workers’ optimal collusion strategy is that all the tasks of the high-quality worker are completed by the low-quality one. In order to avoid collusion, our mechanism should satisfy the following constraint

(10)

(10) guarantees that even though the malicious workers adopt the optimal collusion strategy, the total expected utility of colluded workers cannot be larger than that when they behave honestly. Thus, collusion-proof is realized in crowdsourcing.
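The collusion-proof condition can be explored with a small sketch. It rests on assumptions introduced here for illustration: total payment is taken to be unaffected by collusion (any bribe is a transfer between the colluders), so only total working cost matters, and the two cost functions are hypothetical. The check covers every possible number of transferred tasks, which includes the full transfer singled out in (10).

```python
from typing import Callable

Cost = Callable[[float, int], float]  # (quality, tasks) -> working cost


def colluded_cost(n: int, k: int, q_high: float, q_low: float, cost: Cost) -> float:
    """Total cost when the high-quality worker hands k of his n tasks over."""
    return cost(q_high, n - k) + cost(q_low, n + k)


def is_collusion_proof(n: int, q_high: float, q_low: float, cost: Cost) -> bool:
    """True if no transfer of k = 1..n tasks lowers the workers' total cost."""
    honest = colluded_cost(n, 0, q_high, q_low, cost)
    return all(honest <= colluded_cost(n, k, q_high, q_low, cost)
               for k in range(1, n + 1))


def linear_cost(q: float, n: int) -> float:
    # Hypothetical cost: proportional to quality and to the number of tasks.
    return q * n


def quadratic_cost(q: float, n: int) -> float:
    # Hypothetical cost: marginal cost rises with the number of tasks.
    return q * n ** 2


print(is_collusion_proof(3, 2.0, 1.0, linear_cost))
print(is_collusion_proof(3, 1.2, 1.0, quadratic_cost))
print(is_collusion_proof(10, 1.2, 1.0, quadratic_cost))
```

Under the linear cost every transfer saves cost, so collusion always pays. Under the quadratic cost the outcome depends on the allocated task number, which suggests how task allocation can serve as a lever against collusion.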

It should be noted that the assumption of a monotonically increasing cost function is consistent with most practical scenarios, where the cost of workers grows as the number of tasks increases.

5 Mechanism for the multiple-worker model

In this section, we discuss the scenario with more than two workers, which is the general case and can be derived from the above optimization process.

In the general model, there is one requestor and multiple workers, each with his own task quality. Each case is indexed by the number of high-quality workers among all the workers. The probability of each case follows from the probability that a worker is high-quality together with the number of combinations of workers. The mechanism proposed by the requestor maps any quality distribution to a pair consisting of the task number and the extra reward in each case. In each case, a worker incurs the corresponding working cost and earns the corresponding payment from the requestor. Similar to Section 4, the aim of the mechanism is to maximize the requestor’s expected utility. That is,

(11)

In (11), the requestor's payoff in each case is obtained from the submissions of all workers in that case, and the payment term is the payment to each worker in that case.
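The case probabilities that weight the expected utility can be sketched as follows. The sketch assumes that workers' qualities are independent and that each worker is high-quality with the same probability, which gives a binomial distribution over the number of high-quality workers; the function name `case_probabilities` and its parameters are introduced here for illustration.

```python
from math import comb
from typing import List


def case_probabilities(num_workers: int, p_high: float) -> List[float]:
    """Probability of exactly k high-quality workers, for k = 0..num_workers.

    ASSUMPTION: qualities are independent and identically distributed.
    """
    return [
        comb(num_workers, k) * p_high ** k * (1 - p_high) ** (num_workers - k)
        for k in range(num_workers + 1)
    ]


probs = case_probabilities(2, 0.5)
print(probs)
```

For two workers and probability 0.5, the three entries correspond to zero, one and two high-quality workers.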

To realize misreport- and collusion-proof crowdsourcing, the mechanism should meet the IC, the participation-incentive and the collusion-proof constraints, which are described in the following.

For any high-quality worker, his expected utility depends on the number of other high-quality workers in crowdsourcing. In detail, each case with a given total number of high-quality workers, including himself, occurs with a certain probability, and his utility in that case is determined by the corresponding task number and extra reward. However, when the high-quality worker feigns a low-quality one, the requestor believes that there is one fewer high-quality worker, and hence the task number and extra reward assigned to this worker are those of the case with one fewer high-quality worker. His expected utility changes accordingly. Under the same simplification as in Section 4, the IC and the participation-incentive constraints can be formulated by (12) and (13) as follows, implying that the expected utility of a high-quality worker who participates honestly is no less than that when he misreports the task quality to the requestor or does not join the crowdsourcing at all.

(12)
(13)

Similarly, for a low-quality worker, his IC and participation-incentive constraints are respectively written as

(14)
(15)

Next, we analyze the issue of collusion. As in the case of two workers, collusion does not occur when the qualities of all workers are the same, so we only consider collusion in the cases where the workers’ task qualities differ. Taking three workers as an example, collusion may occur in two cases: the case with two high-quality workers and one low-quality worker, and the case with one high-quality worker and two low-quality workers. (Neither case specifies which particular worker is high-quality or low-quality; each represents all scenarios with that composition.) In the first case, the total utility of all workers without collusion is the sum of their individual utilities. Assume that the first two workers are high-quality while the third is low-quality; the two high-quality workers then incur the same cost, and the low-quality worker incurs his own cost. When each high-quality worker asks the low-quality worker to complete some of his tasks, the cost of each high-quality worker becomes the cost of his remaining tasks plus a collusion expense, and the cost of the low-quality worker changes accordingly. (The collusion expense is the reward that a high-quality worker pays to the low-quality one to facilitate collusion, which includes the low-quality worker’s cost of completing the collusion tasks and the collusion bribe.) By comparing the total utility before and after collusion, the collusion-proof constraint can be written as

(16)

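Likewise, in the case with one high-quality worker, if there is no collusion, the high-quality worker incurs his own cost, while the two low-quality workers incur the same cost. When the high-quality worker asks the two low-quality workers to complete some of his tasks, which together cannot exceed his own allocation, his cost becomes the cost of his remaining tasks plus a collusion expense, and the costs of the two low-quality workers change accordingly. The collusion-proof constraint can be written as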

(17)

(16) and (17) guarantee that the total utility of all workers when they collude will not be greater than that when they behave honestly in the above two cases, so that the workers have no motivation to collude.

Under the same assumption as in Section 4, when the cost function is monotonically increasing, its concavity can be deduced with a method similar to the one used above. In the case with two high-quality workers, when collusion occurs, the total cost of the workers (i.e., the right part of the inequality) is minimized when both high-quality workers hand over all of their tasks, because the cost saved by collusion is then the greatest. In the case with one high-quality worker, the total cost is likewise minimized when the high-quality worker hands over all of his tasks. According to the nature of the concave function, the total cost in this case is minimized when these tasks are split equally between the two low-quality workers. Therefore, (16) and (17) can be transformed into

(18)
(19)

which implies that the maximum utility of any worker that can be achieved by collusion is less than that without collusion. At this point, workers will give up collusion.

In the general case, collusion can occur in every case where both high-quality and low-quality workers are present. In each such case, the number of high-quality workers is given by the case index, and the total utility of all workers before collusion is

(20)

Based on the analysis above, we keep the same simplification and assume that the cost function is monotonically increasing and concave. When the workers collude, their optimal strategy is that all tasks of the high-quality workers are equally assigned to the low-quality workers, due to the nature of the concave function. Thus, the total utility of all the malicious workers after using the optimal collusion strategy is

(21)

Based on (20) and (21), the collusion-proof constraint for this case is

(22)

which implies that even though the malicious workers adopt the optimal collusion strategy, their total utility is still not larger than that when they do not collude. Thus, collusion-proofness can be achieved in the multiple-worker model.

6 Performance evaluation

In this section, we analyze the impacts of some key parameters on the performance of our proposed crowdsourcing mechanism through extensive simulations.

6.1 Two workers

We first analyze the situation of two workers presented in Section 4. Basically, we utilize a fixed set of functions to calculate (1) to (5) and (10). (We also tested other function settings, which gave similar results and are omitted for brevity.) Since the task numbers are the numbers of tasks assigned to each worker in the three different cases, we restrict them to positive integers.

To begin with, we investigate how the requestor’s optimal utility changes in Fig. 2. Fig. 2(a) presents the optimal utility for different probabilities of high-quality submission and different high task qualities, given a fixed low task quality; Fig. 2(b) reports the optimal utility for different probabilities of high-quality submission and different low task qualities when the high task quality is fixed. From these two figures, one can see that when the low or the high task quality is fixed, the requestor’s optimal utility increases as the probability of high-quality submission increases. The reason is that the greater this probability is, the more high-quality tasks the requestor receives, which increases her utility. Besides, when one of the two quality values is fixed, the larger the difference between high and low quality, the larger the increment of the requestor’s optimal utility brought by a change in the probability. That is, as the quality difference grows, the probability has more influence on the requestor’s utility. The underlying reason is that a larger quality difference implies a larger difference between the utility brought by a high-quality worker and that brought by a low-quality one, which amplifies the improvement of the optimal utility resulting from an increasing probability. Last but not least, for a specific probability, one can observe that the requestor’s optimal utility becomes smaller as the high task quality increases, as shown in Fig. 2(a), but becomes larger as the low task quality increases, as presented in Fig. 2(b). This is because when the low task quality is constant, an increase in the high task quality implies that collusion and misreport become more profitable, incentivizing more malicious behaviors from workers, which drives the requestor to spend more to eliminate these undesirable behaviors and leaves her with less utility. In contrast, when the high task quality is given, a larger low task quality makes the profit of the low-quality worker’s lying trivial, dispelling his enthusiasm for behaving maliciously, which enables the requestor’s optimal utility to increase.

Fig. 2: The optimal utility of the requestor in different situations, panels (a) and (b).

Next, to explore the relationships between and , , , we solve the optimization problem with fixed , and , respectively, and analyze the corresponding experimental results. In TABLE I, we report the changing trend of , and in different cases. The parameter settings in each case are clarified in this table, and the meanings of horizontal and vertical coordinates are specified in the corresponding figures.

From the first column of figures in TABLE I, we can see that the changing trend of has the following characteristics:

  1. As shown in the first two and the last two figures, the value of stays stable or increases as increases. This is because corresponds to the situation where both workers are of high quality, so the more likely a high-quality worker is to appear, the higher the value of that the requestor needs to set to satisfy the IC constraint for the high-quality worker and avoid any potential lie.

  2. In the middle two figures, as increases, remains stable or increases. The reason is that the increase of drives the requestor to increase to satisfy the IC constraint.

  3. In the first two figures, when increases, remains unchanged or increases. To be specific, when is fixed, the increasing indicates the increase of , which makes the requestor enlarge to satisfy the IC constraint.

Through comparing the second column of figures, we can find that changes with the following features:

  1. From the first two figures and the last two, it is clear that remains unchanged or becomes larger with increasing . This is because represents the task allocation in the case of one high-quality worker and one low-quality worker, which affects the IC constraint and the collusion-proof constraint at the same time. As increases, the likelihood that a worker is a high-quality one increases, which makes the problem of the high-quality worker’s lying more serious. This situation is very similar to that of , so the requestor needs to increase to guarantee the IC constraint.

  2. Observing the first two figures in this column, one can conclude that first decreases and then increases with the increase of ; and in all the remaining figures in this column, first decreases and then increases with increasing . This is because, on the one hand, affects all constraints, making the requestor increase to satisfy these constraints; on the other hand, the increase of leads to more low-quality submissions for the requestor, lowering her utility, which is exactly opposite to the optimization direction. As a result, first decreases and then increases. In detail, take the middle two figures as an example, with fixed and : when is small, decreases as increases. In this case, due to the relatively small quality difference, the problems of workers’ lying and collusion are not so serious, so the requestor can improve her utility by reducing the number of low-quality tasks with a lower . As gradually increases, the problems of workers’ lying and collusion become more severe, and reaches its minimum. When increases to the maximum, the lying and collusion problems become severe enough that the requestor needs to increase to satisfy all constraints. Therefore, first decreases and then increases.

Finally, we scrutinize the third column of figures for :

  1. As can be seen from all these figures, the overall change of is not obvious and remains generally unchanged. This is because, on the one hand, corresponds to the case of two low-quality workers, where increasing means getting more low-quality tasks, opposite to the optimization direction; on the other hand, reducing may result in failure to meet the IC constraint of the low-quality worker. These two conflicting factors are balanced in most cases, making relatively stable.

  2. As we can see from the first two and the last two figures, stays the same or decreases with the increase of . This is because when increases, the probability that both workers are low-quality ones is reduced, so the requestor can appropriately decrease to improve her utility by receiving fewer low-quality tasks.

  3. From the figure with a lower , decreases first and then increases with increasing , which is because the probability of two low-quality workers is high, so the two factors mentioned in the first item cannot be balanced. To be specific, when is small, the benefit of the low-quality worker’s lying is low, and the cost for the requestor to meet the IC constraint is also small, so the requestor can lower to improve her utility. When is large, the problem of the low-quality worker’s lying outweighs the utility loss from many low-quality submissions, so the requestor will increase to satisfy the IC constraint. Together, these two aspects make decrease first and then increase.

TABLE I: , and in different situations.

It is worth noting that since , and in the experiments present no obvious trend, we do not analyze them in detail. (Footnote 9: The underlying reason is the restriction of , and to positive integers. When changes in the integer dimension, the originally continuous changes abruptly, so we cannot find an obvious trend of , and .)
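The effect described in the footnote, where an integer restriction turns a smoothly moving optimum into a step function, can be illustrated with a toy problem. This is only a sketch and not the model used in the experiments: the concave objective `a*x - x**2`, the parameter name `a`, and the bound `upper` are assumptions, chosen because the continuous maximizer `a/2` is known in closed form.

```python
def continuous_argmax(a):
    # Maximizer of the hypothetical objective a*x - x**2 over real x.
    return a / 2.0

def integer_argmax(a, upper=20):
    # Maximizer of the same objective when x is restricted to the
    # positive integers 1..upper (ties resolve to the smaller x).
    return max(range(1, upper + 1), key=lambda x: a * x - x * x)

if __name__ == "__main__":
    # The continuous optimum moves smoothly with a, while the integer
    # optimum stays flat and then jumps by a whole unit.
    for a in [4.0, 4.5, 4.9, 5.1, 5.5, 6.0]:
        print(a, continuous_argmax(a), integer_argmax(a))
```

In this toy case the integer optimum stays at 2 for `a` just below 5 and switches to 3 just above it, so quantities computed from the optimum can look trendless even though the underlying continuous solution varies smoothly.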

6.2 Three workers

Similar to the situation of two workers, we utilize the following set of functions to calculate (12) to (15) and (22): , , , , . We still limit to positive integers and use linear programming to solve the constrained optimization problem.
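As a minimal sketch of how a small allocation problem with positive-integer variables can be solved, the block below enumerates every candidate allocation up to a bound and keeps the best feasible one. It substitutes brute-force enumeration for the linear-programming solver mentioned above, and the objective coefficients, the two constraints, and all names (`solve_by_enumeration`, `toy_objective`, `toy_constraints`) are hypothetical placeholders rather than the functions used in the experiments.

```python
from itertools import product

def solve_by_enumeration(objective, constraints, n_vars, upper):
    """Return (best_point, best_value) over positive-integer points in
    {1, ..., upper}^n_vars that satisfy every constraint, or (None, None)
    when no point is feasible."""
    best_point, best_value = None, None
    for point in product(range(1, upper + 1), repeat=n_vars):
        if not all(check(point) for check in constraints):
            continue  # skip infeasible allocations
        value = objective(point)
        if best_value is None or value > best_value:
            best_point, best_value = point, value
    return best_point, best_value

# Hypothetical toy instance: coefficients and constraints are made up.
def toy_objective(x):
    return 3 * x[0] + 2 * x[1] - x[2]

toy_constraints = [
    lambda x: x[0] + x[1] + x[2] <= 10,  # assumed total task budget
    lambda x: x[0] >= x[1] >= x[2],      # assumed ordering of the three cases
]

best_point, best_value = solve_by_enumeration(
    toy_objective, toy_constraints, n_vars=3, upper=10
)
print(best_point, best_value)
```

For this toy instance the search returns the allocation `(8, 1, 1)` with value `25`. Enumeration scales exponentially in the number of variables, so it is only practical for the handful of allocation variables that arise with two or three workers.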

In Fig. 3, we compare how the requestor’s utility changes with when the numbers of workers are two and three, respectively, under two different parameter settings of and . Both figures show that the requestor’s utility is positively related to . It is worth noting that, according to the experimental results, the requestor’s utility does not necessarily increase with the number of workers. To be specific, when in Fig. 3(a), the requestor’s optimal utility in the two-worker case is higher than that in the three-worker case, while when , three workers are much better than two workers. This is because more workers introduce more constraints into the optimization problem, so fewer workers can sometimes bring more utility to the requestor.
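One part of this argument can be checked on a toy problem: over the same variables and objective, adding a constraint can only shrink the feasible set, so the optimum cannot rise. The sketch below illustrates just that narrower fact. The objective, both constraints, and the names `best_value`, `base`, and `extra` are assumptions, and the real three-worker problem also has a different objective from the two-worker one, which is why three workers can still come out ahead in some settings.

```python
from itertools import product

def best_value(objective, constraints, n_vars=2, upper=6):
    # Largest objective value over positive-integer points in
    # {1, ..., upper}^n_vars satisfying every constraint;
    # returns None when no point is feasible.
    feasible = [
        x for x in product(range(1, upper + 1), repeat=n_vars)
        if all(check(x) for check in constraints)
    ]
    return max(map(objective, feasible)) if feasible else None

def objective(x):
    # Hypothetical utility, not taken from the paper.
    return 2 * x[0] + x[1]

base = [lambda x: x[0] + x[1] <= 8]      # assumed budget-style constraint
extra = base + [lambda x: x[0] <= x[1]]  # one additional assumed constraint

loose = best_value(objective, base)
tight = best_value(objective, extra)
print(loose, tight)
```

Here the optimum drops from `14` to `12` once the second constraint is added, matching the intuition that each extra constraint can cost the requestor utility.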

Note that the trends of utility with respect to , , are similar to those of the two-worker situation, so we omit these results due to the page limit.

Fig. 3: Optimal utility comparison between two workers and three workers, panels (a) and (b).

7 Conclusion

In this paper, we propose a misreport- and collusion-proof crowdsourcing mechanism, guiding workers to truthfully report the quality of submitted tasks without collusion by leveraging pricing and task allocation. Extensive simulation results verify the effectiveness of the proposed mechanism, based on which we can obtain three counterintuitive findings: 1) a high-quality worker may pretend to be a low-quality one; 2) the rise of task quality from high-quality workers may not result in the increased utility of the requestor; 3) the utility of the requestor may not get improved with the increasing number of workers. In addition, we can also draw the following conclusions: 1) with the increase of , and increase while decreases; 2) the utility of the requestor goes up with the increase of . Moreover, the higher the , the larger the increment of her utility.

References

  • [1] D. Oleson, A. Sorokin, G. Laughlin, V. Hester, J. Le, and L. Biewald, “Programmatic gold: Targeted and scalable quality assurance in crowdsourcing,” 2011.
  • [2] Y. Wu, Y. Wang, and G. Cao, “Photo crowdsourcing for area coverage in resource constrained environments,” pp. 1–9, 2017.
  • [3] A. Xu, X. Feng, and Y. Tian, “Revealing, characterizing, and detecting crowdsourcing spammers: A case study in community q&a,” pp. 2533–2541, 2015.
  • [4] Y. Jin, M. Carman, D. Kim, and L. Xie, “Leveraging side information to improve label quality control in crowd-sourcing,” pp. 1–10, 2017.
  • [5] C. Qiu, A. C. Squicciarini, S. M. Rajtmajer, and J. Caverlee, “Dynamic contract design for heterogenous workers in crowdsourcing for quality control,” pp. 1168–1177, 2017.
  • [6] Q. Hu, S. Wang, P. Ma, X. Cheng, W. Lv, and R. Bie, “Quality control in crowdsourcing using sequential zero-determinant strategies,” IEEE Transactions on Knowledge and Data Engineering, pp. 1–11, 2019.
  • [7] S. Han, P. Dai, P. Paritosh, and D. Huynh, “Crowdsourcing human annotation on web page structure: Infrastructure design and behavior-based quality control,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 7, no. 4, p. 56, 2016.
  • [8] Z. He, J. Cao, and X. Liu, “High quality participant recruitment in vehicle-based crowdsourcing using predictable mobility,” pp. 2542–2550, 2015.
  • [9] A. Tarable, A. Nordio, E. Leonardi, and M. A. Marsan, “The importance of being earnest in crowdsourcing systems,” pp. 2821–2829, 2015.
  • [10] T. Börgers, An introduction to the theory of mechanism design. Oxford University Press, USA, 2015.
  • [11] J. Whitehill, T.-f. Wu, J. Bergsma, J. R. Movellan, and P. L. Ruvolo, “Whose vote should count more: Optimal integration of labels from labelers of unknown expertise,” pp. 2035–2043, 2009.
  • [12] D. Peng, F. Wu, and G. Chen, “Data quality guided incentive mechanism design for crowdsensing,” IEEE transactions on mobile computing, vol. 17, no. 2, pp. 307–319, 2018.