Certified Computation in Crowdsourcing

09/12/2017
by Themis Gouleakis et al.

A wide range of learning tasks require human input to label massive data, and because of the sheer quantity involved, crowdsourcing has become crucial for this labeling. The collected data, however, are usually of low quality, as workers are not appropriately incentivized to put in effort. Significant research has focused on designing suitable incentive schemes, but these do not capture the full range of tasks that appear in practice. Moreover, even when incentives are theoretically aligned, noise can still enter the data because it is hard to model exactly how workers will behave. In this work, we provide a generic approach that guarantees high-quality learning outcomes for various optimization objectives by verifying only a few worker reports. Our method identifies small sets of critical workers and verifies their reports. We show that many problems need only poly(1/ε) verifications to ensure that the output of the computation is within a factor of (1 ± ε) of the truth. For any given instance, we provide an instance-optimal solution that verifies the minimum possible number of workers needed to approximately certify correctness. If this certification step fails, a misreporting worker is identified. Removing such workers and repeating until success guarantees that the result is correct and depends only on the verified workers. Surprisingly, as we show, even more efficient methods are possible for several computation tasks. These methods always guarantee that the produced result is unaffected by misreporting workers, since any misreport that affects the output will be detected and verified.
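To make the certify-or-identify loop concrete, here is a minimal Python sketch under strong simplifying assumptions that are ours, not the paper's: the objective is the sum of nonnegative worker reports, the "critical" workers are those with the largest reports, and the names critical_workers, certify_sum, and the verify callback are all hypothetical. The loop verifies the critical set; if every verified report is truthful, the unverified remainder can shift the total by at most an ε fraction of the verified mass, and otherwise a misreporting worker is identified, removed, and the process repeats, so the final answer is driven by verified reports.

def critical_workers(reports, epsilon):
    """Illustrative critical-set rule (our choice, not the paper's):
    take workers in decreasing order of report until the unverified
    mass is at most epsilon times the verified mass, so the unverified
    reports can move the total by at most a (1 +/- epsilon) factor."""
    order = sorted(reports, key=reports.get, reverse=True)
    verified_mass = 0.0
    rest = sum(reports.values())
    chosen = []
    for w in order:
        if rest <= epsilon * verified_mass:
            break
        chosen.append(w)
        verified_mass += reports[w]
        rest -= reports[w]
    return chosen

def certify_sum(reports, verify, epsilon):
    """Certify-or-identify loop: verify the critical workers; if every
    verified report is truthful, output the (approximately certified)
    sum, otherwise remove an identified misreporter and repeat."""
    reports = dict(reports)
    while True:
        truthful = True
        for w in critical_workers(reports, epsilon):
            # verify(w) reveals w's true value; a mismatch identifies
            # a misreporting worker, who is removed before retrying.
            if verify(w) != reports[w]:
                del reports[w]
                truthful = False
                break
        if truthful:
            # Certified only under the toy assumption that unverified
            # true values do not exceed the corresponding reports.
            return sum(reports.values())

# Hypothetical usage: worker w2 inflates its report, gets verified,
# identified, and removed; the result then rests on verified reports.
true_values = {"w1": 10.0, "w2": 8.0, "w3": 0.5, "w4": 0.4}
submitted   = {"w1": 10.0, "w2": 40.0, "w3": 0.5, "w4": 0.4}
print(certify_sum(submitted, true_values.get, epsilon=0.1))

In this toy run, w2's large report makes it critical, so it is the first to be verified and caught; after its removal, verifying only w1 already certifies the remaining total, matching the abstract's point that few verifications suffice.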
