Semi-verified Learning from the Crowd with Pairwise Comparisons
We study the problem of crowdsourced PAC learning of Boolean-valued functions through enriched queries, a problem that has recently attracted a surge of research interest. In particular, we consider a learner that may query the crowd for either the label of a given instance or a comparison tag for a pair of instances. This is a challenging problem, and only recently have budget-efficient algorithms been established for the scenario where the majority of the crowd are correct. In this work, we investigate the significantly more challenging case where the majority are incorrect, which renders learning impossible in general. We show that under the semi-verified model of Charikar et al. (2017), where we have (limited) access to a trusted oracle that always returns the correct annotation, it is possible to learn the underlying function, while the labeling cost is significantly reduced by the enriched and more easily obtained queries.
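To make the setting concrete, the following toy sketch illustrates why a majority vote fails when most of the crowd is incorrect, and how even a single trusted annotation can help. It is not the algorithm of the paper: the threshold target, the deterministic "always flip" workers, and every name in it (`true_label`, `worker_label`, `semi_verified_labels`) are assumptions made for illustration, and it uses label queries only, not pairwise comparisons.

```python
from collections import Counter

NUM_WORKERS = 10
HONEST = set(range(4))  # assumption: only 40% of the crowd answers correctly


def true_label(x):
    """Hypothetical target function: a threshold at 5."""
    return int(x >= 5)


def worker_label(worker, x):
    """Honest workers return the truth; the others always flip it (toy noise model)."""
    y = true_label(x)
    return y if worker in HONEST else 1 - y


def majority_vote(x):
    """Aggregate the crowd by plain majority, with no trusted information."""
    votes = Counter(worker_label(w, x) for w in range(NUM_WORKERS))
    return votes.most_common(1)[0][0]


def semi_verified_labels(instances, verified):
    """Use a few trusted annotations to pick workers, then rely on them.

    `verified` lists the instances sent to the trusted oracle, which is
    simulated here by calling `true_label` directly.
    """
    trusted = {x: true_label(x) for x in verified}
    reliable = [
        w for w in range(NUM_WORKERS)
        if all(worker_label(w, x) == y for x, y in trusted.items())
    ]
    labels = {x: worker_label(reliable[0], x) for x in instances}
    return labels, len(trusted)


instances = list(range(10))
majority_acc = sum(majority_vote(x) == true_label(x) for x in instances) / len(instances)
labels, oracle_calls = semi_verified_labels(instances, verified=[0])
semi_acc = sum(labels[x] == true_label(x) for x in instances) / len(instances)
print(majority_acc, semi_acc, oracle_calls)
```

In this toy crowd the majority vote is wrong on every instance, while one trusted query is enough to single out a reliable worker. Realistic crowds are noisy rather than deterministic, so filtering workers this way would need more verified instances and a tolerance for disagreement.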