Transitivity, Time Consumption, and Quality of Preference Judgments in Crowdsourcing

04/18/2021
by Kai Hui, et al.

Preference judgments have been shown to be a better alternative to graded judgments for assessing the relevance of documents relative to queries. Existing work has verified transitivity among preference judgments collected from trained judges, which dramatically reduces the number of judgments required. Moreover, strict preference judgments and weak preference judgments, where the latter additionally allow judges to state that two documents are equally relevant for a given query, are both widely used in the literature. However, it remains unclear whether transitivity still holds when judgments are collected through crowdsourcing, and whether the two kinds of preference judgments behave similarly. In this work, we collect judgments from multiple judges using a crowdsourcing platform and aggregate them to compare the two kinds of preference judgments in terms of transitivity, time consumption, and quality. That is, we look into whether aggregated judgments are transitive, how long it takes judges to make them, and whether judges agree with each other and with judgments from TREC. Our key findings are that only strict preference judgments are transitive, whereas weak preference judgments behave differently in terms of transitivity, time consumption, and judgment quality.
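
To make the notion of transitivity over aggregated judgments concrete, the following is a minimal sketch, not the authors' procedure. It assumes majority voting as the aggregation rule (the abstract does not say which rule is used) and checks only the strict case: whenever a is preferred to b and b to c, a should also be preferred to c. The function names, the label encoding ('>', '<', '='), and the example votes are all hypothetical.

```python
from collections import Counter
from itertools import permutations

# Flipping a label when a pair is looked up in reverse order.
FLIP = {">": "<", "<": ">", "=": "="}

def aggregate(votes):
    """Majority vote per document pair (assumed aggregation rule).

    votes maps (doc_a, doc_b) -> list of labels from several judges.
    With an even number of judges a tie is broken by first occurrence.
    """
    return {pair: Counter(labels).most_common(1)[0][0]
            for pair, labels in votes.items()}

def relation(agg, a, b):
    """Aggregated label for (a, b), or None if the pair was never judged."""
    if (a, b) in agg:
        return agg[(a, b)]
    if (b, a) in agg:
        return FLIP[agg[(b, a)]]
    return None

def transitivity_ratio(agg, docs):
    """Share of chains a > b, b > c for which a > c also holds."""
    holds = total = 0
    for a, b, c in permutations(docs, 3):
        if relation(agg, a, b) == ">" and relation(agg, b, c) == ">":
            total += 1
            if relation(agg, a, c) == ">":
                holds += 1
    return holds / total if total else None

# Hypothetical votes from three judges per pair.
votes = {
    ("d1", "d2"): [">", ">", "<"],
    ("d2", "d3"): [">", ">", "="],
    ("d1", "d3"): [">", "<", ">"],
}
agg = aggregate(votes)
ratio = transitivity_ratio(agg, ["d1", "d2", "d3"])
```

A ratio of 1.0 means every observed preference chain is transitive, while a cycle such as d1 > d2, d2 > d3, d3 > d1 drives it to 0.0. Handling weak preferences would additionally require a rule for chains that pass through ties, which this sketch leaves out.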

research
04/28/2022

Oracle Guided Image Synthesis with Relative Queries

Isolating and controlling specific features in the outputs of generative...
research
09/02/2014

Deontic modality based on preference

Deontic modalities are here defined in terms of the preference relation ...
research
09/21/2017

Assumption-Based Approaches to Reasoning with Priorities

This paper maps out the relation between different approaches for handli...
research
07/22/2020

Assessing top-k preferences

Assessors make preference judgments faster and more consistently than gr...
research
04/21/2023

Hear Me Out: A Study on the Use of the Voice Modality for Crowdsourced Relevance Assessments

The creation of relevance assessments by human assessors (often nowadays...
research
06/04/2015

The Preference Learning Toolbox

Preference learning (PL) is a core area of machine learning that handles...
