Identifying Terms and Conditions Important to Consumers using Crowdsourcing

11/23/2021
by Xingyu Liu, et al.

Terms and conditions (T&Cs) are pervasive on the web and often contain important information for consumers, but are rarely read. Previous research has explored methods to surface alarming privacy policies using manual labelers, natural language processing, and deep learning techniques. However, this prior work used pre-determined categories for annotations, and did not investigate what consumers themselves deem important. In this paper, we instead combine crowdsourcing with an open definition of "what is important" in T&Cs. We present a workflow consisting of pairwise comparisons, agreement validation, and Bradley-Terry rank modeling to establish rankings of T&C statements from non-expert crowdworkers under this open definition, and we further analyze consumers' preferences. We applied this workflow to 1,551 T&C statements from 27 e-commerce websites, with 3,462 unique crowdworkers contributing 203,068 pairwise comparisons, and conducted thematic and readability analyses of the statements considered important or unimportant. We found that consumers especially cared about policies related to after-sales and money, and tended to regard harder-to-understand statements as more important. We also present machine learning models to identify T&C clauses that consumers considered important, achieving at best a 92.7 [...] recall, and 89.2 [...] efficiently and reliably highlight important T&Cs on websites at a large scale, improving consumers' awareness.
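To illustrate the Bradley-Terry step named in the abstract, the following is a minimal sketch of fitting item strengths from pairwise comparisons with the standard minorization-maximization update. It is not the authors' implementation: the function name `fit_bradley_terry`, the `(winner, loser)` input format, and the statement identifiers are assumptions made for illustration, and the agreement-validation stage of the workflow is not modeled.

```python
from collections import defaultdict


def fit_bradley_terry(comparisons, n_iter=500, tol=1e-10):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Hypothetical helper for illustration. Under the model,
    P(i beats j) = s_i / (s_i + s_j). Strengths are normalized to sum to 1.
    """
    wins = defaultdict(float)         # total wins per item
    pair_counts = defaultdict(float)  # number of comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        if winner == loser:
            continue  # a statement cannot be compared with itself
        wins[winner] += 1.0
        pair_counts[frozenset((winner, loser))] += 1.0
        items.update((winner, loser))
    items = sorted(items)
    if not items:
        return {}

    strength = {i: 1.0 / len(items) for i in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            # Sum over opponents j of n_ij / (s_i + s_j)
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = wins[i] / denom if denom else strength[i]
        total = sum(updated.values())
        updated = {i: v / total for i, v in updated.items()}
        change = max(abs(updated[i] - strength[i]) for i in items)
        strength = updated
        if change < tol:
            break
    return strength


# Toy data: each tuple is (statement judged more important, the other one).
toy = (
    [("refund", "shipping")] * 3 + [("shipping", "refund")]
    + [("shipping", "cookies")] * 3 + [("cookies", "shipping")]
    + [("refund", "cookies")] * 3 + [("cookies", "refund")]
)
scores = fit_bradley_terry(toy)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Sorting statements by the fitted strength yields a ranking; a statement that never wins a comparison receives strength zero in this unregularized sketch, so a real pipeline would likely add smoothing or a prior.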


