Crowdsourcing on Sensitive Data with Privacy-Preserving Text Rewriting

03/06/2023
by Nina Mouhammad, et al.

Most NLP tasks require labeled data. Data labeling is often done on crowdsourcing platforms because they scale well. However, data can only be published on public platforms if it contains no privacy-relevant information, and textual data often contains sensitive information such as names of persons or locations. In this work, we investigate how removing personally identifiable information (PII) and applying differential privacy (DP) rewriting can enable text with privacy-relevant information to be used for crowdsourcing. We find that DP rewriting before crowdsourcing can preserve privacy while still leading to good label quality for certain tasks and data. PII removal led to good label quality in all examined tasks; however, it provides no privacy guarantees.

