Data Poisoning Attacks and Defenses to Crowdsourcing Systems

by Minghong Fang, et al.

A key challenge of big data analytics is how to collect a large volume of (labeled) data. Crowdsourcing aims to address this challenge by aggregating and estimating high-quality data (e.g., sentiment labels for text) from pervasive clients/users. Existing studies on crowdsourcing focus on designing new methods to improve the quality of data aggregated from unreliable/noisy clients. However, the security aspects of such crowdsourcing systems remain under-explored to date. We aim to bridge this gap in this work. Specifically, we show that crowdsourcing is vulnerable to data poisoning attacks, in which malicious clients provide carefully crafted data to corrupt the aggregated data. We formulate our proposed data poisoning attacks as an optimization problem that maximizes the error of the aggregated data. Our evaluation results on one synthetic and two real-world benchmark datasets demonstrate that the proposed attacks can substantially increase the estimation errors of the aggregated data. We also propose two defenses to reduce the impact of malicious clients. Our empirical results show that the proposed defenses can substantially reduce the estimation errors caused by the data poisoning attacks.
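To make the threat model concrete, here is a minimal Python sketch of how one malicious client can inflate the error of an aggregated estimate. Everything in it is an illustrative assumption rather than the paper's method: the aggregator is a plain per-item mean, the attack is a naive "report the farthest bound" heuristic instead of the optimization problem described above, and the median is a generic robust aggregator standing in for a defense, not one of the two defenses the authors propose. The names `aggregate`, `poison`, and `rmse`, and all data values, are hypothetical.

```python
from statistics import mean, median


def aggregate(reports, how=mean):
    """Combine per-item values; reports[c][i] is client c's value for item i."""
    n_items = len(reports[0])
    return [how([r[i] for r in reports]) for i in range(n_items)]


def rmse(estimate, truth):
    """Root-mean-square estimation error of the aggregated values."""
    return (sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)) ** 0.5


def poison(clean_estimate, lo, hi):
    """Naive heuristic (assumption): for each item, report whichever bound
    of the allowed range [lo, hi] lies farthest from the clean estimate."""
    return [hi if (hi - e) >= (e - lo) else lo for e in clean_estimate]


# Hypothetical ground truth and four honest, slightly noisy clients.
truth = [2.0, 8.0, 6.0]
honest = [
    [2.1, 7.9, 6.2],
    [1.8, 8.3, 5.9],
    [2.2, 7.7, 6.1],
    [1.9, 8.1, 5.8],
]

clean = aggregate(honest)                         # no attacker present
fake = poison(clean, lo=0.0, hi=10.0)             # one malicious client
poisoned = aggregate(honest + [fake])             # mean is pulled toward the fake values
robust = aggregate(honest + [fake], how=median)   # generic robust stand-in

print("clean error:   ", rmse(clean, truth))
print("poisoned error:", rmse(poisoned, truth))
print("median error:  ", rmse(robust, truth))
```

In this toy setting a single fake client out of five noticeably raises the error of the mean, while the median stays close to the truth. Real truth-discovery aggregators weight clients by estimated reliability, so both the attack and the defenses in the paper are necessarily more involved than this sketch.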




FLCert: Provably Secure Federated Learning against Poisoning Attacks

Due to its distributed nature, federated learning is vulnerable to poiso...

Poisoning Attacks to Local Differential Privacy Protocols for Key-Value Data

Local Differential Privacy (LDP) protocols enable an untrusted server to...

Data Poisoning Attacks to Local Differential Privacy Protocols

Local Differential Privacy (LDP) protocols enable an untrusted data coll...

Linear Scalarization for Byzantine-robust learning on non-IID data

In this work we study the problem of Byzantine-robust learning when data...

Manipulating Federated Recommender Systems: Poisoning with Synthetic Users and Its Countermeasures

Federated Recommender Systems (FedRecs) are considered privacy-preservin...

Detecting adversaries in Crowdsourcing

Despite its successes in various machine learning and data science tasks...

Identifying Malicious Players in GWAP-based Disaster Monitoring Crowdsourcing System

Disaster monitoring is challenging due to the lack of infrastructures in...
