Disrupting Model Training with Adversarial Shortcuts

06/12/2021
by Ivan Evtimov, et al.

When data is publicly released for human consumption, it is unclear how to prevent its unauthorized usage for machine learning purposes. Successful model training may be preventable with carefully designed dataset modifications, and we present a proof-of-concept approach for the image classification setting. We propose methods based on the notion of adversarial shortcuts, which encourage models to rely on non-robust signals rather than semantic features, and our experiments demonstrate that these measures successfully prevent deep learning models from achieving high accuracy on real, unmodified data examples.
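The core idea is to modify the released images so that a label-correlated but non-semantic signal (a "shortcut") is easier for a model to learn than the actual image content. The paper's exact modification scheme is not reproduced here; the snippet below is only a minimal sketch of the general idea, assuming images scaled to [0, 1] and a fixed random per-class pattern blended into each training image. The function name, the strength parameter, and the pattern construction are illustrative, not the authors' method.

```python
import numpy as np

def add_adversarial_shortcut(images, labels, num_classes=10, strength=0.1, seed=0):
    """Embed a fixed, label-dependent noise pattern into each training image.

    images: float array of shape (N, H, W, C), values in [0, 1]
    labels: int array of shape (N,)

    The per-class patterns are non-robust, non-semantic signals a model can
    latch onto instead of the true image content. (Hypothetical sketch; the
    paper's actual dataset modifications may differ.)
    """
    rng = np.random.default_rng(seed)
    _, h, w, c = images.shape
    # One fixed random pattern per class, reused for every image of that class.
    patterns = rng.uniform(-1.0, 1.0, size=(num_classes, h, w, c))
    shortcut = patterns[labels]                     # (N, H, W, C)
    # Blend the class-specific pattern into the images and keep them valid.
    return np.clip(images + strength * shortcut, 0.0, 1.0)
```

A model trained on such modified data can drive its training loss down by keying on the per-class pattern rather than on semantic features, and consequently performs poorly on clean, unmodified test examples, which is the effect the abstract describes.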
