Luring of Adversarial Perturbations

04/10/2020
by Rémi Bernhard, et al.

The growing interest in adversarial examples, i.e. maliciously modified inputs that fool a classifier, has led to many defenses intended to detect them, neutralize them, or make the model more robust to them. In this paper, we pave the way towards a new approach to defending a distant system against adversarial examples, which we name the luring of adversarial perturbations. A component is added to the target model to form an augmented, equally accurate version of it. This additional component is designed to be removable and to give false indications about how to fool the target model alone: the adversary is tricked into fooling the augmented version of the target model rather than the target model itself. We explain the intuition behind our defense through the principle of the luring effect, inspired by the notion of robust and non-robust features, and experimentally justify its validity. Finally, we propose a simple prediction strategy that takes advantage of this effect, and show on MNIST, SVHN and CIFAR10 that our defense scheme can efficiently thwart an adversary using state-of-the-art attacks and allowed to perform large perturbations.
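To make the described architecture concrete, here is a minimal sketch in PyTorch of the two pieces the abstract mentions: an augmented model obtained by composing the target model T with a removable luring component P, and a simple agreement-based prediction strategy. The class and function names (AugmentedModel, predict_with_detection) and the agreement check are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumption: PyTorch; names are illustrative, not from the paper).
# The target model T is augmented with a removable luring component P placed in
# front of it: the adversary only interacts with the augmented model T(P(x)),
# while the defender can still query the target model T(x) directly.
import torch
import torch.nn as nn


class AugmentedModel(nn.Module):
    """Augmented version of the target model: T composed with the luring component P."""

    def __init__(self, target: nn.Module, luring: nn.Module):
        super().__init__()
        self.target = target  # T: the protected (distant) classifier
        self.luring = luring  # P: removable component, trained to keep accuracy
                              # while giving misleading cues on how to fool T alone

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.target(self.luring(x))


def predict_with_detection(target: nn.Module, augmented: nn.Module, x: torch.Tensor):
    """Simple prediction strategy (sketch): trust the label when T and T∘P agree,
    and flag the input as suspicious (possibly adversarial) when they disagree."""
    with torch.no_grad():
        y_target = target(x).argmax(dim=1)
        y_augmented = augmented(x).argmax(dim=1)
    agree = y_target == y_augmented
    return y_target, agree  # inputs with agree == False can be rejected or inspected
```

In this sketch, the luring effect would show up as adversarial examples crafted against the augmented model failing to transfer to the target model alone, so the disagreement between the two predictions serves as the detection signal.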
