On the Limitation of MagNet Defense against L_1-based Adversarial Examples

04/14/2018
by Pei-Hsuan Lu, et al.

In recent years, defending against adversarial perturbations of natural examples in order to build robust machine learning models trained with deep neural networks (DNNs) has become an emerging research field at the intersection of deep learning and security. In particular, MagNet, which consists of an adversary detector and a data reformer, is by far one of the strongest defenses in the black-box oblivious attack setting, where the attacker aims to craft transferable adversarial examples from an undefended DNN model to bypass an unknown defense module deployed on the same DNN model. Under this setting, MagNet can successfully defend against a variety of attacks on DNNs, including the high-confidence adversarial examples generated by the Carlini and Wagner attack based on the L_2 distortion metric. However, in this paper we show that, under the same attack setting, adversarial examples crafted based on the L_1 distortion metric can easily bypass MagNet and mislead the target DNN image classifiers on MNIST and CIFAR-10. We also explain why the considered approach yields adversarial examples with superior attack performance, and we conduct extensive experiments on variants of MagNet to verify its lack of robustness to L_1 distortion-based attacks. Notably, our results substantially weaken the assumption of effective threat models on MagNet that require knowledge of the deployed defense technique when attacking DNNs (i.e., the gray-box attack setting).
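To make the L_1 versus L_2 distinction concrete, below is a minimal Python sketch (not taken from the paper) that contrasts the two distortion metrics on a sparse perturbation, the kind of change L_1-oriented attacks tend to produce: few pixels modified, each by a relatively large amount, so the L_2 norm stays comparatively small while the L_1 norm accumulates every changed pixel. The array shapes, pixel values, and the distortion helper are illustrative assumptions, not the attack studied in the paper.

# A minimal sketch (not from the paper) contrasting L_1 and L_2 distortion on a
# sparse perturbation. All names, sizes, and values are illustrative assumptions.
import numpy as np

def distortion(x, x_adv, p):
    """Return the L_p distortion between an original input x and its adversarial version x_adv."""
    return np.linalg.norm((x_adv - x).ravel(), ord=p)

rng = np.random.default_rng(0)
x = rng.random((28, 28))                 # stand-in for a 28x28 MNIST image with pixels in [0, 1]
delta = np.zeros_like(x)
idx = rng.choice(x.size, size=20, replace=False)
delta.flat[idx] = 0.5                    # perturb only 20 pixels, each by a visible amount
x_adv = np.clip(x + delta, 0.0, 1.0)     # keep the adversarial image in the valid pixel range

print("L_1 distortion:", distortion(x, x_adv, 1))  # sums all per-pixel changes, so it is large
print("L_2 distortion:", distortion(x, x_adv, 2))  # much smaller for a sparse perturbation

Because the L_2 norm of a sparse perturbation can be small even when individual pixel changes are large, such examples occupy a different region of input space than the dense, low-magnitude L_2-style noise MagNet is typically evaluated against; the paper itself should be consulted for the authors' precise explanation of the attack's effectiveness.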

