Imbalanced Gradients: A New Cause of Overestimated Adversarial Robustness

06/24/2020
by Linxi Jiang, et al.

Evaluating the robustness of a defense model is a challenging task in adversarial robustness research. Obfuscated gradients, a type of gradient masking, have previously been found to exist in many defense methods and to cause a false signal of robustness. In this paper, we identify a more subtle situation called Imbalanced Gradients that can also cause overestimated adversarial robustness. The phenomenon of imbalanced gradients occurs when the gradient of one term of the margin loss dominates and pushes the attack towards a suboptimal direction. To exploit imbalanced gradients, we formulate a Margin Decomposition (MD) attack that decomposes a margin loss into individual terms and then explores the attackability of these terms separately via a two-stage process. We examine 12 state-of-the-art defense models and find that models exploiting label smoothing easily cause imbalanced gradients, on which our MD attacks can decrease their PGD robustness (robustness evaluated by PGD attack) by over 23%, and can decrease the PGD robustness of the other defense models by at least 9%. Our findings suggest that imbalanced gradients need to be carefully addressed for more reliable adversarial robustness evaluation.
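To make the two-stage idea concrete, below is a minimal PyTorch sketch of a margin-decomposition-style attack. The margin loss for a correctly classified input x with label y is L(x) = max_{i != y} Z(x)_i - Z(x)_y, where Z are the logits; imbalanced gradients arise when one of these two terms dominates the gradient. Everything here is an illustrative assumption rather than the paper's exact algorithm: the function name md_attack, the step counts, the even stage split (stage1_frac), and the choice to attack only the true-class logit term in stage 1.

```python
import torch

def md_attack(model, x, y, eps=8/255, alpha=2/255, steps=40, stage1_frac=0.5):
    """Sketch of a two-stage, margin-decomposition-style L_inf attack.

    Stage 1 follows the gradient of a single term of the margin loss
    (here: suppressing the true-class logit Z_y), so that a direction
    normally drowned out by the dominant term can still be explored.
    Stage 2 switches to the full margin loss. Hyperparameters are
    illustrative defaults, not the paper's settings.
    """
    x_adv = x.clone().detach()
    stage1_steps = int(steps * stage1_frac)
    for step in range(steps):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        # True-class logit Z_y for each example in the batch.
        z_y = logits.gather(1, y.unsqueeze(1)).squeeze(1)
        # Highest logit among the wrong classes: max_{i != y} Z_i.
        masked = logits.scatter(1, y.unsqueeze(1), float('-inf'))
        z_other = masked.max(dim=1).values
        if step < stage1_steps:
            loss = -z_y.sum()              # stage 1: one term only
        else:
            loss = (z_other - z_y).sum()   # stage 2: full margin loss
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            # Project back into the L_inf ball and the valid pixel range.
            x_adv = x + (x_adv - x).clamp(-eps, eps)
            x_adv = x_adv.clamp(0.0, 1.0)
        x_adv = x_adv.detach()
    return x_adv
```

The design point this sketch illustrates: a standard PGD attack on the full margin loss always follows the combined gradient, so if one term dominates, the other term's attack direction is never tried; dedicating an explicit stage to an individual term sidesteps that failure mode.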


research · 09/17/2020
Label Smoothing and Adversarial Robustness
Recent studies indicate that current adversarial attack methods are flaw...

research · 07/26/2018
Evaluating and Understanding the Robustness of Adversarial Logit Pairing
We evaluate the robustness of Adversarial Logit Pairing, a recently prop...

research · 06/03/2022
Gradient Obfuscation Checklist Test Gives a False Sense of Security
One popular group of defense techniques against adversarial attacks is b...

research · 07/06/2021
GradDiv: Adversarial Robustness of Randomized Neural Networks via Gradient Diversity Regularization
Deep learning is vulnerable to adversarial examples. Many defenses based...

research · 07/03/2018
Local Gradients Smoothing: Defense against localized adversarial attacks
Deep neural networks (DNNs) have shown vulnerability to adversarial atta...

research · 06/25/2020
MTAdam: Automatic Balancing of Multiple Training Loss Terms
When training neural models, it is common to combine multiple loss terms...

research · 07/21/2022
Switching One-Versus-the-Rest Loss to Increase the Margin of Logits for Adversarial Robustness
Defending deep neural networks against adversarial examples is a key cha...
