Adversarial Robustness through the Lens of Causality

06/11/2021
by Yonggang Zhang, et al.

The adversarial vulnerability of deep neural networks has attracted significant attention in machine learning. From a causal viewpoint, adversarial attacks can be regarded as a specific type of distribution shift on natural data. Since causal reasoning is naturally suited to modeling distribution shifts, we propose to incorporate causality into mitigating adversarial vulnerability. However, causal formulations of the intuition behind adversarial attacks, and of how to develop robust DNNs from them, are still lacking in the literature. To bridge this gap, we construct a causal graph to model the generation process of adversarial examples and define the adversarial distribution to formalize the intuition of adversarial attacks. From a causal perspective, we find that, given an instance, the label is spuriously correlated with the style (content-independent) information. This spurious correlation implies that the adversarial distribution is constructed by making the conditional association between style information and labels drastically different from that of the natural distribution. Thus, DNNs that fit the spurious correlation are vulnerable to the adversarial distribution. Inspired by this observation, we propose an adversarial distribution alignment method to eliminate the difference between the natural distribution and the adversarial distribution. Extensive experiments demonstrate the efficacy of the proposed method. Our method can be seen as the first attempt to leverage causality for mitigating adversarial vulnerability.
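To make the alignment idea concrete, below is a minimal, illustrative PyTorch sketch of one generic way to align a model's behavior on natural and adversarial inputs: a TRADES-style objective that adds a KL term between the predictive distributions on natural and perturbed examples. The function name `distribution_alignment_loss`, the hyperparameters, and the PGD inner loop are assumptions for illustration only and do not reproduce the paper's CausalAdv formulation.

```python
import torch
import torch.nn.functional as F

def distribution_alignment_loss(model, x_natural, y,
                                epsilon=8 / 255, step_size=2 / 255,
                                num_steps=10, beta=6.0):
    """Cross-entropy on natural data plus a KL term pulling the model's
    predictions on adversarial examples toward its predictions on natural
    examples (a generic alignment objective, not the paper's exact method)."""
    model.eval()
    # Reference predictive distribution on natural inputs.
    p_natural = F.softmax(model(x_natural), dim=1).detach()

    # PGD-style inner maximization of the KL divergence.
    x_adv = x_natural.detach() + 0.001 * torch.randn_like(x_natural)
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        kl = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                      p_natural, reduction='batchmean')
        grad = torch.autograd.grad(kl, x_adv)[0]
        x_adv = x_adv.detach() + step_size * grad.sign()
        # Project back into the epsilon-ball around the natural input.
        x_adv = torch.min(torch.max(x_adv, x_natural - epsilon),
                          x_natural + epsilon).clamp(0.0, 1.0)

    model.train()
    logits_natural = model(x_natural)
    loss_natural = F.cross_entropy(logits_natural, y)
    loss_align = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                          F.softmax(logits_natural, dim=1),
                          reduction='batchmean')
    return loss_natural + beta * loss_align
```

In a standard training loop, this loss would replace plain cross-entropy; `beta` trades off natural accuracy against the strength of the alignment between the two distributions.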

