1 Introduction
The idea of an information bottleneck (IB) (Tishby et al., 1999) is to learn a compressed representation Z of an input X that is predictive of a target Y. This leads to the following training objective involving two mutual information terms:

(1)   min_{p(z|x)}  β I(Z; X) − I(Z; Y)

This objective favours a representation Z that retains the minimum amount of information about X while being maximally predictive of Y. The hyperparameter β controls the tradeoff between the two losses.
To make this objective practical, Alemi et al. (2017) used variational techniques to construct an upper bound on the expression in Eq. 1 – also known as the Variational Information Bottleneck (VIB) loss:

(2)   L_VIB = E_{p(x,y)} [ −E_{e(z|x)} [log q(y|z)] + β KL( e(z|x) || r(z) ) ]

where e(z|x) is a stochastic encoder distribution, q(y|z) is a variational approximation to p(y|z), and r(z) is a variational approximation to the marginal p(z). In a similar way to the Variational AutoEncoder (VAE) setup (Kingma & Welling, 2014), we can parameterize the Gaussian encoder e(z|x) and the classifier q(y|z) using neural networks, and fix r(z) to be a K-dimensional standard Gaussian N(0, I_K), where K is the size of the bottleneck layer. We can then use the reparameterization trick to learn the parameters of the neural networks when optimizing a stochastic estimate of the objective in Eq. 2.
A tighter bound on the IB objective is given by the Conditional Entropy Bottleneck (CEB) (Fischer & Alemi, 2020):
(3)   L_CEB = E_{p(x,y)} [ −E_{e(z|x)} [log q(y|z)] + γ KL( e(z|x) || b(z|y) ) ]

where the second term uses a class-conditional variational marginal b(z|y), and γ is a hyperparameter with the same role as β in Eq. 2. CEB parameterizes b(z|y) by a linear mapping that takes a one-hot label y as input and outputs a vector representing the mean of the Gaussian. CEB uses an identity matrix for the covariance of both b(z|y) and e(z|x), which is unlike VIB, where the variance of the encoder distribution is not fixed.

Multiple studies suggest that IBs can reduce overfitting and improve robustness to adversarial attacks (Alemi et al., 2017; Fischer & Alemi, 2020; Kirsch et al., 2021). For example, Fischer & Alemi (2020) showed that CEB models can outperform adversarially trained models under both ℓ∞ and ℓ2 PGD attacks (Madry et al., 2018) while also incurring no drop in standard accuracy. However, no clear explanation has been found as to how IB models become more robust to adversarial examples. Previous works also failed to investigate possible effects of gradient obfuscation, which could lead to a false sense of security (Athalye et al., 2018). In this paper, we continue the analysis of the behaviour of IB models in the context of adversarial robustness. Our experiments provide evidence of gradient obfuscation, which leads us to conclude that the adversarial robustness of IB models was previously overestimated.
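Because CEB fixes the covariance of both e(z|x) and b(z|y) to the identity, the KL term in Eq. 3 collapses to half the squared distance between the two means. A minimal sketch, where the `label_embedding` matrix stands in for CEB's linear mapping from one-hot labels (names are illustrative):

```python
import numpy as np

def ceb_rate(mu_e, label_embedding, y):
    """KL( e(z|x) || b(z|y) ) for two Gaussians with identity covariance.

    With equal (identity) covariances, the KL between N(mu_e, I) and
    N(mu_b, I) reduces to 0.5 * ||mu_e - mu_b||^2.
    label_embedding : [num_classes, K] matrix; row y is the mean of b(z|y)
    """
    mu_b = label_embedding[y]  # linear map applied to the one-hot label
    return 0.5 * np.sum((mu_e - mu_b) ** 2)
```

The rate is zero exactly when the encoder mean coincides with the class embedding.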
2 Adversarial robustness
Since the discovery of adversarial examples for neural networks (Szegedy et al., 2014; Biggio et al., 2013), there has been a lot of interest in creating new attacks and defenses. In this section we briefly review methods for crafting norm-bounded adversarial examples. Later, we use these methods to assess the adversarial robustness of IB models.
The Fast Gradient Sign (FGS) attack (Goodfellow et al., 2015) is an ℓ∞-bounded single-step attack that computes an adversarial example as x_adv = x + ε · sign(∇_x L(x, y)), where x is the original image, y is the true label, L is the cross-entropy loss, and ε is the perturbation size.
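The single FGS step can be sketched as follows (a minimal numpy sketch; `grad` is assumed to be the loss gradient computed elsewhere, and the clipping range is illustrative):

```python
import numpy as np

def fgs(x, grad, eps, low=0.0, high=1.0):
    """Fast Gradient Sign attack: x_adv = clip(x + eps * sign(grad)).

    grad is the gradient of the cross-entropy loss w.r.t. the input x;
    the result is clipped back to the valid image range [low, high].
    """
    return np.clip(x + eps * np.sign(grad), low, high)
```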
Projected Gradient Descent (PGD) (Madry et al., 2018) is the multi-step variant of FGS. The PGD attack finds an adversarial example by following the iterative updates x^{t+1} = Π_{B_ε(x)}( x^t + α · sign(∇_x L(x^t, y)) ) for some fixed number of steps T. Here, Π_{B_ε(x)} is a projection operator onto B_ε(x) – the ℓ∞ ball of radius ε around the original image x. The attack starts from an initial point x^0 sampled randomly within B_ε(x).
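The iterative updates above can be sketched as below; this is a minimal numpy sketch in which `grad_fn` stands in for the model's loss gradient, and the projection onto the ℓ∞ ball is an elementwise clip.

```python
import numpy as np

def pgd(x, y, grad_fn, eps, alpha, steps, rng, low=0.0, high=1.0):
    """Projected Gradient Descent inside an l_inf ball of radius eps.

    grad_fn(x_adv, y) returns the gradient of the loss w.r.t. the input.
    Starts from a random point inside the ball (a single 'restart').
    """
    x_adv = x + rng.uniform(-eps, eps, size=x.shape)  # random init in B_eps(x)
    x_adv = np.clip(x_adv, low, high)
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto B_eps(x)
        x_adv = np.clip(x_adv, low, high)         # keep a valid image
    return x_adv
```

With enough steps and a constant-sign gradient, the iterate saturates at the boundary of the ball.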
The reliability of PGD attacks often depends on the choice of parameters such as the step size α or the type of loss L. Recent PGD variants are designed to be less sensitive to these choices, and it is common to run an ensemble of attacks with different parameters and properties. AutoAttack (Croce & Hein, 2020) and MultiTargeted (Gowal et al., 2019) are examples of this strategy.
3 Experiments
In this section, we experiment with VIB and CEB models on MNIST and CIFAR-10. We run a number of diagnostics, which indicate that gradient obfuscation is the main reason why IB models are seemingly robust. In trying to understand their failure modes, we also look at some toy problems. Our interpretation of the results is deferred to the next section. Hyperparameters of all our models and additional plots are included in the appendix.
3.1 MNIST
For VIB experiments on MNIST, we follow the setup of Alemi et al. (2017). Namely, for the encoder network, we use a 3-layer MLP whose last layer is the bottleneck of size K. This bottleneck layer outputs the means and standard deviations (after a softplus transformation) of the Gaussian e(z|x). The decoder distribution q(y|z) over 10 classes is parameterized by a linear layer ending with a softmax. During training, we use the reparameterization trick (Kingma & Welling, 2014) with multiple samples from the encoder when estimating the expectation over z in Eq. 2. At test time, we also collect multiple samples z ~ e(z|x), and compute p(y|x) as the average of q(y|z) over these samples. We refer to this evaluation as the stochastic mode. In the mean mode, we only use the mean of e(z|x) as an input to the decoder. Our deterministic baseline is an MLP of the same overall structure as the VIB model. We train it with a cross-entropy loss without any additional regularization.
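The two evaluation modes can be sketched as follows. This is an illustrative numpy sketch: `logits_fn` stands in for the decoder, and the default sample count is a free parameter here, not necessarily the value used in our experiments.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))
    return e / np.sum(e)

def predict_stochastic(mu, sigma, logits_fn, rng, n_samples=12):
    """Stochastic mode: average the decoder distribution q(y|z) over
    several samples z ~ e(z|x)."""
    probs = np.zeros_like(softmax(logits_fn(mu)))
    for _ in range(n_samples):
        z = mu + sigma * rng.standard_normal(mu.shape)
        probs += softmax(logits_fn(z))
    return probs / n_samples

def predict_mean(mu, logits_fn):
    """Mean mode: feed only the mean of e(z|x) to the decoder."""
    return softmax(logits_fn(mu))
```

Because the stochastic mode averages probabilities rather than logits, the two modes generally disagree for nonlinear decoders.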
First, we evaluate our models using the FGS attack. Figure 1 shows the robust accuracy of VIB models with varying β under the FGS attack with different perturbation sizes ε. For the rest of the paper, we assume that input images are in the [0, 1] range. Our results differ slightly from those of Alemi et al. (2017). In particular, the performance of our VIB models peaks at a different value of β than the one reported previously, and the evaluation in the mean mode and the stochastic mode does not lead to the same results.
Despite these differences, we can still achieve large gains in robust accuracy under the FGS attack for VIB models in comparison to the baseline. One result that stands out is the unusually high robust accuracy under the attack with ε = 0.5. Indeed, with this perturbation size, one can design an attack that makes all images solid gray and, as such, the classifier should not do better than random guessing (Carlini et al., 2019). The obtained robust accuracy above 10% indicates that gradients of VIB models do not always direct us towards stronger adversarial examples. To check whether the improvements in robust accuracy generalize to stronger attacks, we evaluate VIB models under the PGD attack with 40 steps and a varying number of restarts. Figure 2 shows that we can drive the robust accuracy to zero as we increase the number of restarts. This is an indication of gradient obfuscation, as the loss landscape cannot be efficiently explored by gradient-based methods (Carlini et al., 2019; Croce & Hein, 2020).

3.2 CIFAR-10
For CIFAR-10, as our encoder network we use a PreActivation-ResNet18 (He et al., 2016) followed by an MLP with the same architecture as in the MNIST experiments. We train this network end-to-end, and only use random crops and flips to augment the data. As before, we construct an analogous deterministic model that we do not regularize in any way, and thus it overfits.
In Figure 2(a), we evaluate the adversarial robustness of CEB models under the PGD attack with 20 steps (Madry et al., 2018). It is surprising that some of our deterministic models can outperform an adversarially trained ResNet from Madry et al. (2018) with a reported robust accuracy of 45.8%. This result alone suggests that PGD attacks should be used with caution when evaluating models that might obfuscate gradients. As with MNIST, we can again significantly reduce the robust accuracy by increasing the number of restarts, as shown in Figure 2(b).
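The restart logic behind these evaluations is simple: a point counts as robust only if every random restart of the attack fails. A sketch with placeholder `attack_fn` and `predict_fn`:

```python
import numpy as np

def is_robust(x, y, attack_fn, predict_fn, n_restarts, rng):
    """A test point is counted as robust only if *every* restart of the
    attack fails to change the prediction; one success breaks robustness."""
    for _ in range(n_restarts):
        x_adv = attack_fn(x, y, rng)
        if predict_fn(x_adv) != y:
            return False
    return True
```

More restarts can only lower the measured robust accuracy, which is why restarts expose obfuscated gradients.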
To get a better estimate of the robust accuracy in the presence of gradient obfuscation, we use a set of stronger attacks: a mixture of AutoAttack (AA) and MultiTargeted (MT) (Croce & Hein, 2020; Gowal et al., 2019). We execute the following sequence of attacks: AutoPGD on the cross-entropy loss with 5 restarts and 100 steps, AutoPGD on the difference-of-logits-ratio loss with 5 restarts and 100 steps, and MultiTargeted on the margin loss with 10 restarts and 200 steps. From Figure 2(c), we see that deterministic models have zero robust accuracy, while the performance of CEB models varies across models with different random seeds. This dependence on the seed could be the consequence of suboptimal network initialization and difficulties related to training IB models. Some part of the variance in the robust accuracy might still be attributed to having an imperfect attack due to the unreliable gradients.

Finally, in Figure 4 and in the appendix, we show typical loss landscapes produced by the CEB model that scored 15.8% accuracy under the AA+MT ensemble of attacks. These plots are strikingly different from the typical smooth, non-flat loss landscapes obtained from adversarially trained models (Qin et al., 2019). The flatness of the plotted landscapes explains why gradient-based attacks with the cross-entropy loss are not as effective. Moreover, since IB losses do not explicitly penalize misclassification for perturbed inputs within a certain norm ball, the model is free to choose where to place its decision boundaries. Figure 4 suggests that CEB models could be robust to much smaller perturbation radii.
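Evaluating against an ensemble like AA+MT amounts to taking the worst case over attacks on each point. A minimal sketch of the accounting (the attacks and predictor are placeholders):

```python
def robust_accuracy(points, labels, attacks, predict_fn):
    """Robust accuracy under an ensemble of attacks: a point survives only
    if every attack in the sequence fails, so the ensemble is as strong as
    its strongest member on each individual point."""
    survived = 0
    for x, y in zip(points, labels):
        if all(predict_fn(attack(x, y)) == y for attack in attacks):
            survived += 1
    return survived / len(points)
```

Adding attacks to the ensemble can only decrease the reported robust accuracy, never increase it.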
3.3 A toy problem
We established that gradient obfuscation makes it harder to understand the robustness properties of IB models on real datasets. Thus, analysing toy examples can be a useful alternative. A classification task from Tsipras et al. (2019) is one example that can motivate the use of IBs, where their ability to ignore irrelevant features becomes helpful. We study this problem in the appendix. Here, we consider another simple setup where labels y are sampled uniformly at random from {−1, +1}, and two features x₁ and x₂ are sampled from label-conditional distributions in which each feature's sign tends to agree with y, and x₁ lies, on average, further from zero than x₂.
In this example, the label can be predicted from the sign of x₁, so in the optimal IB case, we need to communicate only 1 bit of information about the input. The first feature is also more robust, since it requires a larger perturbation before its sign gets flipped. In practice, we found that a simple VIB classifier does not exclusively focus on x₁, and so it becomes prone to a rather trivial attack that subtracts from or adds to x₂ depending on the label, as shown in Figure 5. This could be the consequence of SGD training, the approximate nature of the objective function, VIB's formulation as a combination of competing objectives, or other reasons we do not yet understand.
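The trivial attack on this toy task can be written down directly. A minimal numpy sketch; the feature layout (x₁ first, x₂ second) and the perturbation size are illustrative:

```python
import numpy as np

def attack_weak_feature(x, y, eps):
    """Push the weakly informative feature x2 against the sign of the
    label y in {-1, +1}, leaving the robust feature x1 untouched."""
    x_adv = x.copy()
    x_adv[1] -= y * eps
    return x_adv
```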
4 Discussion
By re-evaluating the adversarial robustness of VIB and CEB models, we have shown that weak adversarial attacks are often unable to provide reliable robustness estimates, as these models create highly non-smooth loss surfaces, which are harder to explore with gradients. Therefore, we believe that previous, as well as future, results on the robustness of IB models should include basic checks for gradient obfuscation. This is especially important when comparing different types of models, e.g. IBs versus adversarial training.
Our experiments were inconclusive as to whether IB models offer adversarial robustness gains relative to the undefended deterministic baseline. For MNIST, the results under the FGS attack seemed promising. However, looking at the performance under the PGD attack with multiple restarts and different perturbation sizes showed a different picture. For CIFAR-10, some of the CEB models were significantly better than the baseline under the strongest attack. However, we did not identify the exact cause of the excessive variance in the results of models with different random seeds. Thus, it would be interesting to find regimes where CEB can reliably converge to more robust models.
In this paper, we only considered IB models in discriminative settings. A generative model related to VIB is the β-VAE (Higgins et al., 2017). For autoencoders, an adversarial attack amounts to finding inputs that cause the decoder to reconstruct a visually distinct image, e.g. an object from a different class. Camuto et al. (2021) showed that a β-VAE trained with larger values of β is more robust to adversarial attacks. However, Kuzina et al. (2021) used a different set of evaluation metrics to challenge this claim. Cemgil et al. (2020) attribute the lack of robustness of VAE models to the inability of their objective to control the behaviour of the encoder outside of the support of the empirical data distribution. Namely, without additionally forcing the encoder to be smooth, tuning β alone is not enough for learning robust representations. Together with our observations for VIB and CEB models, the disagreement about the VAE results corroborates the need for more nuanced evaluation before adversarial robustness claims can be made.

Overall, we believe that using IBs in the context of adversarial robustness is an idea that deserves further exploration. In this paper, we focused on the empirical evaluation of IB models under standard robustness metrics and on illustrating the caveats related to it. An interesting future research direction would be to understand the properties of IB models, especially in the stochastic regime, from both information-theoretic and adversarial robustness perspectives. Another promising direction would be to explore IBs with additional curvature regularization (Moosavi-Dezfooli et al., 2019; Qin et al., 2019) or in combination with adversarial training.
Acknowledgements
We would like to thank Taylan Cemgil, Lucas Theis, Hubert Soyer, Jonas Degrave, and the wonderful people from the robustness teams at DeepMind for their help with this project, interesting questions, valuable discussions, and feedback on the paper.
References
 Alemi et al. (2017) Alemi, A., Fischer, I., Dillon, J., and Murphy, K. Deep variational information bottleneck. In International Conference on Learning Representations, 2017.
 Athalye et al. (2018) Athalye, A., Carlini, N., and Wagner, D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In International Conference on Machine Learning, 2018.
 Biggio et al. (2013) Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., and Roli, F. Evasion attacks against machine learning at test time. In Machine Learning and Knowledge Discovery in Databases, 2013.
 Bradbury et al. (2018) Bradbury, J., Frostig, R., Hawkins, P., Johnson, M. J., Leary, C., Maclaurin, D., Necula, G., Paszke, A., VanderPlas, J., Wanderman-Milne, S., and Zhang, Q. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/google/jax.

 Camuto et al. (2021) Camuto, A., Willetts, M., Roberts, S., Holmes, C., and Rainforth, T. Towards a theoretical understanding of the robustness of variational autoencoders. In International Conference on Artificial Intelligence and Statistics, 2021.
 Carlini et al. (2019) Carlini, N., Athalye, A., Papernot, N., Brendel, W., Rauber, J., Tsipras, D., Goodfellow, I. J., Madry, A., and Kurakin, A. On evaluating adversarial robustness. ArXiv, abs/1902.06705, 2019.
 Cemgil et al. (2020) Cemgil, T., Ghaisas, S., Dvijotham, K. D., and Kohli, P. Adversarially robust representations with smooth encoders. In International Conference on Learning Representations, 2020.
 Croce & Hein (2020) Croce, F. and Hein, M. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In International Conference on Machine Learning, 2020.
 Fischer & Alemi (2020) Fischer, I. and Alemi, A. A. CEB improves model robustness. Entropy, 22(10), 2020.
 Glorot & Bengio (2010) Glorot, X. and Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics, 2010.
 Goodfellow et al. (2015) Goodfellow, I., Shlens, J., and Szegedy, C. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015.
 Gowal et al. (2019) Gowal, S., Uesato, J., Qin, C., Huang, P.S., Mann, T. A., and Kohli, P. An alternative surrogate loss for PGDbased adversarial testing. ArXiv, abs/1910.09338, 2019.

 He et al. (2016) He, K., Zhang, X., Ren, S., and Sun, J. Identity mappings in deep residual networks. In European Conference on Computer Vision, 2016.
 Higgins et al. (2017) Higgins, I., Matthey, L., Pal, A., Burgess, C., Glorot, X., Botvinick, M., Mohamed, S., and Lerchner, A. beta-VAE: Learning basic visual concepts with a constrained variational framework. In International Conference on Learning Representations, 2017.
 Kingma & Welling (2014) Kingma, D. P. and Welling, M. AutoEncoding Variational Bayes. In International Conference on Learning Representations, 2014.
 Kirsch et al. (2021) Kirsch, A., Lyle, C., and Gal, Y. Unpacking information bottlenecks: Unifying informationtheoretic objectives in deep learning. ArXiv, abs/2003.12537, 2021.
 Kuzina et al. (2021) Kuzina, A., Welling, M., and Tomczak, J. M. Diagnosing vulnerability of variational autoencoders to adversarial attacks. ArXiv, abs/2103.06701, 2021.

 Madry et al. (2018) Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
 Moosavi-Dezfooli et al. (2019) Moosavi-Dezfooli, S.-M., Fawzi, A., Uesato, J., and Frossard, P. Robustness via curvature regularization, and vice versa. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
 Polyak & Juditsky (1992) Polyak, B. and Juditsky, A. Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30:838–855, 1992.
 Qin et al. (2019) Qin, C., Martens, J., Gowal, S., Krishnan, D., Dvijotham, K., Fawzi, A., De, S., Stanforth, R., and Kohli, P. Adversarial robustness through local linearization. In Advances in Neural Information Processing Systems, 2019.
 Stutz et al. (2019) Stutz, D., Hein, M., and Schiele, B. Disentangling adversarial robustness and generalization. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
 Szegedy et al. (2014) Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. Intriguing properties of neural networks. In International Conference on Learning Representations, 2014.
 Tishby et al. (1999) Tishby, N., Pereira, F. C., and Bialek, W. The information bottleneck method. In 37th Annual Allerton Conference on Communication, Control and Computing, 1999.

 Tsipras et al. (2019) Tsipras, D., Santurkar, S., Engstrom, L., Turner, A., and Madry, A. Robustness may be at odds with accuracy. In International Conference on Learning Representations, 2019.
 Yang et al. (2020) Yang, Y.-Y., Rashtchian, C., Zhang, H., Salakhutdinov, R. R., and Chaudhuri, K. A closer look at accuracy vs. robustness. In Advances in Neural Information Processing Systems, 2020.
Appendix
Toy example
In Figure 6, we plot 1K samples from the data distribution outlined in Section 3.3. We train both a deterministic and a VIB model. For the deterministic model, we used a linear classifier whose weights and biases were initialized with zeros. For the VIB model, we used a small bottleneck, with the weights of the encoder initialized with the Xavier uniform scheme (Glorot & Bengio, 2010). The linear decoder's weights were initialized to zero. We use 12 samples from e(z|x) during training as well as for the stochastic evaluation mode. We optimize the parameters of both the linear deterministic model and the VIB model using SGD with momentum and Nesterov updates. We train with mini-batches resampled from the data distribution at each iteration, using the same random seed for both models. We evaluated clean and robust accuracy on a fixed set of 10K samples.
To this end, considering the discussion in Stutz et al. (2019) and Tsipras et al. (2019), we create a precomputed set of adversarial examples by sampling exclusively from the low-density regions, cf. Figure 6. This emulates adversarial examples that directly attack the feature with weak correlation, and is reasonable due to the low dimensionality of the input (only two features). It also means that we do not consider classical ε-constrained adversarial examples. This is because, even for small ε, such adversarial examples are not guaranteed to preserve the original label. This setup is also complementary to the toy example by Tsipras et al. (2019) discussed below.
Toy example from Tsipras et al. (2019)
For our second toy problem, we consider a binary classification task from Tsipras et al. (2019), where the label y is sampled from {−1, +1} uniformly at random, and the features x₁, …, x_{d+1} are distributed as:

(4)   x₁ = +y with probability p and −y with probability 1 − p;   x₂, …, x_{d+1} ~ N(η·y, 1)

We fix the parameters p, d, and η. An adversarial attack with ε = 2η can shift the Gaussian features towards the opposite class, so that each becomes distributed as N(−η·y, 1). Thus, it becomes easy to fool a classifier that relies on these features. Note, however, that this might also change the true label according to the data distribution. Nevertheless, one might expect that IB models are more robust in this case, since the compression cost forces them to focus on x₁ – the feature that is highly predictive of the label. Indeed, if we look at Figure 7, this seems to be the case, but oddly, only for stochastic evaluation. Note that Tsipras et al. (2019) constructed this problem to demonstrate that clean and robust accuracy can be at odds with each other, and this is what we also see in Figure 7. There are, however, doubts whether this toy example can reflect what happens in real-world scenarios (Yang et al., 2020; Stutz et al., 2019).
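The construction in Eq. 4 and the shift attack can be sketched as follows. This is a minimal numpy sketch; the specific p, d, and η values in the test below are illustrative, not the values used in our experiments.

```python
import numpy as np

def sample_tsipras(n, d, p, eta, rng):
    """Sample the binary task from Tsipras et al. (2019): x1 agrees with
    the label y with probability p; x2..x_{d+1} are N(eta * y, 1)."""
    y = rng.choice([-1, 1], size=n)
    x1 = np.where(rng.random(n) < p, y, -y)
    gauss = eta * y[:, None] + rng.standard_normal((n, d))
    return np.concatenate([x1[:, None], gauss], axis=1), y

def shift_attack(x, y, eta):
    """Shift every Gaussian feature by 2*eta against the label, so each
    becomes distributed as N(-eta * y, 1)."""
    x_adv = x.copy()
    x_adv[:, 1:] -= 2.0 * eta * y[:, None]
    return x_adv
```

After the attack, the Gaussian features are anti-correlated with the label, while x₁ is left untouched.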
On this toy example, we used the same setup as described above for our own toy problem. However, we adapted the VIB bottleneck size due to the increased dimensionality, and only perform 200 update steps.
Architectures and hyperparameters
For MNIST experiments, we based our JAX (Bradbury et al., 2018) implementation on the original VIB code: github.com/alexalemi/vib_demo. We used the following MLP architecture for the encoder: 1024 – ReLU – 1024 – ReLU – 2K, where K is the bottleneck size and the final layer outputs the K means and K standard deviations of e(z|x). The decoder consisted of a single dense layer with a softmax nonlinearity over 10 outputs. All weights were initialized using the default Xavier uniform scheme (Glorot & Bengio, 2010), and all biases were initialized to zero. We used the Adam optimizer and decayed the learning rate by a factor of 0.97 every 2 epochs. The batch size was set to 100, and we trained the networks for 200 epochs. Input images in the [0, 1] range were rescaled inside the network prior to passing them to the first dense layer. We used Polyak averaging (Polyak & Juditsky, 1992) with a constant decay of 0.999. Outputs from the bottleneck layer that correspond to the standard deviation of e(z|x) were passed through a softplus transformation to make them positive. Our deterministic baseline models had the same overall structure as the VIB models, i.e. 1024 – ReLU – 1024 – ReLU – K – 10 – softmax, with all training hyperparameters as above.

For CIFAR-10 experiments, the encoder network was a concatenation of a PreActivation-ResNet18 (He et al., 2016) with the same MLP as in our MNIST setup. The decoder and the backward encoder in CEB were again one-layer networks. We trained everything end-to-end for 1000 epochs. The batch size was set to 1024, and we used Adam with an initial learning rate of 0.012 and default parameters. The learning rate was multiplied by 0.3 every 250 epochs. For CEB, we annealed the compression hyperparameter from an initial value of 100 down to its target value during the first 4 epochs. Similarly, for VIB, we increased β from a small initial value to its target during the first 100 epochs. Prior to the first ResNet layer, input images in the [0, 1] range were normalized using per-channel means and standard deviations computed across the CIFAR-10 train set.
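For reference, one step of the Polyak averaging used for evaluation can be sketched as follows (a minimal sketch with a plain dict of parameters, not our JAX implementation):

```python
def polyak_update(avg_params, params, decay=0.999):
    """One step of Polyak averaging with a constant decay: maintain an
    exponential moving average of the weights, which is then used at
    evaluation time instead of the raw parameters."""
    return {k: decay * avg_params[k] + (1.0 - decay) * params[k]
            for k in params}
```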
Additional results on MNIST and CIFAR-10
Below, we provide additional figures for the experiments in Sections 3.1 and 3.2. For MNIST, Figure 8 shows the results of increasing the number of restarts when we use a PGD attack with different perturbation sizes. For CIFAR-10, Figure 9 plots the robust accuracy of VIB models under various attacks, and Figure 10 illustrates the cross-entropy loss surface of a CEB model on a couple of test images.