Saliency Diversified Deep Ensemble for Robustness to Adversaries

12/07/2021
by Alex Bogun, et al.

Deep learning models have shown remarkable performance on numerous image recognition, classification, and reconstruction tasks. Although appealing and valuable for their predictive capabilities, these models face a common threat that remains difficult to resolve: a specifically trained attacker can introduce malicious input perturbations to fool the network, causing potentially harmful mispredictions. Such attacks can succeed when the adversary has full access to the target model (the white-box setting) and even when access is limited (the black-box setting). An ensemble of models can protect against such attacks but may be brittle if its members share vulnerabilities (attack transferability). To that end, this work proposes a novel diversity-promoting learning approach for deep ensembles. The idea is to promote saliency map diversity (SMD) across ensemble members by introducing an additional term into the learning objective, preventing an attacker from targeting all members at once. During training, this term minimizes the alignment between member saliency maps, reducing shared vulnerabilities and thus increasing ensemble robustness to adversaries. We empirically show reduced transferability between ensemble members and improved performance over the state-of-the-art ensemble defense against medium- and high-strength white-box attacks. In addition, we demonstrate that our approach combined with existing methods outperforms state-of-the-art ensemble algorithms for defense under both white-box and black-box attacks.
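The shape of such an SMD objective can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch version, assuming the diversity term is the mean pairwise cosine alignment between input-gradient saliency maps of the members; the paper's exact saliency definition and penalty form may differ, and the trade-off weight `lam` is an assumed hyperparameter, not a value from the paper.

```python
import torch
import torch.nn.functional as F

def input_saliency(model, x, y):
    # Gradient of the batch loss w.r.t. the input, kept differentiable
    # (create_graph=True) so the diversity penalty can train the models.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    return grad.flatten(start_dim=1)  # shape: (batch, features)

def smd_loss(models, x, y, lam=0.5):  # lam: assumed trade-off weight
    # Average cross-entropy over ensemble members.
    ce = sum(F.cross_entropy(m(x), y) for m in models) / len(models)
    # Unit-normalized saliency map of each member.
    sals = [F.normalize(input_saliency(m, x, y), dim=1) for m in models]
    # Mean pairwise cosine alignment between member saliencies.
    align, pairs = 0.0, 0
    for i in range(len(sals)):
        for j in range(i + 1, len(sals)):
            align = align + (sals[i] * sals[j]).sum(dim=1).mean()
            pairs += 1
    # Penalizing alignment pushes members toward diverse saliency maps,
    # which is the mechanism the abstract describes.
    return ce + lam * align / pairs
```

The penalty could equally well act on the absolute or squared cosine similarity; the sketch only illustrates where a saliency-alignment term enters the ensemble training objective.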


