PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial Attacks via Pairwise Adversarially Robust Loss Function

by   Manaar Alam, et al.

The security of Deep Learning classifiers is a critical field of study because of the existence of adversarial attacks. Such attacks usually rely on the principle of transferability, where an adversarial example crafted on a surrogate classifier tends to mislead the target classifier trained on the same dataset even if both classifiers have quite different architecture. Ensemble methods against adversarial attacks demonstrate that an adversarial example is less likely to mislead multiple classifiers in an ensemble having diverse decision boundaries. However, recent ensemble methods have either been shown to be vulnerable to stronger adversaries or shown to lack an end-to-end evaluation. This paper attempts to develop a new ensemble methodology that constructs multiple diverse classifiers using a Pairwise Adversarially Robust Loss (PARL) function during the training procedure. PARL utilizes gradients of each layer with respect to input in every classifier within the ensemble simultaneously. The proposed training procedure enables PARL to achieve higher robustness against black-box transfer attacks compared to previous ensemble methods without adversely affecting the accuracy of clean examples. We also evaluate the robustness in the presence of white-box attacks, where adversarial examples are crafted using parameters of the target classifier. We present extensive experiments using standard image classification datasets like CIFAR-10 and CIFAR-100 trained using standard ResNet20 classifier against state-of-the-art adversarial attacks to demonstrate the robustness of the proposed ensemble methodology.


Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries

The security of deep learning (DL) systems is an extremely important fie...

Improving Adversarial Robustness of Ensembles with Diversity Training

Deep Neural Networks are vulnerable to adversarial attacks even in setti...

DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles

Recent research finds CNN models for image classification demonstrate ov...

Voting based ensemble improves robustness of defensive models

Developing robust models against adversarial perturbations has been an a...

Alternating Objectives Generates Stronger PGD-Based Adversarial Attacks

Designing powerful adversarial attacks is of paramount importance for th...

L2-Nonexpansive Neural Networks

This paper proposes a class of well-conditioned neural networks in which...

Robustness Verification for Classifier Ensembles

We give a formal verification procedure that decides whether a classifie...

Please sign up or login with your details

Forgot password? Click here to reset