Jacobian Regularization for Mitigating Universal Adversarial Perturbations

04/21/2021
by CK, et al.

Universal Adversarial Perturbations (UAPs) are input perturbations that can fool a neural network on large sets of data. As a class of attacks they pose a significant threat, because they enable realistic, practical, and low-cost attacks on neural networks. In this work, we derive upper bounds for the effectiveness of UAPs based on norms of data-dependent Jacobians. We empirically verify that Jacobian regularization increases model robustness to UAPs by up to four times whilst maintaining clean performance. Our theoretical analysis also allows us to formulate a metric for the strength of shared adversarial perturbations between pairs of inputs. We apply this metric to benchmark datasets and show that it is highly correlated with the actual observed robustness. These results suggest that realistic and practical universal attacks can be reliably mitigated without sacrificing clean accuracy, which is promising for the robustness of machine learning systems.
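To make the idea of Jacobian regularization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy softmax-linear classifier whose input Jacobian has a closed form, and it penalizes the mean squared Frobenius norm of the per-example Jacobians. The names `input_jacobian`, `jacobian_penalty`, `regularized_loss`, and the weight `lam` are hypothetical and chosen for illustration only. The last lines check the generic first-order intuition that the effect of one shared perturbation `delta` on each input is bounded by the Jacobian norm times the perturbation norm; the exact bounds derived in the paper may differ.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logits(W, b, x):
    # Works for a single input (features,) or a batch (n, features).
    return x @ W.T + b

def input_jacobian(W, b, x):
    # Jacobian of softmax(W x + b) with respect to a single input x,
    # shape (classes, features): (diag(p) - p p^T) W.
    p = softmax(logits(W, b, x))
    return (np.diag(p) - np.outer(p, p)) @ W

def jacobian_penalty(W, b, X):
    # Mean squared Frobenius norm of the per-example input Jacobians.
    return float(np.mean([np.sum(input_jacobian(W, b, x) ** 2) for x in X]))

def regularized_loss(W, b, X, y, lam):
    # Cross-entropy plus lam times the Jacobian penalty (lam is an assumed weight).
    p = softmax(logits(W, b, X))
    ce = -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))
    return float(ce + lam * jacobian_penalty(W, b, X))

# Toy data: 3 classes, 5 features, 8 examples (all values are illustrative).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
b = np.zeros(3)
X = rng.normal(size=(8, 5))
y = rng.integers(0, 3, size=8)

# One perturbation shared by every input, as in a universal attack.
delta = 1e-3 * rng.normal(size=5)

# First-order effect of delta on each input, and its norm-based upper bound.
linear_effect = np.array([np.linalg.norm(input_jacobian(W, b, x) @ delta) for x in X])
upper_bound = np.array(
    [np.linalg.norm(input_jacobian(W, b, x)) * np.linalg.norm(delta) for x in X]
)

print("loss without penalty:", regularized_loss(W, b, X, y, lam=0.0))
print("loss with penalty:   ", regularized_loss(W, b, X, y, lam=0.1))
print("bound holds:", bool(np.all(linear_effect <= upper_bound + 1e-12)))
```

In this sketch a smaller penalty means a smaller `upper_bound` for every input at once, which is the sense in which shrinking Jacobian norms limits what any single shared perturbation can do. For a deep network the Jacobian would typically come from automatic differentiation rather than a closed form.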


research
05/16/2021

Real-time Detection of Practical Universal Adversarial Perturbations

Universal Adversarial Perturbations (UAPs) are a prominent class of adve...
research
08/07/2019

Robust Learning with Jacobian Regularization

Design of reliable systems must guarantee stability against input pertur...
research
09/24/2021

Local Intrinsic Dimensionality Signals Adversarial Perturbations

The vulnerability of machine learning models to adversarial perturbation...
research
04/19/2017

Universal Adversarial Perturbations Against Semantic Image Segmentation

While deep learning is remarkably successful on perceptual tasks, it was...
research
02/03/2023

Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels

Machine learning models are vulnerable to adversarial perturbations, and...
research
09/20/2018

Playing the Game of Universal Adversarial Perturbations

We study the problem of learning classifiers robust to universal adversa...
research
07/26/2018

A general metric for identifying adversarial images

It is well known that a determined adversary can fool a neural network b...
