A general metric for identifying adversarial images

07/26/2018
by   Siddharth Krishna Kumar, et al.
0

It is well known that a determined adversary can fool a neural network by making imperceptible adversarial perturbations to an image. Recent studies have shown that these perturbations can be detected even without information about the neural network if the strategy taken by the adversary is known beforehand. Unfortunately, these studies suffer from the generalization limitation -- the detection method has to be recalibrated every time the adversary changes his strategy. In this study, we attempt to overcome the generalization limitation by deriving a metric which reliably identifies adversarial images even when the approach taken by the adversary is unknown. Our metric leverages key differences between the spectra of clean and adversarial images when an image is treated as a matrix. Our metric is able to detect adversarial images across different datasets and attack strategies without any additional re-calibration. In addition, our approach provides geometric insights into several unanswered questions about adversarial perturbations.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/14/2017

On Detecting Adversarial Perturbations

Machine learning and deep learning in particular has advanced tremendous...
research
04/12/2019

Generating Minimal Adversarial Perturbations with Integrated Adaptive Gradients

We focus our attention on the problem of generating adversarial perturba...
research
03/04/2020

Metrics and methods for robustness evaluation of neural networks with generative models

Recent studies have shown that modern deep neural network classifiers ar...
research
08/05/2019

A principled approach for generating adversarial images under non-smooth dissimilarity metrics

Deep neural networks are vulnerable to adversarial perturbations: small ...
research
04/21/2021

Jacobian Regularization for Mitigating Universal Adversarial Perturbations

Universal Adversarial Perturbations (UAPs) are input perturbations that ...
research
10/16/2019

A New Defense Against Adversarial Images: Turning a Weakness into a Strength

Natural images are virtually surrounded by low-density misclassified reg...
research
03/16/2020

Adversarial Perturbations of Opinion Dynamics in Networks

We study the connections between network structure, opinion dynamics, an...

Please sign up or login with your details

Forgot password? Click here to reset