Early Methods for Detecting Adversarial Images

08/01/2016
by Dan Hendrycks, et al.

Many machine learning classifiers are vulnerable to adversarial perturbations. An adversarial perturbation modifies an input to change a classifier's prediction without causing the input to seem substantially different to human perception. We deploy three methods to detect adversarial images. Adversaries trying to bypass our detectors must make the adversarial image less pathological or they will fail trying. Our best detection method reveals that adversarial images place abnormal emphasis on the lower-ranked principal components from PCA. Other detectors and a colorful saliency map are in an appendix.
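
The PCA-based detection idea can be illustrated with a short sketch (not the authors' exact implementation): fit PCA on clean training images, then score a test image by the energy of its coefficients on the lowest-ranked principal components. The function names, the tail fraction, and the percentile threshold below are illustrative assumptions.

import numpy as np
from sklearn.decomposition import PCA

def fit_pca(clean_images):
    # Fit PCA on flattened clean training images of shape (N, H, W, C).
    X = clean_images.reshape(len(clean_images), -1).astype(np.float64)
    pca = PCA()
    pca.fit(X)
    return pca

def tail_energy(pca, image, tail_fraction=0.5):
    # Mean squared PCA coefficient over the lowest-ranked components.
    # Clean images concentrate variance in the leading components, so an
    # unusually large tail energy suggests an adversarial perturbation.
    x = image.reshape(1, -1).astype(np.float64)
    coeffs = pca.transform(x)[0]
    tail_start = int(len(coeffs) * (1.0 - tail_fraction))
    return float(np.mean(coeffs[tail_start:] ** 2))

# Hypothetical usage: calibrate a threshold on held-out clean images
# and flag test images whose tail energy exceeds it.
# pca = fit_pca(train_images)
# clean_scores = [tail_energy(pca, im) for im in val_images]
# threshold = np.percentile(clean_scores, 99)
# is_adversarial = tail_energy(pca, test_image) > threshold

In this sketch, an image is flagged as adversarial when its tail energy exceeds a threshold calibrated on held-out clean images, mirroring the observation that adversarial images place abnormal emphasis on the lower-ranked principal components.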


