
Early Methods for Detecting Adversarial Images

by Dan Hendrycks et al.
Toyota Technological Institute at Chicago

Many machine learning classifiers are vulnerable to adversarial perturbations. An adversarial perturbation modifies an input to change a classifier's prediction without causing the input to seem substantially different to human perception. We deploy three methods to detect adversarial images. Adversaries trying to bypass our detectors must make the adversarial image less pathological or they will fail trying. Our best detection method reveals that adversarial images place abnormal emphasis on the lower-ranked principal components from PCA. Other detectors and a colorful saliency map are in an appendix.
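The PCA-based detector described above can be sketched as follows. This is a minimal illustration of the idea, not the paper's exact procedure: the `tail_frac` split and the use of squared-coefficient energy as the score are assumptions chosen for clarity, and `fit_pca` / `tail_energy` are hypothetical helper names. The premise is that clean natural images concentrate their variance in the leading principal components, so an image whose coefficients place unusually high energy on the low-ranked components is flagged as suspicious.

```python
import numpy as np

def fit_pca(clean_images):
    """Fit PCA on flattened clean training images.
    Returns the data mean and the principal axes (rows of Vt,
    sorted by decreasing explained variance)."""
    X = clean_images.reshape(len(clean_images), -1).astype(np.float64)
    mean = X.mean(axis=0)
    # SVD of the centered data matrix yields the principal axes in Vt.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt

def tail_energy(images, mean, Vt, tail_frac=0.25):
    """Score each image by the fraction of its variance carried by the
    lowest-ranked principal components (the trailing `tail_frac` of
    directions). Adversarially perturbed images tend to score higher."""
    X = images.reshape(len(images), -1).astype(np.float64) - mean
    coeffs = X @ Vt.T                       # per-image PCA coefficients
    k = int(Vt.shape[0] * (1 - tail_frac))  # start of the low-variance tail
    total = (coeffs ** 2).sum(axis=1) + 1e-12
    return (coeffs[:, k:] ** 2).sum(axis=1) / total
```

In practice one would set a detection threshold from clean held-out data (for example, the 99th percentile of clean tail-energy scores) and flag inputs above it; that thresholding step is omitted here for brevity.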


Code Repositories


Code for the Adversarial Image Detectors and a Saliency Map
