Benford's law: what does it say on adversarial images?

02/09/2021
by João G. Zago, et al.

Convolutional neural networks (CNNs) are fragile to small perturbations in their input images, which makes them prone to malicious attacks that perturb inputs to force a misclassification. Such slightly manipulated images, crafted to deceive the classifier, are known as adversarial images. In this work, we investigate statistical differences between natural images and adversarial ones. More precisely, we show that, under a suitable image transformation and for a class of adversarial attacks, the distribution of the leading digit of the pixel values in adversarial images deviates from Benford's law. The stronger the attack, the further the resulting distribution departs from Benford's law. Our analysis provides a detailed investigation of this new approach, which can serve as the basis for alternative adversarial-example detection methods that neither modify the original CNN classifier nor operate on the raw high-dimensional pixels as features to defend against attacks.
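The idea above can be sketched in a few lines of NumPy: extract the leading (most significant) digit of each transformed pixel value, build its empirical distribution, and measure how far it sits from the Benford reference P(d) = log10(1 + 1/d). Note that the specific image transformation (gradient magnitude here) and the divergence measure (a KL-style score) are illustrative assumptions, not necessarily the ones used in the paper:

```python
import numpy as np


def benford_expected():
    """Benford's law: P(d) = log10(1 + 1/d) for d = 1..9."""
    d = np.arange(1, 10)
    return np.log10(1 + 1 / d)


def leading_digits(values):
    """Most significant decimal digit of each nonzero value."""
    v = np.abs(np.asarray(values, dtype=float).ravel())
    v = v[v > 0]  # the leading digit is undefined for zeros
    exp = np.floor(np.log10(v))
    return (v / 10.0 ** exp).astype(int)


def benford_divergence(image):
    """KL-style deviation of an image's leading-digit distribution
    from Benford's law. The gradient-magnitude transform is a
    hypothetical stand-in for the paper's image transformation."""
    gx, gy = np.gradient(np.asarray(image, dtype=float))
    mag = np.hypot(gx, gy)
    digits = leading_digits(mag)
    counts = np.bincount(digits, minlength=10)[1:10].astype(float)
    observed = counts / counts.sum()
    expected = benford_expected()
    eps = 1e-12  # avoid log(0) for digits that never occur
    return float(np.sum(observed * np.log((observed + eps) / (expected + eps))))
```

Under the detection scheme the abstract describes, a larger score would flag a more strongly attacked image; a threshold on this scalar could then serve as a detector without touching the CNN itself.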


