Analysis of Dominant Classes in Universal Adversarial Perturbations

12/28/2020
by Jon Vadillo, et al.

The reasons why Deep Neural Networks are susceptible to being fooled by adversarial examples remain an open question. Indeed, many different strategies can be employed to efficiently generate adversarial attacks, some of them relying on different theoretical justifications. Among these strategies, universal (input-agnostic) perturbations are of particular interest, due to their capability to fool a network independently of the input to which the perturbation is applied. In this work, we investigate an intriguing phenomenon of universal perturbations, which has been reported previously in the literature, yet without a proven justification: universal perturbations change the predicted classes of most inputs into one particular (dominant) class, even if this behavior is not specified during the creation of the perturbation. To justify the cause of this phenomenon, we propose a number of hypotheses and test them experimentally, using a speech command classification problem in the audio domain as a testbed. Our analyses reveal interesting properties of universal perturbations, suggest new methods to generate such attacks, and provide an explanation of dominant classes from both a geometric and a data-feature perspective.
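As a concrete illustration of the dominant-class effect, the sketch below applies a single perturbation to an entire batch of inputs and measures how concentrated the resulting predictions are around the most frequent class. This is a minimal, hypothetical example: the classifier is a toy random linear model and v is a random vector standing in for an actual universal perturbation (which would be computed by a dedicated attack algorithm, not shown here); only the measurement logic reflects the phenomenon described above.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy stand-in for a trained classifier: a fixed random linear model
    # over 1-second, 16 kHz "audio" inputs, with 10 output classes.
    W = rng.normal(size=(10, 16000))
    predict = lambda x: np.argmax(x @ W.T, axis=1)

    x = rng.normal(size=(256, 16000))    # a batch of clean inputs
    v = 0.5 * rng.normal(size=(16000,))  # placeholder "universal" perturbation

    def dominant_class_rate(predict, inputs, v):
        """Return the most frequent predicted class after adding the same
        perturbation v to every input, and the fraction of inputs it covers."""
        preds = predict(inputs + v)      # v broadcasts across the whole batch
        classes, counts = np.unique(preds, return_counts=True)
        return classes[counts.argmax()], counts.max() / len(preds)

    dom, rate = dominant_class_rate(predict, x, v)
    print(f"dominant class: {dom}, fraction of inputs mapped to it: {rate:.2f}")

With a genuine universal perturbation in place of the random v, the reported fraction would be close to 1 for a dominant class, which is the behavior the paper sets out to explain.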


Related research

04/26/2020 · Enabling Fast and Universal Audio Adversarial Attack Using Generative Model
Recently, the vulnerability of DNN-based audio systems to adversarial at...

09/27/2022 · FG-UAP: Feature-Gathering Universal Adversarial Perturbation
Deep Neural Networks (DNNs) are susceptible to elaborately designed pert...

01/23/2020 · On the human evaluation of audio adversarial examples
Human-machine interaction is increasingly dependent on speech communicat...

06/19/2022 · A Universal Adversarial Policy for Text Classifiers
Discovering the existence of universal adversarial perturbations had lar...

11/22/2019 · Universal adversarial examples in speech command classification
Adversarial examples are inputs intentionally perturbed with the aim of ...

07/11/2022 · Physical Passive Patch Adversarial Attacks on Visual Odometry Systems
Deep neural networks are known to be susceptible to adversarial perturba...

05/31/2021 · Dominant Patterns: Critical Features Hidden in Deep Neural Networks
In this paper, we find the existence of critical features hidden in Deep...
