Extending Adversarial Attacks to Produce Adversarial Class Probability Distributions

04/14/2020
by   Jon Vadillo, et al.

Despite the remarkable performance and generalization levels of deep learning models in a wide range of artificial intelligence tasks, it has been demonstrated that these models can be easily fooled by the addition of imperceptible but malicious perturbations to natural inputs. These altered inputs are known in the literature as adversarial examples. In this paper we propose a novel probabilistic framework to generalize and extend adversarial attacks so that they produce a desired probability distribution over the classes when the attack is applied to a large number of inputs. This attack strategy gives the attacker greater control over the target model and makes it harder to detect that the model is under attack. We introduce three different strategies to efficiently generate such attacks, and illustrate our approach by extending DeepFool, a state-of-the-art algorithm for generating adversarial examples. We also experimentally validate our approach on the spoken command classification task, an exemplary machine learning problem in the audio domain. Our results demonstrate that we can closely approximate any probability distribution over the classes while maintaining a high fooling rate and injecting only imperceptible perturbations into the inputs.
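The abstract does not detail the three strategies, but the overall idea can be sketched as follows: draw a target class for each input from the desired class distribution and then craft a targeted adversarial example for that class, so that the empirical distribution of the model's predictions over many inputs approximates the desired one. The snippet below is a minimal illustrative sketch under that assumption; the function names (`attack_to_target_distribution`, `targeted_attack`) are hypothetical placeholders, not the paper's API.

```python
import numpy as np


def attack_to_target_distribution(inputs, target_dist, targeted_attack, rng=None):
    """Illustrative sketch (not the paper's method): approximate a desired
    class distribution by sampling one target class per input and running
    any targeted attack (e.g. a targeted DeepFool-style routine).

    inputs          : list/array of natural inputs.
    target_dist     : array of length n_classes, summing to 1, giving the
                      desired distribution of predicted classes.
    targeted_attack : callable (x, target_class) -> x_adv; placeholder for
                      whatever targeted attack is available.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_classes = len(target_dist)
    # Over many inputs, the empirical distribution of sampled targets
    # approximates target_dist by the law of large numbers.
    targets = rng.choice(n_classes, size=len(inputs), p=target_dist)
    return [targeted_attack(x, int(t)) for x, t in zip(inputs, targets)]
```

In this sketch the approximation quality depends on the number of inputs and on the success rate of the underlying targeted attack; the paper's strategies presumably address how to assign targets efficiently while keeping perturbations imperceptible.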


