Generating Natural Adversarial Examples

10/31/2017
by Zhengli Zhao, et al.

Because machine learning models are complex, it is hard to characterize the ways in which they can misbehave or be exploited when deployed. Recent work on adversarial examples, i.e., inputs with minor perturbations that result in substantially different model predictions, helps evaluate the robustness of these models by exposing the adversarial scenarios in which they fail. However, these malicious perturbations are often unnatural, not semantically meaningful, and not applicable to complicated domains such as language. In this paper, we propose a framework to generate natural and legible adversarial examples by searching in the semantic space of dense and continuous data representations, utilizing recent advances in generative adversarial networks. We present generated adversaries to demonstrate the potential of the proposed approach for black-box classifiers in a wide range of applications, such as image classification, textual entailment, and machine translation. We include experiments to show that the generated adversaries are natural, legible to humans, and useful in evaluating and analyzing black-box classifiers.
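To make the idea of "searching in semantic space" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the search procedure, so a simple random search over growing radii in latent space is assumed here. The names `generator`, `inverter`, `black_box`, and `find_natural_adversary` are hypothetical, and the toy linear maps stand in for a trained GAN generator and a learned inverter.

```python
import math
import random


def generator(z):
    """Toy stand-in for a trained GAN generator: latent z -> data x (assumption)."""
    return [2.0 * z[0] + z[1], z[1] - z[0]]


def inverter(x):
    """Toy stand-in for a learned inverter: data x -> latent z.

    Here it is the exact inverse of the toy generator; a real inverter
    would be trained and only approximate.
    """
    z0 = (x[0] - x[1]) / 3.0
    return [z0, x[1] + z0]


def black_box(x):
    """Toy black-box classifier: only the predicted label is observable."""
    return 1 if x[0] + x[1] > 0.0 else 0


def find_natural_adversary(x, classify, gen, inv,
                           step=0.1, samples=200, max_rounds=50, seed=0):
    """Search latent space around inv(x) for a point whose generated
    sample gets a different label from x.

    The search radius grows by `step` each round. Within the first round
    that yields any label flip, the flip with the smallest latent
    distance is returned as (radius, z_adv, x_adv). Returns None if no
    flip is found within `max_rounds`.
    """
    rng = random.Random(seed)
    z_start = inv(x)
    original_label = classify(x)
    for round_index in range(max_rounds):
        lower = round_index * step
        upper = (round_index + 1) * step
        best = None
        for _ in range(samples):
            # Sample a random direction and a radius in the current shell.
            direction = [rng.gauss(0.0, 1.0) for _ in z_start]
            norm = math.sqrt(sum(v * v for v in direction)) or 1.0
            radius = rng.uniform(lower, upper)
            z_new = [a + radius * v / norm for a, v in zip(z_start, direction)]
            x_new = gen(z_new)
            # Query the classifier only through its predicted label.
            if classify(x_new) != original_label:
                if best is None or radius < best[0]:
                    best = (radius, z_new, x_new)
        if best is not None:
            return best
    return None


if __name__ == "__main__":
    found = find_natural_adversary([1.0, 0.5], black_box, generator, inverter)
    print(found)
```

Because every candidate is produced by the generator, it stays on the generator's output manifold, which is the intuition behind calling such adversaries "natural". The classifier is only ever queried for labels, which matches the black-box setting.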


