Quantifying and Understanding Adversarial Examples in Discrete Input Spaces

12/12/2021
by Volodymyr Kuleshov, et al.

Modern classification algorithms are susceptible to adversarial examples: perturbations to inputs that cause the algorithm to produce undesirable behavior. In this work, we seek to understand adversarial examples and extend them to domains in which inputs are discrete, particularly new domains such as computational biology. As a step towards this goal, we formalize a notion of synonymous adversarial examples that applies in any discrete setting and describe a simple domain-agnostic algorithm to construct such examples. We apply this algorithm across multiple domains, including sentiment analysis and DNA sequence classification, and find that it consistently uncovers adversarial examples. We seek to understand their prevalence theoretically, and we attribute their existence to spurious token correlations, a statistical phenomenon that is specific to discrete spaces. Our work is a step towards a domain-agnostic treatment of discrete adversarial examples analogous to that of continuous inputs.
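The abstract does not spell out the construction algorithm, so the following is only a minimal sketch of one plausible reading: a greedy search that swaps tokens for listed synonyms until a classifier's score drops below a decision threshold. The function `greedy_synonym_attack`, the toy scorer `toy_score`, the weight table, and the synonym table are all hypothetical names and values introduced here for illustration; they are not taken from the paper.

```python
import math
from typing import Callable, Dict, List, Sequence


def greedy_synonym_attack(
    tokens: Sequence[str],
    synonyms: Dict[str, List[str]],
    score: Callable[[Sequence[str]], float],
    max_swaps: int = 2,
    threshold: float = 0.5,
) -> List[str]:
    """Greedily replace tokens with listed synonyms to lower `score`.

    Hypothetical sketch, not the paper's algorithm: each round applies the
    single synonym swap that lowers the score the most, and stops once the
    score is below `threshold` or no swap helps.
    """
    current = list(tokens)
    changed = set()  # positions already swapped; each is edited at most once
    for _ in range(max_swaps):
        best_score = score(current)
        if best_score < threshold:
            break  # the classifier's decision has already flipped
        best_edit = None
        for i, tok in enumerate(current):
            if i in changed:
                continue
            for alt in synonyms.get(tok, []):
                candidate = current[:i] + [alt] + current[i + 1:]
                s = score(candidate)
                if s < best_score:
                    best_score, best_edit = s, (i, alt)
        if best_edit is None:
            break  # no remaining swap lowers the score
        i, alt = best_edit
        current[i] = alt
        changed.add(i)
    return current


# Toy bag-of-words sentiment scorer (assumed weights). "film" and "movie"
# mean the same thing but carry different weights, which mimics a spurious
# token correlation learned from data.
WEIGHTS = {"great": 1.0, "good": 0.2, "film": 0.0, "movie": -0.3}


def toy_score(tokens: Sequence[str]) -> float:
    """Return a positive-class probability from summed token weights."""
    total = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-total))


SYNONYMS = {"great": ["good"], "film": ["movie"]}  # assumed synonym table

original = ["a", "great", "film"]
adversarial = greedy_synonym_attack(original, SYNONYMS, toy_score)
print(original, toy_score(original))
print(adversarial, toy_score(adversarial))
```

Because the search only needs a scoring function and a synonym table, the same loop could in principle run over other discrete alphabets, such as nucleotides, given a suitable notion of "synonym" for that domain.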


