Quantifying Robustness to Adversarial Word Substitutions

01/11/2022
by Yuting Yang, et al.

Deep-learning-based NLP models have been found vulnerable to word substitution perturbations, and these fundamental robustness issues need to be addressed before such models are widely adopted. Along this line, we propose a formal framework to evaluate word-level robustness. First, to study a model's safe regions, we introduce the robustness radius, the boundary within which the model can resist any word substitution. Since computing the maximum robustness radius is computationally hard, we estimate its upper and lower bounds. We repurpose attack methods to obtain an upper bound and design a pseudo-dynamic-programming algorithm for a tighter one; a verification method is then used to derive a lower bound. Further, to evaluate robustness in regions outside the safe radius, we reexamine robustness from another view: quantification. We introduce a robustness metric with a rigorous statistical guarantee that quantifies the proportion of adversarial examples, indicating the model's susceptibility to perturbations beyond the safe radius. This metric helps explain why state-of-the-art models like BERT can be fooled by a few word substitutions yet generalize well in the presence of real-world noise.
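As a rough illustration of the quantification idea (a minimal sketch, not the authors' algorithm), the proportion of adversarial examples in a word-substitution neighborhood can be estimated by Monte Carlo sampling with a Hoeffding-style (epsilon, delta) guarantee on the estimate. The `toy_model`, the `substitutions` dictionary, and the function names below are hypothetical placeholders standing in for a real classifier such as BERT and its synonym sets.

```python
import math
import random

def estimate_adversarial_proportion(model_predict, sentence, substitutions,
                                    original_label, epsilon=0.05, delta=0.01,
                                    seed=0):
    """Monte Carlo estimate of the fraction of perturbed inputs that flip the
    model's prediction, with a Hoeffding-style guarantee: choosing
    n >= ln(2/delta) / (2 * epsilon^2) samples makes the estimate lie within
    epsilon of the true proportion with probability at least 1 - delta.

    model_predict: callable mapping a list of tokens to a predicted label (assumed interface).
    substitutions: dict mapping a token position to its allowed replacement words (assumed format).
    """
    rng = random.Random(seed)
    n = math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))
    flips = 0
    for _ in range(n):
        perturbed = list(sentence)
        for pos, candidates in substitutions.items():
            # Each perturbable position gets a uniformly sampled candidate word.
            perturbed[pos] = rng.choice(candidates)
        if model_predict(perturbed) != original_label:
            flips += 1
    return flips / n

# Hypothetical usage with a toy classifier standing in for a real NLP model.
if __name__ == "__main__":
    def toy_model(tokens):
        return "positive" if "good" in tokens else "negative"

    sentence = ["the", "movie", "was", "good"]
    substitutions = {3: ["good", "great", "fine", "bad"]}
    rate = estimate_adversarial_proportion(toy_model, sentence, substitutions,
                                           original_label="positive")
    print(f"estimated adversarial proportion: {rate:.3f}")
```

The sample size here depends only on epsilon and delta, not on the (possibly enormous) number of substitution combinations, which is what makes a statistically guaranteed estimate tractable beyond the safe radius.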


Related research

10/01/2020  Assessing Robustness of Text Classification through Maximal Safe Radius Computation
            Neural network NLP models are vulnerable to small modifications of the i...

04/21/2018  Gradient Masking Causes CLEVER to Overestimate Adversarial Perturbation Size
            A key problem in research on adversarial examples is that vulnerability ...

05/20/2022  Getting a-Round Guarantees: Floating-Point Attacks on Certified Robustness
            Adversarial examples pose a security risk as they can alter a classifier...

07/10/2018  A Game-Based Approximate Verification of Deep Neural Networks with Provable Guarantees
            Despite the improved accuracy of deep neural networks, the discovery of ...

10/29/2021  ε-weakened Robustness of Deep Neural Networks
            This paper introduces a notation of ε-weakened robustness for analyzing ...

06/05/2023  Evaluating robustness of support vector machines with the Lagrangian dual approach
            Adversarial examples bring a considerable security threat to support vec...

12/05/2018  Are you tough enough? Framework for Robustness Validation of Machine Comprehension Systems
            Deep Learning NLP domain lacks procedures for the analysis of model robu...
