SIENA: Stochastic Multi-Expert Neural Patcher

11/17/2020
by Thai Le et al.

Neural network (NN) models trained solely to maximize the likelihood of an observed dataset are often vulnerable to adversarial attacks. Although several methods have been proposed to enhance the adversarial robustness of NN models, they typically require re-training from scratch. This leads to redundant computation, especially in the NLP domain, where current state-of-the-art models such as BERT and RoBERTa demand significant time and space resources. Borrowing ideas from Software Engineering, we therefore first introduce the Neural Patching mechanism, which improves adversarial robustness by "patching" only parts of an NN model. We then propose a novel neural patching algorithm, SIENA, that transforms a textual NN model into a stochastic ensemble of multi-expert predictors by upgrading and re-training only its last layer. SIENA forces adversaries to attack not one but multiple models, each specialized in a diverse subset of features, labels, and instances, so that the ensemble becomes more robust to adversarial attacks. Through comprehensive experiments, we demonstrate that CNN-, RNN-, BERT-, and RoBERTa-based textual models, once patched by SIENA, achieve an absolute increase of as much as 20% in accuracy under black-box attacks, outperforming 6 defensive baselines across 4 public NLP datasets.
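To make the patching idea concrete, here is a minimal PyTorch sketch of the general mechanism the abstract describes, not the authors' implementation: the class name StochasticMultiExpertHead, the expert count, and the uniform random sampling of experts at inference are all assumptions. A frozen backbone keeps its weights, its final layer is replaced by several expert heads, and each query is answered by a randomly drawn subset of experts, so an attacker never faces the same model twice. SIENA's actual training objective, which specializes the experts on diverse subsets of features, labels, and instances, is omitted here.

```python
# Hypothetical sketch of "neural patching" via a stochastic multi-expert
# head; names and hyperparameters are illustrative, not from the paper.
import torch
import torch.nn as nn

class StochasticMultiExpertHead(nn.Module):
    """Replaces a model's final layer with several expert heads.
    At inference, a random subset of experts is sampled and their
    logits averaged, so each query faces a different ensemble."""
    def __init__(self, hidden_dim: int, num_classes: int, num_experts: int = 5):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(hidden_dim, num_classes) for _ in range(num_experts)
        )

    def forward(self, features: torch.Tensor, k: int = 2) -> torch.Tensor:
        # Sample k expert indices uniformly at random (stochastic ensemble).
        idx = torch.randperm(len(self.experts))[:k].tolist()
        logits = torch.stack([self.experts[i](features) for i in idx])
        return logits.mean(dim=0)

# Usage: freeze a pretrained encoder and train only the patched head.
encoder = nn.Sequential(nn.Linear(768, 768), nn.ReLU())  # stand-in backbone
for p in encoder.parameters():
    p.requires_grad = False  # only the new last layer is re-trained

head = StochasticMultiExpertHead(hidden_dim=768, num_classes=2)
features = encoder(torch.randn(4, 768))  # e.g. sentence embeddings
print(head(features).shape)  # torch.Size([4, 2])
```

Because only the head's parameters receive gradients, the "patch" avoids re-training the expensive backbone, which is the computational saving the abstract emphasizes.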

