Combating Adversarial Misspellings with Robust Word Recognition

05/27/2019
by   Danish Pruthi, et al.

To combat adversarial spelling mistakes, we propose placing a word recognition model in front of the downstream classifier. Our word recognition models build upon the RNN semi-character architecture, introducing several new backoff strategies for handling rare and unseen words. Trained to recognize words corrupted by random adds, drops, swaps, and keyboard mistakes, our method achieves 32% relative (and 3.3% absolute) error reduction over the vanilla semi-character model. Notably, our pipeline confers robustness on the downstream classifier, outperforming both adversarial training and off-the-shelf spell checkers. Against a BERT model fine-tuned for sentiment analysis, a single adversarially-chosen character attack lowers accuracy from 90.3% to 45.8%. Our defense restores accuracy to 75%. Surprisingly, better word recognition does not always entail greater robustness. Our analysis reveals that robustness also depends upon a quantity that we denote the sensitivity.
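The four corruption types named in the abstract (adds, drops, swaps, and keyboard mistakes) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `perturb` and `semi_character_key`, the partial `KEYBOARD_NEIGHBORS` map, and the choice to edit only internal characters of words with at least four letters are assumptions made for illustration. `semi_character_key` is a simplified stand-in for a semi-character input (first character, unordered internal characters, last character).

```python
import random
import string

# Hypothetical, partial QWERTY adjacency map, used only for illustration.
KEYBOARD_NEIGHBORS = {
    "a": "qwsz", "s": "awedxz", "d": "serfcx", "e": "wsdr", "o": "iklp",
    "i": "ujko", "n": "bhjm", "t": "rfgy", "r": "edft", "l": "kop",
}


def perturb(word, kind, rng):
    """Apply one character-level edit of the given kind to `word`.

    Assumption of this sketch: only internal characters are edited, and
    words shorter than four characters are returned unchanged.
    """
    if len(word) < 4:
        return word
    chars = list(word)
    if kind == "swap":
        # Swap two adjacent internal characters.
        i = rng.randrange(1, len(chars) - 2)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    elif kind == "drop":
        # Delete one internal character.
        i = rng.randrange(1, len(chars) - 1)
        del chars[i]
    elif kind == "add":
        # Insert a random lowercase letter between the first and last character.
        i = rng.randrange(1, len(chars))
        chars.insert(i, rng.choice(string.ascii_lowercase))
    elif kind == "key":
        # Replace one internal character with a neighboring key, if the
        # illustrative map has an entry for it.
        i = rng.randrange(1, len(chars) - 1)
        neighbors = KEYBOARD_NEIGHBORS.get(chars[i], "")
        if neighbors:
            chars[i] = rng.choice(neighbors)
    else:
        raise ValueError(f"unknown perturbation kind: {kind}")
    return "".join(chars)


def semi_character_key(word):
    """Return (first char, sorted internal chars, last char) for `word`."""
    if not word:
        return ("", "", "")
    return (word[0], "".join(sorted(word[1:-1])), word[-1])


if __name__ == "__main__":
    rng = random.Random(0)
    for kind in ("swap", "drop", "add", "key"):
        print(kind, perturb("recognition", kind, rng))
```

Because the internal characters are treated as an unordered bag, an internal swap leaves `semi_character_key` unchanged, whereas adds, drops, and keyboard substitutions alter it.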

