Experiments with adversarial attacks on text genres

07/05/2021
by Mikhail Lepekhin et al.

Neural models based on pre-trained transformers, such as BERT or XLM-RoBERTa, demonstrate SOTA results in many NLP tasks, including non-topical classification, such as genre identification. However, these approaches often exhibit low robustness to minor alterations of the test texts. A related problem concerns topical biases in the training corpus: for example, the prevalence of words on a specific topic in a specific genre can trick the genre classifier into assigning any text on this topic to this genre. In order to mitigate this robustness problem, this paper investigates techniques for attacking genre classifiers, both to understand the limitations of the transformer models and to improve their performance. While simple text attacks, such as those based on word replacement using keywords extracted by tf-idf, are not capable of deceiving powerful models like XLM-RoBERTa, we show that embedding-based algorithms such as TextFooler, which replace some of the most “significant” words with similar words, can change model predictions in a significant proportion of cases.
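To make the two families of attacks concrete, here is a minimal sketch of the tf-idf keyword-replacement baseline. The function name, the reference-corpus argument, and the use of WordNet synonyms as the replacement source are illustrative assumptions, not necessarily the paper's exact setup (requires nltk.download("wordnet")):

```python
# Hypothetical sketch of a tf-idf keyword-replacement attack: rank the words
# of a text by tf-idf weight against a reference corpus and swap the
# top-ranked keywords for WordNet synonyms.
from sklearn.feature_extraction.text import TfidfVectorizer
from nltk.corpus import wordnet

def tfidf_keyword_attack(text, corpus, n_keywords=5):
    vectorizer = TfidfVectorizer().fit(corpus + [text])
    weights = vectorizer.transform([text]).toarray()[0]
    vocab = vectorizer.vocabulary_
    # words of the attacked text, highest tf-idf weight first
    ranked = sorted(
        {w for w in text.lower().split() if w in vocab},
        key=lambda w: weights[vocab[w]],
        reverse=True,
    )
    substitutions = {}
    for word in ranked[:n_keywords]:
        synonyms = {
            lemma.name().replace("_", " ")
            for synset in wordnet.synsets(word)
            for lemma in synset.lemmas()
            if lemma.name().lower() != word
        }
        if synonyms:
            # pick a synonym deterministically for the sketch
            substitutions[word] = sorted(synonyms)[0]
    return " ".join(substitutions.get(tok.lower(), tok) for tok in text.split())
```

The stronger embedding-based attack can be reproduced with the open-source TextAttack implementation of TextFooler. In the sketch below, "my-genre-xlmr" is a placeholder for any fine-tuned sequence-classification checkpoint, and the label 0 stands for the ground-truth genre of the example:

```python
# Running TextFooler against a fine-tuned transformer genre classifier
# via the TextAttack library; "my-genre-xlmr" is a placeholder checkpoint.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from textattack.models.wrappers import HuggingFaceModelWrapper
from textattack.attack_recipes import TextFoolerJin2019

model = AutoModelForSequenceClassification.from_pretrained("my-genre-xlmr")
tokenizer = AutoTokenizer.from_pretrained("my-genre-xlmr")
wrapper = HuggingFaceModelWrapper(model, tokenizer)

# TextFooler ranks words by their importance to the prediction and greedily
# replaces them with nearest neighbours in counter-fitted word embeddings.
attack = TextFoolerJin2019.build(wrapper)
result = attack.attack("An example text whose genre label we try to flip.", 0)
print(result)
```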
