On Robustness of Finetuned Transformer-based NLP Models

Transformer-based pretrained models like BERT, GPT-2 and T5 have been finetuned for a large number of natural language processing (NLP) tasks and have been shown to be very effective. However, what changes across the layers of these models during finetuning, relative to their pretrained checkpoints, remains under-studied. Further, how robust are these models to perturbations in input text, and does this robustness vary with the NLP task for which the models have been finetuned? While there exists some work on studying the robustness of BERT finetuned for a few NLP tasks, there is no rigorous study comparing this robustness across encoder-only, decoder-only and encoder-decoder models. In this paper, we study the robustness of three language models (BERT, GPT-2 and T5) under eight different text perturbations on the General Language Understanding Evaluation (GLUE) benchmark. We also use two metrics (CKA and STIR) to quantify changes between pretrained and finetuned language model representations across layers. We find that GPT-2 representations are more robust than those of BERT and T5 across multiple types of input perturbation. Although the models are broadly robust, dropping nouns or verbs and changing characters are the most impactful perturbations. Overall, this study provides valuable insights into perturbation-specific weaknesses of popular Transformer-based models, which should be kept in mind when passing inputs to them.
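The abstract mentions CKA as one of the two metrics used to compare pretrained and finetuned representations layer by layer. As a minimal illustrative sketch (assuming the common linear-CKA formulation; the exact variant the paper uses is not specified here), CKA takes two representation matrices whose rows correspond to the same inputs and returns a similarity score in [0, 1]:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between representation matrices
    X (n x d1) and Y (n x d2), where row i of each matrix is the
    representation of the same i-th input (e.g. from two model layers)."""
    # Center each feature dimension across the n examples.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return num / den
```

A score of 1 indicates the two sets of representations are identical up to isotropic scaling and orthogonal transformation; in a study like this one, X and Y would be the activations of the same layer before and after finetuning, so low layer-wise CKA flags the layers that finetuning changed most.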
