What Happens To BERT Embeddings During Fine-tuning?

04/29/2020
by Amil Merchant, et al.

While there has been much recent work studying how linguistic information is encoded in pre-trained sentence representations, comparatively little is understood about how these models change when adapted to solve downstream tasks. Using a suite of analysis techniques (probing classifiers, Representational Similarity Analysis, and model ablations), we investigate how fine-tuning affects the representations of the BERT model. We find that while fine-tuning necessarily makes significant changes, it does not lead to catastrophic forgetting of linguistic phenomena. We instead find that fine-tuning primarily affects the top layers of BERT, but with noteworthy variation across tasks. In particular, dependency parsing reconfigures most of the model, whereas SQuAD and MNLI appear to involve much shallower processing. Finally, we also find that fine-tuning has a weaker effect on representations of out-of-domain sentences, suggesting room for improvement in model generalization.
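To make the analysis concrete, below is a minimal sketch of the Representational Similarity Analysis (RSA) comparison the abstract mentions, applied layer by layer to a pre-trained and a fine-tuned BERT. The HuggingFace model loading, mean-pooled sentence vectors, the tiny probe sentence set, and the "./finetuned-bert" checkpoint path are all illustrative assumptions, not the paper's exact setup.

```python
import numpy as np
import torch
from scipy.stats import pearsonr
from transformers import AutoModel, AutoTokenizer

def layer_reps(model_name, sentences):
    """Mean-pooled sentence vectors from every hidden layer.

    Returns a list of (num_sentences, hidden_dim) arrays, one per layer
    (index 0 is the embedding layer, 1..12 the Transformer layers)."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
    model.eval()
    per_layer = None
    with torch.no_grad():
        for s in sentences:
            out = model(**tok(s, return_tensors="pt"))
            pooled = [h.mean(dim=1).squeeze(0).numpy() for h in out.hidden_states]
            if per_layer is None:
                per_layer = [[] for _ in pooled]
            for bucket, vec in zip(per_layer, pooled):
                bucket.append(vec)
    return [np.stack(bucket) for bucket in per_layer]

def rsa(reps_a, reps_b):
    """Pearson correlation between the two models' pairwise cosine
    similarity structures: values near 1.0 mean the layer organizes
    these sentences the same way before and after fine-tuning."""
    def sim(reps):
        normed = reps / np.linalg.norm(reps, axis=1, keepdims=True)
        return normed @ normed.T
    iu = np.triu_indices(len(reps_a), k=1)  # upper triangle, no diagonal
    return pearsonr(sim(reps_a)[iu], sim(reps_b)[iu])[0]

sentences = [  # toy probe set; the paper uses task-relevant corpora
    "The cat sat on the mat.",
    "A dog chased the red ball.",
    "She reads the newspaper every morning.",
    "The committee approved the new budget.",
]
base = layer_reps("bert-base-uncased", sentences)
tuned = layer_reps("./finetuned-bert", sentences)  # hypothetical fine-tuned checkpoint
for i, (a, b) in enumerate(zip(base, tuned)):
    print(f"layer {i:2d}: RSA = {rsa(a, b):.3f}")
```

Under this kind of comparison, a drop in similarity concentrated in the top layers (with lower layers largely unchanged) is the signature of shallow, task-specific adaptation that the abstract describes.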


