
Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers

by Jason Phang et al.

Despite the success of fine-tuning pretrained language encoders like BERT for downstream natural language understanding (NLU) tasks, it is still poorly understood how their internal representations change during fine-tuning. In this work, we use centered kernel alignment (CKA), a method for comparing learned representations, to measure the similarity of representations across layers in task-tuned models. In experiments across twelve NLU tasks, we discover a consistent block-diagonal structure in the similarity of representations within fine-tuned RoBERTa and ALBERT models, with strong similarity within clusters of earlier and later layers, but not between them. The strong similarity among later-layer representations suggests that those layers contribute only marginally to task performance, and we verify experimentally that the top few layers of fine-tuned Transformers can be discarded without hurting performance, even with no further tuning.
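The linear variant of CKA used for this kind of cross-layer comparison can be sketched in a few lines. The sketch below is a minimal NumPy implementation (the function name `linear_cka` is my own); it takes two matrices whose rows are representations of the same n examples from two layers and returns a similarity in [0, 1], invariant to orthogonal transformations and isotropic scaling of either representation.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between representation matrices
    X (n x d1) and Y (n x d2), whose rows correspond to the same n examples."""
    # Center each feature dimension across examples
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Cross-similarity ||Y^T X||_F^2, normalized by each matrix's self-similarity
    numerator = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    denominator = (np.linalg.norm(X.T @ X, ord="fro")
                   * np.linalg.norm(Y.T @ Y, ord="fro"))
    return numerator / denominator
```

Computing this score for every pair of layers' activations on a shared batch of inputs yields the layer-by-layer similarity matrix in which the block-diagonal structure appears.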
