Similarity Analysis of Contextual Word Representation Models

05/03/2020
by John M. Wu, et al.

This paper investigates contextual word representation models through the lens of similarity analysis. Given a collection of trained models, we measure the similarity of their internal representations and attention. Critically, these models come from vastly different architectures. We use existing and novel similarity measures that aim to gauge the level of localization of information in the deep models, and that facilitate the investigation of which design factors affect model similarity, without requiring any external linguistic annotation. The analysis reveals that models within the same family are more similar to one another, as may be expected. Surprisingly, different architectures have rather similar representations, but different individual neurons. We also observe differences in information localization between lower and higher layers and find that higher layers are more affected by fine-tuning on downstream tasks.
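
The contrast between similar representations and different individual neurons can be illustrated with a small sketch. The abstract does not name the similarity measures used, so the two functions below are stand-ins chosen for illustration and are not claimed to be the paper's own: linear CKA serves as a representation-level measure, and the mean of each neuron's best absolute Pearson correlation serves as a neuron-level measure. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def linear_cka(X, Y):
    """Representation-level similarity (linear CKA); rows of X and Y are the same tokens."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    return cross / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

def mean_max_neuron_corr(X, Y):
    """Neuron-level similarity: for each neuron in X, its best |Pearson r| in Y, averaged."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    corr = Xz.T @ Yz / X.shape[0]  # (neurons in X) x (neurons in Y)
    return np.abs(corr).max(axis=1).mean()

# Synthetic stand-in for two models' activations on the same 2000 tokens.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 32))
Q, _ = np.linalg.qr(rng.standard_normal((32, 32)))  # random orthogonal matrix
Y = X @ Q  # same information as X, but spread across different neurons

cka_score = linear_cka(X, Y)
neuron_score = mean_max_neuron_corr(X, Y)
print(f"linear CKA: {cka_score:.3f}, mean max neuron correlation: {neuron_score:.3f}")
```

Because linear CKA is invariant to orthogonal transformations, the rotated copy scores as essentially identical at the representation level, while no single neuron in one matrix aligns closely with a single neuron in the other. Real models would of course differ by more than a rotation; the toy data only shows how the two levels of analysis can disagree.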
