Using Paraphrases to Study Properties of Contextual Embeddings

07/12/2022
by   Laura Burdick, et al.

We use paraphrases as a unique source of data to analyze contextualized embeddings, with a particular focus on BERT. Because paraphrases naturally encode consistent word and phrase semantics, they provide a controlled lens for investigating properties of embeddings. Using the Paraphrase Database's alignments, we study words within paraphrases as well as phrase representations. We find that contextual embeddings effectively handle polysemous words, but in many cases give synonyms surprisingly different representations. We confirm previous findings that BERT is sensitive to word order, but find patterns of contextualization across BERT's layers that differ slightly from prior work.
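The core measurement behind comparisons like these can be illustrated with a short script: extract a word's contextual embedding from each sentence of a paraphrase pair and compare the two vectors. What follows is a minimal sketch, assuming the HuggingFace transformers library; the sentence pair, target words, mean-pooling of subword pieces, and choice of layer are illustrative stand-ins, not the paper's PPDB-aligned data or exact method.

```python
# Sketch: compare BERT's contextual embeddings of words aligned
# across a paraphrase pair, using cosine similarity.
# The example pair and target words below are hypothetical.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def word_embedding(sentence: str, word: str, layer: int = -1) -> torch.Tensor:
    """Mean-pool the subword vectors of `word` at the given layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]  # (seq_len, dim)
    # Map tokens back to whitespace-delimited words (no punctuation
    # in these example sentences, so a plain split() lines up).
    target = sentence.lower().split().index(word.lower())
    idx = [i for i, w in enumerate(enc.word_ids()) if w == target]
    return hidden[idx].mean(dim=0)

# Aligned pair: "purchased" and "bought" fill the same slot.
e1 = word_embedding("she purchased a new car", "purchased")
e2 = word_embedding("she bought a new car", "bought")
sim = torch.cosine_similarity(e1, e2, dim=0)
print(f"cosine similarity of aligned synonyms: {sim.item():.3f}")
```

Varying the `layer` argument over all of BERT's hidden layers turns the same comparison into a per-layer contextualization profile of the kind the abstract alludes to.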


