Putting Words in BERT's Mouth: Navigating Contextualized Vector Spaces with Pseudowords

09/23/2021
by Taelin Karidi, et al.

We present a method for exploring regions around individual points in a contextualized vector space (in particular, BERT space), as a way to investigate how these regions correspond to word senses. By inducing a contextualized "pseudoword" as a stand-in for a static embedding in the input layer, and then performing masked prediction of a word in the sentence, we are able to investigate the geometry of BERT space in a controlled manner around individual instances. Applying our method to a set of carefully constructed sentences targeting ambiguous English words, we find substantial regularity in the contextualized space, with regions that correspond to distinct word senses. Between these regions, however, there are occasionally "sense voids": regions that do not correspond to any intelligible sense.
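To make the idea of "inducing" a pseudoword concrete, here is a toy sketch under loudly stated assumptions. The abstract does not say how the pseudoword is induced; this sketch assumes plain gradient descent on an input vector while the encoder stays frozen. The one-layer `encode` function, the matrix `W`, the offset `c`, and every name below are hypothetical stand-ins for illustration, not BERT and not the authors' implementation.

```python
import math

def encode(x, W, c):
    """Toy frozen 'encoder': a single tanh layer standing in for BERT (assumption).
    The offset c loosely plays the role of the fixed sentence context."""
    return [math.tanh(sum(W[i][j] * x[j] for j in range(len(x))) + c[i])
            for i in range(len(W))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def induce_pseudoword(target, W, c, steps=2000, lr=0.1):
    """Gradient descent on the INPUT vector only; W and c are never updated."""
    dim = len(W[0])
    x = [0.0] * dim  # start from a neutral input embedding
    for _ in range(steps):
        h = encode(x, W, c)
        # Gradient of sum_i (h_i - t_i)^2 with h = tanh(W x + c)
        delta = [2.0 * (h[i] - target[i]) * (1.0 - h[i] ** 2)
                 for i in range(len(h))]
        grad = [sum(W[i][j] * delta[i] for i in range(len(W)))
                for j in range(dim)]
        x = [x[j] - lr * grad[j] for j in range(dim)]
    return x

# Hypothetical frozen parameters (illustrative values only).
W = [[0.9, 0.1, 0.0, -0.2],
     [0.1, 0.8, 0.2, 0.0],
     [0.0, -0.1, 1.0, 0.1],
     [0.2, 0.0, -0.1, 0.7]]
c = [0.1, -0.1, 0.05, 0.0]

# Target: the contextualized vector of one (made-up) word instance.
target = encode([0.3, -0.2, 0.4, 0.1], W, c)

pseudo = induce_pseudoword(target, W, c)
start_loss = sq_dist(encode([0.0] * 4, W, c), target)
final_loss = sq_dist(encode(pseudo, W, c), target)
print(f"loss before: {start_loss:.6f}, after: {final_loss:.2e}")
```

In the setting the abstract describes, the induced vector would then be placed in the input layer in place of a static embedding, and a masked word elsewhere in the sentence would be predicted; the toy above covers only the induction step.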

research
06/27/2019

Inducing Syntactic Trees from BERT Representations

We use the English model of BERT and explore how a deletion of one word ...
research
09/23/2019

Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings

Contextualized word embeddings (CWE) such as provided by ELMo (Peters et...
research
09/18/2019

Using BERT for Word Sense Disambiguation

Word Sense Disambiguation (WSD), which aims to identify the correct sens...
research
05/27/2021

RAW-C: Relatedness of Ambiguous Words–in Context (A New Lexical Resource for English)

Most words are ambiguous–i.e., they convey distinct meanings in differen...
research
06/02/2023

Driving Context into Text-to-Text Privatization

Metric Differential Privacy enables text-to-text privatization by adding...
research
02/18/2016

Entity Embeddings with Conceptual Subspaces as a Basis for Plausible Reasoning

Conceptual spaces are geometric representations of conceptual knowledge,...
research
11/18/2020

Topology of Word Embeddings: Singularities Reflect Polysemy

The manifold hypothesis suggests that word vectors live on a submanifold...
