The Low-Dimensional Linear Geometry of Contextualized Word Representations

05/15/2021
by Evan Hernandez, et al.

Black-box probing models can reliably extract linguistic features like tense, number, and syntactic role from pretrained word representations. However, the manner in which these features are encoded in representations remains poorly understood. We present a systematic study of the linear geometry of contextualized word representations in ELMo and BERT. We show that a variety of linguistic features (including structured dependency relationships) are encoded in low-dimensional subspaces. We then refine this geometric picture, showing that there are hierarchical relations between the subspaces encoding general linguistic categories and more specific ones, and that low-dimensional feature encodings are distributed rather than aligned to individual neurons. Finally, we demonstrate that these linear subspaces are causally related to model behavior, and can be used to perform fine-grained manipulation of BERT's output distribution.

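The sketch below (not the authors' code) illustrates the two ideas the abstract names: a linear probe that reads a linguistic feature out of a low-dimensional subspace of BERT's hidden states, and a projection that removes that subspace to test whether it is causally linked to model behavior. The model name, layer index, rank, and number of classes are illustrative placeholders, not values from the paper.

# Minimal sketch, assuming bert-base-uncased and an arbitrary middle layer.
import torch
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased").eval()

def word_representations(sentence: str, layer: int = 8) -> torch.Tensor:
    """Return the hidden states of one BERT layer, one row per token."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = bert(**inputs, output_hidden_states=True)
    return outputs.hidden_states[layer].squeeze(0)  # shape: (num_tokens, 768)

class LowRankProbe(torch.nn.Module):
    """Linear probe factored through a rank-d subspace: logits = (h @ P) @ W."""
    def __init__(self, hidden_size: int = 768, rank: int = 8, num_classes: int = 17):
        super().__init__()
        self.proj = torch.nn.Linear(hidden_size, rank, bias=False)  # subspace projection
        self.cls = torch.nn.Linear(rank, num_classes)               # classifier on the subspace

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.cls(self.proj(h))

def remove_subspace(h: torch.Tensor, probe: LowRankProbe) -> torch.Tensor:
    """Project hidden states onto the orthogonal complement of the probe's subspace.
    Feeding the edited states back through later layers is one way to check whether
    the subspace causally affects the model's output distribution."""
    Q, _ = torch.linalg.qr(probe.proj.weight.T)  # orthonormal basis, shape (768, rank)
    return h - h @ Q @ Q.T

Training the probe on labeled tokens follows the usual cross-entropy recipe; sweeping the rank hyperparameter shows how few dimensions suffice to recover a given feature, which is the sense in which the encodings are "low-dimensional."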