Intrinsic Probing through Dimension Selection

10/06/2020
by   Lucas Torroba Hennigen, et al.
0

Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks. Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it. In this paper, we draw a distinction between intrinsic probing, which examines how linguistic information is structured within a representation, and the extrinsic probing popular in prior work, which only argues for the presence of such information by showing that it can be successfully extracted. To enable intrinsic probing, we propose a novel framework based on a decomposable multivariate Gaussian probe that allows us to determine whether the linguistic information in word embeddings is dispersed or focal. We then probe fastText and BERT for various morphosyntactic attributes across 36 languages. We find that most attributes are reliably encoded by only a few neurons, with fastText concentrating its linguistic structure more than BERT.

READ FULL TEXT
research
01/20/2022

A Latent-Variable Model for Intrinsic Probing

The success of pre-trained contextualized representations has prompted r...
research
10/14/2021

On the Pitfalls of Analyzing Individual Neurons in Language Models

While many studies have shown that linguistic information is encoded in ...
research
10/15/2021

Probing as Quantifying the Inductive Bias of Pre-trained Representations

Pre-trained contextual representations have led to dramatic performance ...
research
05/04/2020

A Tale of a Probe and a Parser

Measuring what linguistic information is encoded in neural models of lan...
research
05/15/2021

The Low-Dimensional Linear Geometry of Contextualized Word Representations

Black-box probing models can reliably extract linguistic features like t...
research
12/30/2020

Introducing Orthogonal Constraint in Structural Probes

With the recent success of pre-trained models in NLP, a significant focu...
research
04/08/2021

Probing BERT in Hyperbolic Spaces

Recently, a variety of probing tasks are proposed to discover linguistic...

Please sign up or login with your details

Forgot password? Click here to reset