Exploring the Role of BERT Token Representations to Explain Sentence Probing Results

04/03/2021
by Hosein Mohebbi, et al.

Several studies have been carried out to reveal the linguistic features captured by BERT. This is usually achieved by training a diagnostic classifier on the representations obtained from different layers of BERT. The classification accuracy is then interpreted as the model's ability to encode the corresponding linguistic property. Despite providing insights, these studies have left out the potential role of individual token representations. In this paper, we provide an analysis of the representation space of BERT in search of distinct and meaningful subspaces that can explain probing results. Based on a set of probing tasks, and with the help of attribution methods, we show that BERT tends to encode meaningful knowledge in specific token representations (which are often ignored in standard classification setups), allowing the model to detect syntactic and semantic abnormalities and to distinctively separate grammatical number and tense subspaces.
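As a concrete illustration of the probing setup described in the abstract, the following is a minimal sketch of a layer-wise diagnostic classifier over BERT representations, assuming the HuggingFace transformers and scikit-learn libraries. The sentences, labels, layer choice, and the `token_index` parameter are hypothetical placeholders for illustration, not the paper's actual experimental setup.

```python
import torch
from sklearn.linear_model import LogisticRegression
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def layer_features(sentences, layer, token_index=0):
    """Representation of one token position at a given layer.

    token_index=0 selects [CLS]; probing other positions is one way to
    inspect the token-level knowledge that standard classification
    setups (which pool or use only [CLS]) tend to ignore.
    """
    feats = []
    for sent in sentences:
        inputs = tokenizer(sent, return_tensors="pt")
        with torch.no_grad():
            # hidden_states: embedding layer + 12 encoder layers,
            # each of shape [1, seq_len, 768] for bert-base
            hidden_states = model(**inputs).hidden_states
        feats.append(hidden_states[layer][0, token_index].numpy())
    return feats

# Hypothetical probing data: grammatical number of the subject.
sentences = ["The dogs bark loudly.", "The dog barks loudly.",
             "Those cars are fast.", "That car is fast."]
labels = [1, 0, 1, 0]  # 1 = plural, 0 = singular

probe = LogisticRegression(max_iter=1000).fit(
    layer_features(sentences, layer=8), labels)
# The probe's held-out accuracy is then read as evidence that this
# layer (and token position) encodes the probed linguistic property.
```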
