An Information Theoretic View on Selecting Linguistic Probes

09/15/2020

by Zining Zhu, et al.

There is increasing interest in assessing the linguistic knowledge encoded in neural representations. A popular approach is to attach a diagnostic classifier – or "probe" – to perform supervised classification from internal representations. However, how to select a good probe is under debate. Hewitt and Liang (2019) showed that high performance on diagnostic classification is itself insufficient, because it can be attributed either to "the representation being rich in knowledge" or to "the probe learning the task" – a dichotomy that Pimentel et al. (2020) challenged. We show this dichotomy is valid information-theoretically. In addition, we find that the methods proposed by the two papers to construct and select good probes, the *control task* (Hewitt and Liang, 2019) and the *control function* (Pimentel et al., 2020), are equivalent – the errors of their approaches are identical (modulo irrelevant terms). Empirically, the two selection criteria lead to results that highly agree with each other.
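The control-task idea of Hewitt and Liang (2019) can be sketched on synthetic data. The following is a minimal, illustrative example – the cluster data, the per-example random control labels, and the hand-rolled softmax probe are all assumptions for the sketch, not the paper's actual setup (Hewitt and Liang assign control labels consistently per word type, so a high-capacity probe could memorize them): a linear probe is trained once on the real task labels and once on control labels, and *selectivity* is the accuracy gap.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "representations": 3 classes, 64 dims. Each vector is its class
# centroid plus noise, so a linear probe can recover the task label.
n_classes, dim, n = 3, 64, 600
centers = rng.normal(0, 1, (n_classes, dim))
y_task = rng.integers(0, n_classes, n)
X = centers[y_task] + 0.5 * rng.normal(0, 1, (n, dim))

# Control task: labels drawn at random, so accuracy above chance would
# reflect the probe fitting noise rather than the representation
# encoding linguistic structure. (Simplified: random per example.)
y_control = rng.integers(0, n_classes, n)

def train_softmax_probe(X, y, lr=0.1, steps=300):
    """Minimal multinomial logistic-regression probe (batch GD)."""
    W = np.zeros((X.shape[1], n_classes))
    onehot = np.eye(n_classes)[y]
    for _ in range(steps):
        logits = X @ W
        p = np.exp(logits - logits.max(1, keepdims=True))
        p /= p.sum(1, keepdims=True)
        W -= lr * X.T @ (p - onehot) / len(X)
    return W

def accuracy(W, X, y):
    return float((np.argmax(X @ W, 1) == y).mean())

tr, te = slice(0, 400), slice(400, 600)  # train/test split
acc_task = accuracy(train_softmax_probe(X[tr], y_task[tr]), X[te], y_task[te])
acc_ctrl = accuracy(train_softmax_probe(X[tr], y_control[tr]), X[te], y_control[te])
selectivity = acc_task - acc_ctrl
print(f"task acc={acc_task:.2f}  control acc={acc_ctrl:.2f}  selectivity={selectivity:.2f}")
```

A probe with high task accuracy but low control accuracy (high selectivity) is preferred under this criterion; the paper's point is that ranking probes this way agrees closely with the control-function criterion of Pimentel et al. (2020).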
