Asking without Telling: Exploring Latent Ontologies in Contextual Representations

04/29/2020
by Julian Michael, et al.

The success of pretrained contextual encoders, such as ELMo and BERT, has brought a great deal of interest in what these models learn: do they, without explicit supervision, learn to encode meaningful notions of linguistic structure? If so, how is this structure encoded? To investigate this, we introduce latent subclass learning (LSL): a modification to existing classifier-based probing methods that induces a latent categorization (or ontology) of the probe's inputs. Without access to fine-grained gold labels, LSL extracts emergent structure from input representations in an interpretable and quantifiable form. In experiments, we find strong evidence of familiar categories, such as a notion of personhood in ELMo, as well as novel ontological distinctions, such as a preference for fine-grained semantic roles on core arguments. Our results provide unique new evidence of emergent structure in pretrained encoders, including departures from existing annotations which are inaccessible to earlier methods.
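The core mechanism behind LSL, as the abstract describes it, is to let a probing classifier spread each coarse gold label across several latent subclasses, so that training on coarse labels alone induces a finer latent categorization. The following is a minimal sketch of that marginalization idea only; the function name, shapes, and the use of a single linear layer are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def lsl_probe_forward(x, W, b, n_labels, n_subclasses):
    """Forward pass of a latent-subclass probe (illustrative sketch).

    Each coarse label owns `n_subclasses` latent logits. The probability
    of a coarse label is the sum (marginal) of its subclasses' softmax
    probabilities, so supervision with coarse labels still shapes a
    latent clustering over the probe's inputs.
    """
    logits = x @ W + b                          # (n_labels * n_subclasses,)
    exp = np.exp(logits - logits.max())         # numerically stable softmax
    p_sub = exp / exp.sum()                     # distribution over all subclasses
    p_sub = p_sub.reshape(n_labels, n_subclasses)
    p_label = p_sub.sum(axis=1)                 # marginalize out the subclass
    return p_label, p_sub

# Tiny example: a 4-dim "contextual" vector, 2 coarse labels, 3 subclasses each.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W = rng.normal(size=(4, 2 * 3))
b = np.zeros(2 * 3)
p_label, p_sub = lsl_probe_forward(x, W, b, n_labels=2, n_subclasses=3)
```

At inference time, the argmax over `p_sub` (rather than `p_label`) reads off the induced latent category, which is what makes the emergent structure interpretable and quantifiable.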

