Latent Topology Induction for Understanding Contextualized Representations

06/03/2022
by Yao Fu, et al.

In this work, we study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models. We show that there exists a network of latent states that summarizes the linguistic properties of contextualized representations. Instead of seeking alignments to existing well-defined annotations, we infer this latent network in a fully unsupervised way using a structured variational autoencoder. The induced states not only serve as anchors that mark the topology (neighbors and connectivity) of the representation manifold but also reveal the internal mechanism of encoding sentences. With the induced network, we: (1) decompose the representation space into a spectrum of latent states which encode fine-grained word meanings with lexical, morphological, syntactic and semantic information; and (2) show that state-state transitions encode rich phrase constructions and serve as the backbone of the latent space. Putting the two together, we show that sentences are represented as traversals over the latent network, where state-state transition chains encode syntactic templates and state-word emissions fill in the content. We demonstrate these insights with extensive experiments and visualizations.
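
The abstract describes an HMM-style latent network over contextualized embeddings: discrete states linked by state-state transitions, with state-word emissions filling in the content. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' model: it maps frozen token embeddings to discrete latent states with a Gumbel-softmax relaxation, learns a state-state transition matrix, and reconstructs embeddings from per-state emission means. All names and hyperparameters (`LatentStateInducer`, `num_states`, `tau`) are assumptions; the paper's structured variational autoencoder presumably uses proper structured inference rather than this simplified per-token relaxation.

```python
# Minimal illustrative sketch (not the authors' code): an HMM-like structured VAE
# that assigns each contextualized token embedding to a discrete latent state,
# learns a state-state transition matrix, and reconstructs embeddings from
# per-state emission means. Hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentStateInducer(nn.Module):
    def __init__(self, embed_dim=768, num_states=100, tau=1.0):
        super().__init__()
        self.tau = tau
        # q(z_t | h_t): posterior logits over latent states for each token embedding
        self.state_logits = nn.Linear(embed_dim, num_states)
        # p(z_t | z_{t-1}): unnormalized transition scores between states
        self.transition = nn.Parameter(torch.zeros(num_states, num_states))
        # p(h_t | z_t): each state emits a mean vector in embedding space
        self.emission_mean = nn.Parameter(torch.randn(num_states, embed_dim) * 0.02)

    def forward(self, h):
        # h: (batch, seq_len, embed_dim) contextualized embeddings, e.g. from a frozen encoder
        logits = self.state_logits(h)                       # (B, T, K)
        # Relaxed one-hot sample of the latent state for each token (Gumbel-softmax)
        z = F.gumbel_softmax(logits, tau=self.tau, hard=False, dim=-1)
        # Reconstruction: mixture of state emission means weighted by z
        recon = z @ self.emission_mean                      # (B, T, D)
        recon_loss = F.mse_loss(recon, h)
        # Transition term: reward consecutive states that score highly under the
        # learned transition matrix (a crude stand-in for the structured prior)
        log_trans = F.log_softmax(self.transition, dim=-1)  # (K, K)
        pair_scores = torch.einsum('btk,kl,btl->bt', z[:, :-1], log_trans, z[:, 1:])
        trans_loss = -pair_scores.mean()
        return recon_loss + trans_loss, z

# Usage with random stand-in embeddings:
model = LatentStateInducer()
h = torch.randn(2, 16, 768)          # pretend these came from a frozen encoder
loss, states = model(h)
loss.backward()
print(loss.item(), states.argmax(-1)[0][:8])
```

After training, the most probable state per token (`states.argmax(-1)`) plays the role of the induced anchor described in the abstract, and the learned transition matrix gives the connectivity of the latent network.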

