Mapping the Timescale Organization of Neural Language Models

12/12/2020
by Hsiang-Yun Sherry Chien, et al.

In the human brain, sequences of language input are processed within a distributed and hierarchical architecture, in which higher stages of processing encode contextual information over longer timescales. In contrast, in recurrent neural networks that perform natural language processing, little is known about how the multiple timescales of contextual information are functionally organized. We therefore applied tools developed in neuroscience to map the "processing timescales" of individual units within a word-level LSTM language model. This timescale-mapping method assigned long timescales to units previously found to track long-range syntactic dependencies, and it revealed a new cluster of previously unreported long-timescale units. Next, we explored the functional role of units by examining the relationship between their processing timescales and network connectivity. We identified two classes of long-timescale units: "Controller" units formed a densely interconnected subnetwork and projected strongly to the forget and input gates of the rest of the network, whereas "Integrator" units showed the longest timescales in the network and expressed projection profiles closer to the mean projection profile. Ablating integrator and controller units affected model performance at different positions within a sentence, suggesting that these two sets of units serve distinct functions. Finally, we tested whether these results generalize to a character-level LSTM model. In summary, we demonstrated a model-free technique for mapping the timescale organization of neural network models, and we applied this method to reveal the timescale and functional organization of LSTM language models.
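To make the timescale-mapping idea concrete, the following is a minimal, self-contained sketch of one way such an analysis could be run on an LSTM. It uses a toy PyTorch nn.LSTM with random weights as a stand-in for the trained word-level language model, compares hidden activations on two inputs that share a final segment but differ in their preceding context, and takes the number of steps each unit's activation divergence needs to decay as a rough per-unit timescale. The model, sequence lengths, and the 1/e decay criterion are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch of a context-swap timescale analysis for LSTM units.
# Assumptions (not from the paper): a toy nn.LSTM with random weights stands in
# for the trained word-level language model, and the decay of post-swap
# activation divergence serves as each unit's "processing timescale".

import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB, EMB, HID, T_CONTEXT, T_SHARED = 1000, 64, 128, 50, 100

embed = nn.Embedding(VOCAB, EMB)
lstm = nn.LSTM(EMB, HID, batch_first=True)   # stand-in for the trained LM

def hidden_trace(tokens):
    """Run the LSTM and return hidden activations of shape (T, HID)."""
    with torch.no_grad():
        h, _ = lstm(embed(tokens.unsqueeze(0)))
    return h.squeeze(0)

# Two inputs that share the same final T_SHARED tokens but differ in the
# preceding context (intact context vs. an alternate context).
shared = torch.randint(VOCAB, (T_SHARED,))
ctx_a = torch.randint(VOCAB, (T_CONTEXT,))
ctx_b = torch.randint(VOCAB, (T_CONTEXT,))

h_a = hidden_trace(torch.cat([ctx_a, shared]))[T_CONTEXT:]
h_b = hidden_trace(torch.cat([ctx_b, shared]))[T_CONTEXT:]

# Per-unit divergence over time after the context swap; units whose divergence
# decays slowly retain contextual information over longer timescales.
div = (h_a - h_b).abs()                      # (T_SHARED, HID)
div = div / (div[0] + 1e-8)                  # normalize to 1 at the swap point

# Crude timescale estimate: number of steps until divergence drops below 1/e.
below = div < (1.0 / torch.e)
timescale = torch.where(below.any(0),
                        below.float().argmax(0),
                        torch.full((HID,), T_SHARED)).float()

longest = timescale.argsort(descending=True)[:10]
print("Mean estimated timescale (steps):", timescale.mean().item())
print("Ten longest-timescale units:", longest.tolist())
```

A similar loop that zeroes out selected hidden dimensions at every step, then scores the language model by word position, could serve as a starting point for the kind of ablation analysis described above; the unit groupings and evaluation details would need to follow the paper.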

Related research

10/10/2018
Persistence pays off: Paying Attention to What the LSTM Gating Mechanism Persists
Language Models (LMs) are important components in several Natural Langua...

09/23/2019
Using Priming to Uncover the Organization of Syntactic Representations in Neural Language Models
Neural language models (LMs) perform well on tasks that require sensitiv...

06/19/2020
Exploring Processing of Nested Dependencies in Neural-Network Language Models and Humans
Recursive processing in sentence comprehension is considered a hallmark ...

09/27/2020
Multi-timescale representation learning in LSTM Language Models
Although neural language models are effective at capturing statistics of...

11/23/2018
Unsupervised Word Discovery with Segmental Neural Language Models
We propose a segmental neural language model that combines the represent...

03/18/2019
The emergence of number and syntax units in LSTM language models
Recent work has shown that LSTMs trained on a generic language modeling ...

04/27/2020
Word Interdependence Exposes How LSTMs Compose Representations
Recent work in NLP shows that LSTM language models capture compositional...
