The geometry of integration in text classification RNNs

10/28/2020
by Kyle Aitken, et al.

Despite the widespread application of recurrent neural networks (RNNs) across a variety of tasks, a unified understanding of how RNNs solve these tasks remains elusive. In particular, it is unclear what dynamical patterns arise in trained RNNs, and how those patterns depend on the training dataset or task. This work addresses these questions in the context of a specific natural language processing task: text classification. Using tools from dynamical systems analysis, we study recurrent networks trained on a battery of both natural and synthetic text classification tasks. We find the dynamics of these trained RNNs to be both interpretable and low-dimensional. Specifically, across architectures and datasets, RNNs accumulate evidence for each class as they process the text, using a low-dimensional attractor manifold as the underlying mechanism. Moreover, the dimensionality and geometry of the attractor manifold are determined by the structure of the training dataset; in particular, we describe how simple word-count statistics computed on the training dataset can be used to predict these properties. Our observations span multiple architectures and datasets, reflecting a common mechanism RNNs employ to perform text classification. To the degree that integration of evidence towards a decision is a common computational primitive, this work lays the foundation for using dynamical systems techniques to study the inner workings of RNNs.
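To make the claimed mechanism concrete, below is a minimal sketch (not the authors' code; the model sizes, random token data, and variable names are illustrative assumptions) of the kind of analysis the abstract describes: run a GRU over tokenized documents, collect its hidden-state trajectories, and use PCA to estimate how many dimensions the dynamics actually occupy.

import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, embed_dim, hidden_dim = 1000, 32, 128
embed = nn.Embedding(vocab_size, embed_dim)
rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

# Fake batch of token sequences standing in for tokenized documents.
tokens = torch.randint(0, vocab_size, (64, 50))   # (batch, seq_len)

with torch.no_grad():
    states, _ = rnn(embed(tokens))                # (batch, seq_len, hidden)

# Pool all visited hidden states and measure how much variance the
# leading principal components capture.
H = states.reshape(-1, hidden_dim).numpy()
H -= H.mean(axis=0, keepdims=True)
_, svals, _ = np.linalg.svd(H, full_matrices=False)
var_explained = np.cumsum(svals**2) / np.sum(svals**2)
print("PCs needed for 95% variance:", int(np.searchsorted(var_explained, 0.95)) + 1)

In the paper's account, trained text classifiers concentrate this variance in a handful of components (the attractor manifold), and the geometry of that subspace can be predicted from simple per-class word-count statistics of the training corpus.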


Related research

06/25/2019 · Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics
Recurrent neural networks (RNNs) are a widely used tool for modeling seq...

09/28/2018 · Learning Robust, Transferable Sentence Representations for Text Classification
Although deep recurrent neural networks (RNNs) demonstrate strong perform...

11/23/2018 · Explicit Interaction Model towards Text Classification
Text classification is one of the fundamental tasks in natural language ...

08/16/2017 · Deconvolutional Paragraph Representation Learning
Learning latent representations from long text sequences is an important...

03/02/2021 · On the Memory Mechanism of Tensor-Power Recurrent Models
The tensor-power (TP) recurrent model is a family of non-linear dynamical sy...

06/25/2020 · On Lyapunov Exponents for RNNs: Understanding Information Propagation Using Dynamical Systems Tools
Recurrent neural networks (RNNs) have been successfully applied to a var...

07/06/2022 · Tractable Dendritic RNNs for Reconstructing Nonlinear Dynamical Systems
In many scientific disciplines, we are interested in inferring the nonli...
