Visualizing and Understanding Recurrent Networks

06/05/2015
by Andrej Karpathy, et al.

Recurrent Neural Networks (RNNs), and specifically a variant with Long Short-Term Memory (LSTM), are enjoying renewed interest as a result of successful applications in a wide range of machine learning problems that involve sequential data. However, while LSTMs provide exceptional results in practice, the source of their performance and their limitations remain rather poorly understood. Using character-level language models as an interpretable testbed, we aim to bridge this gap by providing an analysis of their representations, predictions and error types. In particular, our experiments reveal the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets. Moreover, our comparative analysis with finite-horizon n-gram models traces the source of the LSTM improvements to long-range structural dependencies. Finally, we provide an analysis of the remaining errors and suggest areas for further study.
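To make the testbed concrete, below is a minimal sketch (PyTorch assumed; this is not the authors' code) of a character-level LSTM language model together with a probe that records one unit's activation at every character. The paper visualizes cell-state activations in this fashion to find cells that track quotes, brackets, and line lengths; for brevity this sketch probes the hidden state h_t = o_t * tanh(c_t) instead. All names here (CharLSTM, probe_cell, cell_idx) are illustrative assumptions.

```python
# Minimal sketch of a character-level LSTM language model plus an
# activation probe, in the spirit of the paper's analysis. Not the
# authors' code; names and hyperparameters are illustrative.
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)  # next-character logits

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        # Also return the per-step hidden states so they can be probed.
        return self.head(h), state, h

def probe_cell(model, text, stoi, cell_idx):
    """Record one hidden unit's activation after each character.

    The paper inspects cell states c_t; probing h_t = o_t * tanh(c_t)
    is a simpler stand-in for a sketch.
    """
    ids = torch.tensor([[stoi[c] for c in text]])
    with torch.no_grad():
        _, _, h = model(ids)  # h: (1, seq_len, hidden_dim)
    return list(zip(text, h[0, :, cell_idx].tolist()))

if __name__ == "__main__":
    text = 'He said "hello (world)" and left.\n'
    vocab = sorted(set(text))
    stoi = {c: i for i, c in enumerate(vocab)}
    model = CharLSTM(vocab_size=len(vocab))
    # In a trained model, an interpretable "quote cell" would switch
    # sign inside the quoted span; an untrained model shows only noise.
    for ch, act in probe_cell(model, text, stoi, cell_idx=0):
        print(repr(ch), f"{act:+.3f}")
```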

Related research

11/15/2018 · Multi-cell LSTM Based Neural Language Model
Language models, being at the heart of many NLP problems, are always of ...

07/09/2020 · Long Short-Term Memory Spiking Networks and Their Applications
Recent advances in event-based neuromorphic systems have resulted in sig...

05/17/2020 · How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
Long short-term memory (LSTM) networks and their variants are capable of...

03/08/2017 · Interpretable Structure-Evolving LSTM
This paper develops a general framework for learning interpretable data ...

05/25/2023 · Online learning of long-range dependencies
Online learning holds the promise of enabling efficient long-term credit...

06/10/2015 · Generative Image Modeling Using Spatial LSTMs
Modeling the distribution of natural images is challenging, partly becau...

10/17/2019 · Probabilistic Deterministic Finite Automata and Recurrent Networks, Revisited
Reservoir computers (RCs) and recurrent neural networks (RNNs) can mimic...
