Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets

by Abhijit Mahalunkar, et al.

At present, the state-of-the-art computational models across a range of sequential data processing tasks, including language modeling, are based on recurrent neural network architectures. This paper begins with the observation that most research on developing computational models capable of processing sequential data fails to explicitly analyze the long distance dependencies (LDDs) within the datasets the models process. In this context, we make five research contributions. First, we argue that a key step in modeling sequential data is to understand the characteristics of the LDDs within the data. Second, we present a method to compute and analyze the LDD characteristics of any sequential dataset, and demonstrate this method on a number of sequential datasets that are frequently used for model benchmarking. Third, based on the analysis of the LDD characteristics within the benchmarking datasets, we observe that LDDs are far more complex than previously assumed, and depend on at least four factors: (i) the number of unique symbols in a dataset, (ii) the size of the dataset, (iii) the number of interacting symbols within an LDD, and (iv) the distance between the interacting symbols. Fourth, we verify these factors with synthetic datasets generated from Strictly k-Piecewise (SPk) languages. We then demonstrate how SPk languages can be used to generate benchmarking datasets with varying degrees of LDDs. The advantage of these synthesized datasets is that they enable the targeted testing of recurrent neural architectures. Finally, we demonstrate how understanding the characteristics of the LDDs in a dataset can inform better hyper-parameter selection for current state-of-the-art recurrent neural architectures and also aid in understanding them...
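To make the SPk idea concrete, here is a minimal Python sketch. It is not the authors' generator. It assumes the standard characterization of a Strictly k-Piecewise language as the set of strings that avoid a given set of forbidden subsequences of length at most k, where a subsequence need not be contiguous. The function names, the example alphabet, the forbidden pattern, and the rejection-sampling strategy are all hypothetical choices made for illustration.

```python
import random


def contains_subsequence(string, pattern):
    """True if `pattern` occurs in `string` as a (possibly non-contiguous) subsequence."""
    it = iter(string)
    # `symbol in it` advances the iterator, so matches must appear in order.
    return all(symbol in it for symbol in pattern)


def in_sp_language(string, forbidden):
    """True if `string` avoids every forbidden subsequence (assumed SPk definition)."""
    return not any(contains_subsequence(string, p) for p in forbidden)


def sample_sp_strings(alphabet, forbidden, length, count, seed=0, max_tries=100_000):
    """Draw `count` strings of the given length by rejection sampling.

    Hypothetical helper: uniform random strings are generated and kept only
    if they belong to the language. Raises if too few are found.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(max_tries):
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        if in_sp_language(candidate, forbidden):
            samples.append(candidate)
            if len(samples) == count:
                return samples
    raise RuntimeError("too few valid strings found; relax the constraints")


# Example (assumed): an SP2 language over {a, b, c, d} in which no 'b'
# may appear anywhere after an 'a', however far apart the two symbols are.
samples = sample_sp_strings("abcd", ["ab"], length=12, count=5)
```

In this sketch the forbidden pattern length corresponds to the number of interacting symbols, the string length bounds the distance between them, and the alphabet size sets the number of unique symbols, so each of these can be varied independently when building a test set.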




Multi-Element Long Distance Dependencies: Using SPk Languages to Explore the Characteristics of Long-Distance Dependencies

In order to successfully model Long Distance Dependencies (LDDs) it is n...

Using Regular Languages to Explore the Representational Capacity of Recurrent Neural Architectures

The presence of Long Distance Dependencies (LDDs) in sequential data pos...

Mutual Information Decay Curves and Hyper-Parameter Grid Search Design for Recurrent Neural Architectures

We present an approach to design the grid searches for hyper-parameter o...

Quantifying Long Range Dependence in Language and User Behavior to improve RNNs

Characterizing temporal dependence patterns is a critical step in unders...

Oscillatory Fourier Neural Network: A Compact and Efficient Architecture for Sequential Processing

Tremendous progress has been made in sequential processing with the rece...

DeepZip: Lossless Data Compression using Recurrent Neural Networks

Sequential data is being generated at an unprecedented pace in various f...

Fourier RNNs for Sequence Analysis and Prediction

Fourier methods have a long and proven track record as an excellent t...
