Quantifying Long Range Dependence in Language and User Behavior to improve RNNs

05/23/2019
by Francois Belletti, et al.

Characterizing temporal dependence patterns is a critical step in understanding the statistical properties of sequential data. Long Range Dependence (LRD), in which correlations decay as a power law rather than exponentially with distance, demands a different set of tools for modeling the underlying dynamics of the sequential data. While it has been widely conjectured that LRD is present in language modeling and sequential recommendation, the amount of LRD in the corresponding sequential datasets has not yet been quantified in a scalable and model-independent manner. We propose a principled procedure for estimating LRD in sequential datasets, based on established LRD theory for real-valued time series, and apply it to sequences of symbols with million-item-scale dictionaries. In our measurements, the procedure reliably estimates the LRD in the behavior of users as they write Wikipedia articles and as they interact with YouTube. We further show that measuring LRD better informs modeling decisions, in particular for RNNs, whose ability to capture LRD is still an active area of research. The quantitative measure informs new Evolutive Recurrent Neural Network (EvolutiveRNN) designs, leading to state-of-the-art results on language understanding and sequential recommendation tasks at a fraction of the computational cost.
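To make the power-law versus exponential distinction concrete, here is a minimal sketch in Python with NumPy. It is a generic diagnostic for a real-valued series, not the estimation procedure proposed in the paper, which the abstract does not detail. The function names `autocorrelation` and `decay_fits`, and the choice of least-squares fits in log space, are assumptions made for illustration.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Sample autocorrelation of a real-valued series at lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)
    return np.array(
        [np.dot(x[:-k], x[k:]) / var for k in range(1, max_lag + 1)]
    )


def decay_fits(acf):
    """Fit power-law and exponential decay to |acf| by least squares.

    Hypothetical helper: both fits are straight lines in log space.
      power law:   log|acf(k)| = a - alpha * log(k)
      exponential: log|acf(k)| = b - rate * k
    """
    acf = np.asarray(acf, dtype=float)
    lags = np.arange(1, len(acf) + 1, dtype=float)
    # Small offset avoids log(0) when a correlation is exactly zero.
    log_acf = np.log(np.abs(acf) + 1e-12)

    power_coef = np.polyfit(np.log(lags), log_acf, 1)
    exp_coef = np.polyfit(lags, log_acf, 1)

    power_res = np.sum((np.polyval(power_coef, np.log(lags)) - log_acf) ** 2)
    exp_res = np.sum((np.polyval(exp_coef, lags) - log_acf) ** 2)

    return {
        "power_exponent": -power_coef[0],
        "power_residual": power_res,
        "exp_rate": -exp_coef[0],
        "exp_residual": exp_res,
    }
```

A call such as `decay_fits(autocorrelation(series, max_lag=100))` returns both fits. A clearly smaller `power_residual` than `exp_residual` is suggestive of long-range dependence, though noisy sample autocorrelations at large lags make this a rough indicator rather than a rigorous test.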


