A Fully Tensorized Recurrent Neural Network

10/08/2020
by Charles C. Onu, et al.

Recurrent neural networks (RNNs) are powerful tools for sequential modeling, but typically require significant overparameterization and regularization to achieve optimal performance. This leads to difficulties in the deployment of large RNNs in resource-limited settings, while also introducing complications in hyperparameter selection and training. To address these issues, we introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell using a lightweight tensor-train (TT) factorization. This approach represents a novel form of weight sharing which reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs. Experiments on image classification and speaker verification tasks demonstrate further benefits for reducing inference times and stabilizing model training and hyperparameter selection.
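To make the core idea concrete, here is a minimal NumPy sketch of a tensor-train (TT) parameterization of an RNN's recurrent weight matrix. The mode sizes, TT-rank, and initialization scales are illustrative assumptions, not the paper's settings, and the paper's "fully tensorized" cell goes further by encoding all of a cell's weight matrices jointly in one TT factorization; this sketch tensorizes a single matrix and reconstructs it explicitly for clarity (in practice the matrix-vector product is contracted core by core without materializing the full matrix).

```python
import numpy as np

rng = np.random.default_rng(0)

# TT-matrix cores for a 64x64 recurrent weight matrix, factored as
# (4*4*4) x (4*4*4) with TT-rank 2. Core k has shape (r_k, m_k, n_k, r_{k+1}).
row_modes, col_modes = [4, 4, 4], [4, 4, 4]
ranks = [1, 2, 2, 1]
cores = [
    rng.standard_normal((ranks[k], row_modes[k], col_modes[k], ranks[k + 1])) * 0.3
    for k in range(3)
]

def tt_to_matrix(cores, row_modes, col_modes):
    """Contract TT cores back into the full weight matrix."""
    out = cores[0]                                   # (1, m0, n0, r1)
    for core in cores[1:]:
        # merge the trailing rank of `out` with the leading rank of `core`
        out = np.tensordot(out, core, axes=([-1], [0]))
    out = out.squeeze(axis=(0, -1))                  # (m0, n0, m1, n1, m2, n2)
    d = len(row_modes)
    perm = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
    out = out.transpose(perm)                        # (m0, m1, m2, n0, n1, n2)
    return out.reshape(int(np.prod(row_modes)), int(np.prod(col_modes)))

W_hh = tt_to_matrix(cores, row_modes, col_modes)     # 64 x 64 recurrent weights

# One vanilla RNN step using the TT-parameterized recurrent matrix
# (the input-to-hidden matrix is kept dense here just for the demo).
W_xh = rng.standard_normal((64, 64)) * 0.05
h = np.zeros(64)
x = rng.standard_normal(64)
h = np.tanh(W_xh @ x + W_hh @ h)

# Compression: three small cores versus a dense 64x64 matrix.
tt_params = sum(c.size for c in cores)
print(tt_params, 64 * 64)   # 128 parameters versus 4096
```

The parameter count of the TT form scales with the sum of core sizes rather than the product of the matrix dimensions, which is the source of the orders-of-magnitude size reduction the abstract describes.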


Related research:

07/06/2018: Sliced Recurrent Neural Networks
Recurrent neural networks have achieved great success in many NLP tasks....

04/13/2016: Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex
We discuss relations between Residual Networks (ResNet), Recurrent Neura...

07/06/2017: Tensor-Train Recurrent Neural Networks for Video Classification
The Recurrent Neural Networks and their variants have shown promising pe...

11/09/2018: Complex Unitary Recurrent Neural Networks using Scaled Cayley Transform
Recurrent neural networks (RNNs) have been successfully used on a wide r...

06/07/2019: Relaxed Weight Sharing: Effectively Modeling Time-Varying Relationships in Clinical Time-Series
Recurrent neural networks (RNNs) are commonly applied to clinical time-s...

04/18/2017: Diagonal RNNs in Symbolic Music Modeling
In this paper, we propose a new Recurrent Neural Network (RNN) architect...

03/02/2021: On the Memory Mechanism of Tensor-Power Recurrent Models
Tensor-power (TP) recurrent model is a family of non-linear dynamical sy...
