Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks

08/22/2017
by Victor Campos, et al.

Recurrent Neural Networks (RNNs) continue to show outstanding performance in sequence modeling tasks. However, training RNNs on long sequences often faces challenges such as slow inference, vanishing gradients, and difficulty in capturing long-term dependencies. In backpropagation through time settings, these issues are tightly coupled with the large, sequential computational graph resulting from unfolding the RNN in time. We introduce the Skip RNN model, which extends existing RNN models by learning to skip state updates, thereby shortening the effective size of the computational graph. The model can also be encouraged to perform fewer state updates through a budget constraint. We evaluate the proposed model on various tasks and show how it can reduce the number of required RNN updates while preserving, and sometimes even improving, the performance of baseline RNN models. Source code is publicly available at https://imatge-upc.github.io/skiprnn-2017-telecombcn/ .
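At a high level, the Skip RNN wraps a standard RNN cell with a binary update gate: when the gate is 0 the hidden state is copied forward unchanged, so the skipped step adds no new computation to the unrolled graph. Below is a minimal NumPy sketch of this skip-gated update rule, for inference only; the function and parameter names (skip_rnn_forward, w_u, b_u) are illustrative assumptions, not the authors' released code.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def skip_rnn_forward(xs, cell, w_u, b_u, s0):
        """Inference-time forward pass of a skip-gated RNN.

        xs   -- input sequence, shape (T, input_dim)
        cell -- any state update function (s, x) -> new state
        w_u  -- weights of the update-probability layer, shape (state_dim,)
        b_u  -- bias of the update-probability layer (scalar)
        s0   -- initial hidden state, shape (state_dim,)
        """
        s, u_tilde = s0, 1.0                    # force an update at t = 0
        states, gates = [], []
        for x in xs:
            u = 1.0 if u_tilde >= 0.5 else 0.0  # binarized update gate
            if u:
                s = cell(s, x)                  # update the state
            # otherwise the state is copied forward unchanged (the "skip")
            delta = float(sigmoid(w_u @ s + b_u))
            # after an update the gate probability resets to delta;
            # while skipping, it accumulates until it crosses 0.5
            u_tilde = u * delta + (1.0 - u) * (u_tilde + min(delta, 1.0 - u_tilde))
            states.append(s)
            gates.append(u)
        return np.stack(states), np.array(gates)

    # Example with a simple tanh cell; the budget constraint described in the
    # abstract would add a penalty proportional to sum(gates) to the loss.
    rng = np.random.default_rng(0)
    state_dim, input_dim, T = 8, 4, 20
    W_s = rng.normal(size=(state_dim, state_dim)) * 0.1
    W_x = rng.normal(size=(state_dim, input_dim)) * 0.1
    cell = lambda s, x: np.tanh(W_s @ s + W_x @ x)

    states, gates = skip_rnn_forward(
        rng.normal(size=(T, input_dim)), cell,
        w_u=rng.normal(size=state_dim) * 0.1, b_u=0.0,
        s0=np.zeros(state_dim),
    )
    print("updates performed:", int(gates.sum()), "of", T)

During training, the paper handles the non-differentiable binarization with a straight-through gradient estimator, which is omitted from this sketch.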

Related research:

10/05/2017 · Dilated Recurrent Neural Networks
Learning with recurrent neural networks (RNNs) on long sequences is a no...

07/29/2019 · RNNbow: Visualizing Learning via Backpropagation Gradients in Recurrent Neural Networks
We present RNNbow, an interactive tool for visualizing the gradient flow...

08/28/2023 · Kernel Limit of Recurrent Neural Networks Trained on Ergodic Data Sequences
Mathematical methods are developed to characterize the asymptotics of re...

07/30/2020 · Rethinking Recurrent Neural Networks and other Improvements for Image Classification
For a long history of Machine Learning which dates back to several decad...

06/11/2021 · Piecewise-constant Neural ODEs
Neural networks are a popular tool for modeling sequential data but they...

03/11/2023 · Resurrecting Recurrent Neural Networks for Long Sequences
Recurrent Neural Networks (RNNs) offer fast inference on long sequences ...

03/21/2018 · Comparing Fixed and Adaptive Computation Time for Recurrent Neural Networks
Adaptive Computation Time for Recurrent Neural Networks (ACT) is one of ...
