
Surprisal-Triggered Conditional Computation with Neural Networks

by Loren Lugosch, et al.

Autoregressive neural network models have been used successfully for sequence generation, feature extraction, and hypothesis scoring. This paper presents yet another use for these models: allocating more computation to more difficult inputs. In our model, an autoregressive model is used both to extract features and to predict observations in a stream of input observations. The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty. This in turn determines whether a small, fast network or a big, slow network is used. Experiments on two speech recognition tasks show that our model can match the performance of a baseline in which the big network is always used with 15
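The routing rule described above can be sketched in a few lines. This is a minimal toy illustration, not the paper's implementation: the names `surprisal`, `route`, the threshold value, and the stand-in "networks" are all hypothetical, and the autoregressive model is reduced to a fixed categorical distribution over the next observation.

```python
import numpy as np

def surprisal(probs, observation):
    # Surprisal = negative log-likelihood of the current observation
    # under the autoregressive model's predictive distribution.
    return -np.log(probs[observation])

def route(probs, observation, threshold, small_net, big_net, features):
    # If the observation is surprising (hard), spend more compute;
    # otherwise use the cheap network.
    s = surprisal(probs, observation)
    net = big_net if s > threshold else small_net
    return net(features), s

# Toy demo: a categorical predictive distribution over 3 symbols
# (in the paper this would come from the autoregressive model).
probs = np.array([0.7, 0.2, 0.1])
small = lambda f: ("small", f.sum())   # stand-in for the fast network
big = lambda f: ("big", f.sum())       # stand-in for the slow network
feats = np.ones(4)

out_easy, s_easy = route(probs, 0, 1.0, small, big, feats)  # likely symbol
out_hard, s_hard = route(probs, 2, 1.0, small, big, feats)  # unlikely symbol
print(out_easy[0], out_hard[0])  # → small big
```

A likely observation (probability 0.7, surprisal ≈ 0.36 nats) falls below the threshold and is routed to the small network; an unlikely one (probability 0.1, surprisal ≈ 2.3 nats) exceeds it and is routed to the big network.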



