Diagonal State Space Augmented Transformers for Speech Recognition

02/27/2023
by   George Saon, et al.

We improve on the popular conformer architecture by replacing the depthwise temporal convolutions with diagonal state space (DSS) models. DSS is a recently introduced variant of linear RNNs obtained by discretizing a linear dynamical system with a diagonal state transition matrix. DSS layers project the input sequence onto a space of orthogonal polynomials, where the choice of basis functions, metric, and support is controlled by the eigenvalues of the transition matrix. We compare neural transducers with either conformer or our proposed DSS-augmented transformer (DSSformer) encoders on three public corpora: Switchboard English conversational telephone speech (300 hours), Switchboard+Fisher (2000 hours), and MALACH, a 176-hour spoken archive of Holocaust survivor testimonials. On Switchboard 300/2000 hours, we reach a single model performance of 8.9% WER on the Hub5 2000 evaluation, and on MALACH we improve the WER by 7% relative over the previous best published result. In addition, we present empirical evidence suggesting that DSS layers learn damped Fourier basis functions where the attenuation coefficients are layer specific, whereas the frequency coefficients converge to almost identical linearly-spaced values across all layers.
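To make the DSS mechanism described above concrete, the following is a minimal NumPy sketch of a diagonal state space layer: the layer materializes a 1-D convolution kernel as a weighted sum of damped complex exponentials (one per diagonal eigenvalue) and convolves the input with it. The function names, the parameterization (`Lambda`, `W`, `dt`), and the FFT-based application are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def dss_kernel(Lambda, W, dt, L):
    """Materialize the length-L convolution kernel of a diagonal SSM.

    A DSS layer discretizes x'(t) = A x(t) + B u(t), y(t) = C x(t)
    with diagonal A = diag(Lambda). The resulting kernel is
    K[t] = sum_i W[i] * exp(Lambda[i] * dt * t): damped (complex)
    exponentials whose decay/frequency come from the eigenvalues.
    Names and shapes here are illustrative, not from the paper.
    """
    t = np.arange(L)                                  # time steps 0..L-1
    # (N, L) matrix of damped complex exponentials: the learned basis
    basis = np.exp(Lambda[:, None] * dt * t[None, :])
    return (W @ basis).real                           # mix states into a real kernel

def dss_apply(u, Lambda, W, dt):
    """Apply the DSS layer to a 1-D input sequence via FFT convolution."""
    L = len(u)
    K = dss_kernel(Lambda, W, dt, L)
    # causal linear (not circular) convolution via zero-padded FFT
    n = 2 * L
    return np.fft.irfft(np.fft.rfft(u, n) * np.fft.rfft(K, n), n)[:L]
```

In this parameterization, eigenvalues with negative real parts give the attenuation (damping) and imaginary parts give the oscillation frequencies, matching the abstract's observation that the learned kernels behave like damped Fourier basis functions.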


