Echo State Speech Recognition

02/18/2021
by   Harsh Shrivastava, et al.
11

We propose automatic speech recognition (ASR) models inspired by echo state network (ESN), in which a subset of recurrent neural networks (RNN) layers in the models are randomly initialized and untrained. Our study focuses on RNN-T and Conformer models, and we show that model quality does not drop even when the decoder is fully randomized. Furthermore, such models can be trained more efficiently as the decoders do not require to be updated. By contrast, randomizing encoders hurts model quality, indicating that optimizing encoders and learn proper representations for acoustic inputs are more vital for speech recognition. Overall, we challenge the common practice of training ASR models for all components, and demonstrate that ESN-based models can perform equally well but enable more efficient training and storage than fully-trainable counterparts.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/14/2016

Recurrent Deep Stacking Networks for Speech Recognition

This paper presented our work on applying Recurrent Deep Stacking Networ...
research
06/16/2021

Collaborative Training of Acoustic Encoders for Speech Recognition

On-device speech recognition requires training models of different sizes...
research
12/31/2019

EEG based Continuous Speech Recognition using Transformers

In this paper we investigate continuous speech recognition using electro...
research
12/11/2020

Improved Robustness to Disfluencies in RNN-Transducer Based Speech Recognition

Automatic Speech Recognition (ASR) based on Recurrent Neural Network Tra...
research
09/15/2023

Augmenting conformers with structured state space models for online speech recognition

Online speech recognition, where the model only accesses context to the ...
research
08/13/2020

LSTM Acoustic Models Learn to Align and Pronounce with Graphemes

Automated speech recognition coverage of the world's languages continues...
research
02/19/2020

Gradient-Adjusted Neuron Activation Profiles for Comprehensive Introspection of Convolutional Speech Recognition Models

Deep Learning based Automatic Speech Recognition (ASR) models are very s...

Please sign up or login with your details

Forgot password? Click here to reset