DeepAI

Efficient End-to-End Speech Recognition Using Performers in Conformers

11/09/2020
by Peidong Wang et al.

On-device end-to-end speech recognition places high demands on model efficiency. Most prior work improves efficiency by reducing model size. We propose to reduce the complexity of the model architecture in addition to the model size. Specifically, we reduce the number of floating-point operations in the conformer by replacing its transformer module with a performer. The proposed attention-based efficient end-to-end speech recognition model achieves competitive performance on the LibriSpeech corpus with 10 million parameters and linear computation complexity. It also outperforms previous lightweight end-to-end models by about 20% in word error rate.
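The linear complexity comes from the performer's FAVOR+ mechanism, which replaces softmax attention with positive random features so that keys and values can be aggregated once instead of comparing every query against every key. A minimal NumPy sketch of this idea (function names and the feature count `m` are illustrative, not from the paper):

```python
import numpy as np

def favor_plus_features(x, proj):
    # Positive random features approximating the softmax kernel (FAVOR+).
    # x: (seq, d) queries or keys; proj: (d, m) Gaussian projection matrix.
    d = x.shape[-1]
    x = x / d ** 0.25                          # split the usual 1/sqrt(d) scaling
    u = x @ proj                               # (seq, m) random projections
    norm = np.sum(x ** 2, axis=-1, keepdims=True) / 2.0
    return np.exp(u - norm) / np.sqrt(proj.shape[1])

def performer_attention(q, k, v, m=64, seed=0):
    # Linear-time attention: O(seq * m * d) instead of O(seq^2 * d).
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((q.shape[-1], m))
    qf = favor_plus_features(q, proj)          # (seq, m)
    kf = favor_plus_features(k, proj)          # (seq, m)
    kv = kf.T @ v                              # (m, d_v): aggregate keys/values once
    z = qf @ kf.sum(axis=0)                    # (seq,) normalizer per query
    return (qf @ kv) / z[:, None]
```

Because `E[phi(q) . phi(k)] = exp(q . k)` for these features, the output approximates standard softmax attention while never materializing the `seq x seq` attention matrix, which is what makes the approach attractive for on-device streaming speech input.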

10/17/2016

End-to-end attention-based distant speech recognition with Highway LSTM

End-to-end attention-based models have been shown to be competitive alte...
10/30/2019

Lightweight and Efficient End-to-End Speech Recognition Using Low-Rank Transformer

High performing deep neural networks come at the cost of computational c...
09/17/2022

Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition

While transformers and their variant conformers show promising performan...
10/23/2020

Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention

Recently, several studies reported that dot-product selfattention (SA) m...
01/27/2020

Scaling Up Online Speech Recognition Using ConvNets

We design an online end-to-end speech recognition system based on Time-D...
05/21/2020

End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming

Despite successful applications of end-to-end approaches in multi-channe...
12/19/2017

Improving End-to-End Speech Recognition with Policy Learning

Connectionist temporal classification (CTC) is widely used for maximum l...