Scaling Up Online Speech Recognition Using ConvNets

01/27/2020
by Vineel Pratap, et al.

We design an online end-to-end speech recognition system based on Time-Depth Separable (TDS) convolutions and Connectionist Temporal Classification (CTC). We improve the core TDS architecture to limit the future context it requires, which reduces latency while maintaining accuracy. The system has almost three times the throughput of a well-tuned hybrid ASR baseline, with lower latency and a better word error rate. Our highly optimized beam search decoder is also important to the recognizer's efficiency. To show the impact of our design choices, we analyze throughput, latency, and accuracy, and discuss how these metrics can be tuned to user requirements.
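To make the idea of limiting future context concrete, here is a minimal sketch in plain Python. It is not the authors' TDS implementation: it only illustrates asymmetric padding, one common way to bound how many future frames a 1D convolution sees. The function names `conv1d_limited_future` and `total_lookahead`, and the convention that a layer's stride applies at its output, are assumptions made for this illustration.

```python
def conv1d_limited_future(x, kernel, future):
    """1D convolution (cross-correlation) with asymmetric zero padding.

    Output frame t depends on input frames t - past .. t + future, where
    past = len(kernel) - 1 - future. The output has the same length as x.
    future = 0 gives a fully causal layer.
    """
    k = len(kernel)
    if not 0 <= future < k:
        raise ValueError("future must be in [0, len(kernel) - 1]")
    past = k - 1 - future
    # Pad more on the left (past) and less on the right (future).
    padded = [0.0] * past + list(x) + [0.0] * future
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(x))
    ]


def total_lookahead(futures, strides):
    """Future context of a stack of layers, measured in input frames.

    futures[l] is the per-layer future context and strides[l] the stride of
    layer l (assumed here to be applied at that layer's output). A layer's
    future frames are scaled by the product of the strides before it.
    """
    lookahead, scale = 0, 1
    for future, stride in zip(futures, strides):
        lookahead += future * scale
        scale *= stride
    return lookahead


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0]
    # Causal: the impulse at t=3 only affects outputs at t >= 3.
    print(conv1d_limited_future(impulse, [1, 1, 1], future=0))
    # One frame of lookahead: the impulse also affects the output at t=2.
    print(conv1d_limited_future(impulse, [1, 1, 1], future=1))
    # Two layers, one future frame each, first layer subsamples by 2.
    print(total_lookahead([1, 1], [2, 1]))
```

In a streaming recognizer, the stack's total lookahead multiplied by the frame shift gives the minimum delay the acoustic model adds before it can emit an output, which is why shrinking per-layer future context lowers latency.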

research
06/02/2023

Streaming Speech-to-Confusion Network Speech Recognition

In interactive automatic speech recognition (ASR) systems, low-latency r...
research
03/31/2021

Compressing 1D Time-Channel Separable Convolutions using Sparse Random Ternary Matrices

We demonstrate that 1x1-convolutions in 1D time-channel separable convol...
research
10/29/2018

An improved hybrid CTC-Attention model for speech recognition

Recently, end-to-end speech recognition with a hybrid model consisting o...
research
10/15/2020

Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions

Many end-to-end Automatic Speech Recognition (ASR) systems still rely on...
research
04/06/2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition

As speech-enabled devices such as smartphones and smart speakers become ...
research
04/08/2022

Adding Connectionist Temporal Summarization into Conformer to Improve Its Decoder Efficiency For Speech Recognition

The Conformer model is an excellent architecture for speech recognition ...
research
09/15/2023

Augmenting conformers with structured state space models for online speech recognition

Online speech recognition, where the model only accesses context to the ...
