, as they allow a single neural network to jointly learn the acoustic, pronunciation, and language models, greatly simplifying the ASR pipeline. In this paper, we focus on attention-based sequence-to-sequence models, as our previous study showed that these models outperform alternatives such as Connectionist Temporal Classification (CTC) and the Recurrent Neural Network Transducer (RNN-T).
Attention-based models consist of three modules. First, an encoder, represented by a multi-layer recurrent neural network (RNN), models the acoustics. Second, a decoder, which consists of multiple RNN layers, predicts the output sub-word unit sequence. Finally, an attention layer selects frames in the encoder representation that the decoder should attend to when predicting each sub-word unit.
Attention-based models such as Listen, Attend and Spell (LAS) have typically been explored in a “full-sequence” mode, in which attention is computed over the entire input sequence [2, 4]. Thus, during inference, the model can produce its first output token only after all input speech frames have been consumed. While this mode of operation is suitable for many applications, such models cannot be used for “streaming” speech recognition tasks such as voice search, where output text should be generated as soon as possible after words are spoken.
Recently, the neural transducer (NT) was proposed as a limited-sequence streaming attention-based model, which consumes a fixed number of input frames (a chunk) and outputs a variable number of labels before consuming the next chunk. While the model is attractive for streaming applications, in previous work NT showed a large degradation relative to other online sequence-to-sequence models such as RNN-T and full-sequence unidirectional attention-based models [3, 4], particularly as the chunk size was decreased.
In the present work, we study various improvements to the streaming NT model¹ – both in terms of model structure and training procedure – aimed at bringing its performance as close as possible to that of the non-streaming full-sequence unidirectional LAS model, which serves as an upper bound of sorts. Specifically, we allow attention in NT to look back over many previous chunks, as this does not introduce additional latency. Further, we find that allowing the model a look-ahead of 5 frames is extremely beneficial. Finally, we initialize NT from a pre-trained LAS model, which we find is more effective than training the model from scratch.

¹In this work, we consider a system to be streaming if it has a maximum allowable delay of 300ms, which is considered reasonable.
Our NT experiments are conducted on a 12,500-hour voice search task. We find that even with look-back and look-ahead, NT is more than 20% relative worse than LAS in terms of word error rate (WER). However, by pretraining with LAS, NT with a chunk size of 10 (450ms latency) can match the performance of LAS, while a chunk size of 5 (300ms latency) still degrades performance by 3% relative.
Our analysis of the NT model indicates that many of its errors relative to LAS are language modeling (LM) errors. Thus, we explore various ways of incorporating a stronger LM into NT, so as to allow a smaller chunk size. These include incorporating LM information from the encoder side via multi-head attention, training the NT model with wordpieces to strengthen the LM in the decoder, and explicitly incorporating an external LM via shallow fusion. We find that our best-performing NT system with a chunk size of 5 (300ms latency) degrades performance by only 1% relative to a unidirectional LAS system.
2 Original Neural Transducer Algorithm
In this section, we describe the basic NT model introduced by Jaitly et al., which is shown in Figure 1. The model maps an input sequence of frame-level features (e.g., log-mel filterbank energies), $\mathbf{x} = (x_1, \ldots, x_T)$, to an output sequence of sub-word units (e.g., graphemes or phonemes), $\mathbf{y} = (y_1, \ldots, y_N)$. Full-sequence attention models such as LAS compute the probability of each output prediction given the entire input acoustic sequence $\mathbf{x}$, making them unsuitable for streaming recognition applications. The neural transducer (NT) model is a limited-sequence attention model that addresses this issue by limiting attention to fixed-size blocks of the encoder space.
Given the input sequence $\mathbf{x}$ of length $T$ and a block size $W$, the input sequence is divided equally into $B = \lceil T/W \rceil$ blocks of length $W$, except for the last block, which may contain fewer than $W$ frames. The NT model examines each block in turn, starting with the left-most block (i.e., the earliest frames), and attention is computed only over the frames in the current block. Within a block, the NT model produces a sequence of outputs; it is found useful to limit the maximum number of outputs that can be produced within a block to $M$ symbols. Once it has produced all of the required labels within a block, the model outputs an <epsilon> symbol, which signifies the end of block processing. The model then proceeds to compute attention over the next block, and so on, until all blocks have been processed. The <epsilon> symbol is analogous to the blank symbol in connectionist temporal classification (CTC). In particular, we note that each block must output at least one symbol (<epsilon>) before the model proceeds to the next block.
The model computes $P(\mathbf{y} \mid \mathbf{x})$ over an output sequence that is $B$ symbols longer than that of the LAS model, since the model must produce an <epsilon> at every block. Within each block $b$, the model computes the probability of each output step conditioned on the label history and the acoustic evidence up to the end of the block:

$$P(y_i \mid \mathbf{x}_{1 \ldots bW},\, y_{1 \ldots i-1}) \qquad (1)$$

In other words, the prediction at the current step, $y_i$, is based on the previous predictions $y_{1 \ldots i-1}$, as in LAS, but using acoustic evidence only up to the current block, $\mathbf{x}_{1 \ldots bW}$.
Like LAS, NT consists of a listener, an attender, and a speller, which together define a probability distribution over the next sub-word unit conditioned on the acoustics and the sequence of previous predictions. The listener module of NT, implemented as a unidirectional RNN, computes an encoding over the current block only:

$$\mathbf{h}_{(b-1)W+1 \ldots bW} = \mathrm{Listen}(\mathbf{x}_{(b-1)W+1 \ldots bW}) \qquad (2)$$
The goal of the attender and speller is to take the output of the listener (i.e., $\mathbf{h}$) and produce a probability distribution over sub-word units. The attender and speller operate as in LAS, but work only on the partial output of the encoder up to the current block. We refer the reader to the original LAS work for more details about the attender and speller.
3 Improving Performance of Basic Neural Transducer Algorithm
In this section, we describe various improvements to the basic algorithm and model described in the previous section.
3.1 Training grapheme-based models using Word Alignments
Training NT requires knowing which sub-word units occur in each chunk, and thus an alignment is needed. Our previous work with NT used context-independent phonemes, for which an alignment was available. In this work, we train our models on graphemes, for which no alignment exists. However, we do have a word-level alignment, and we use it by emitting all graphemes of a word in the chunk during which that word finishes.
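The target-construction rule above can be sketched as follows. The alignment format (a list of word/end-frame pairs) and function name are illustrative assumptions:

```python
# Assign every grapheme of a word to the chunk in which that word ends,
# derived from a word-level alignment of (word, end_frame) pairs.

def chunk_targets(word_alignment, num_frames, chunk_size):
    """word_alignment: list of (word, end_frame), end_frame 0-indexed."""
    num_chunks = (num_frames + chunk_size - 1) // chunk_size  # ceil division
    targets = [[] for _ in range(num_chunks)]
    for word, end_frame in word_alignment:
        chunk = end_frame // chunk_size
        targets[chunk].extend(word)  # all graphemes land in the word-final chunk
    return targets
```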
3.2 Extending Attention Range
In the original NT paper, attention was computed by looking only at encoder features within the current block $b$, as shown in Equation 2. As shown in previous work, lengthening the attention window allows NT to approach the performance of LAS, but at the cost of the online nature of the task. However, we can maintain a streaming, online system by also computing attention over previous blocks, since looking backward introduces no additional latency. This is particularly important because we emit graphemes at word boundaries. Furthermore, similar to our other streaming systems, we allow a look-ahead of 150ms (5 30-ms frames) between the input frames and the output prediction. With these changes, the listener becomes:

$$\mathbf{h}_{1 \ldots bW+5} = \mathrm{Listen}(\mathbf{x}_{1 \ldots bW+5}) \qquad (3)$$
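The frame range visible to a given block can be sketched as an index computation; the function name and the default look-back of 20 chunks / look-ahead of 5 frames follow the settings used later in the paper, but the code itself is illustrative:

```python
# Frame window (start, end], in frame indices, over which block b
# (0-indexed) may compute attention: up to `look_back` previous chunks
# plus `look_ahead` frames past the current block boundary.

def attention_window(b, chunk_size, num_frames, look_back=20, look_ahead=5):
    block_end = min((b + 1) * chunk_size, num_frames)
    start = max(0, (b - look_back) * chunk_size)  # look-back over prior chunks
    end = min(block_end + look_ahead, num_frames)  # bounded look-ahead
    return start, end
```

At a 30ms frame rate, the 5-frame look-ahead corresponds to the 150ms stated above.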
3.3 Pre-training with LAS
Attention-based models learn an alignment (represented via an attention vector), jointly with the acoustic model (encoder) and language model (decoder). One hypothesis we have for NT lagging behind LAS is that during training, the attention mechanism is limited in the window over which it can compute attention. This problem is exacerbated by the fact that we emit graphemes only at word boundaries.
However, we can see from attention plots in LAS that once the attention mechanism is learned, it is fairly monotonic. Since NT and LAS are parameterized exactly the same (except for an extra <epsilon> output target), we can train a LAS model with this extra target (which is simply ignored, as it never appears in LAS target sequences) and use it to initialize NT. Our intuition is that because LAS learns a good, relatively monotonic attention mechanism, initializing NT from LAS should keep NT from taking a large accuracy hit relative to LAS.
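The recipe above can be sketched at the vocabulary and checkpoint level: LAS is trained with <epsilon> already present in its output inventory, so the two models have identical parameter shapes and initialization is a plain copy. The dict-of-arrays checkpoint format and function names are illustrative assumptions:

```python
# Shared output inventory and LAS-to-NT weight copy, as a minimal sketch.

EPSILON = "<epsilon>"

def make_shared_vocab(graphemes):
    """Output inventory shared by LAS and NT: graphemes plus <epsilon>.
    LAS never emits <epsilon>, but including it keeps the softmax shapes
    identical between the two models."""
    return list(graphemes) + [EPSILON]

def init_nt_from_las(las_params):
    """NT is parameterized exactly like LAS, so initialization is a copy."""
    return {name: value for name, value in las_params.items()}
```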
3.4 Incorporating a Stronger Language Model
As we make the chunk size smaller, an analysis of the errors suggests that most are language modeling errors. Therefore, we explore whether we can incorporate a stronger LM into the decoding and/or training process.
3.4.1 Wordpiece Models
To increase the memory and linguistic span of the decoder, we emit wordpieces instead of graphemes. In this approach, words are deterministically broken up into sub-word units called wordpieces. For instance, the phrase “Jet makers feud” can be broken up into (“_J”, “et”, “_makers”, “_fe”, “ud”): rare words are broken into sub-units, while common words (e.g., “makers”) are modeled as a single unit. Wordpieces are position-dependent, so we mark the beginning of each word with a special marker “_”. The wordpiece inventory is trained to maximize the likelihood of the training text. Wordpieces achieve a balance between the flexibility of characters and the efficiency of words.
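Segmentation against a fixed inventory can be illustrated with a greedy longest-match sketch over the paper's own example. The tiny hand-written vocabulary stands in for a learned inventory, which in practice is trained to maximize training-text likelihood:

```python
# Greedy longest-match wordpiece segmentation with the "_" word-start marker.

def wordpiece_split(sentence, vocab):
    pieces = []
    for word in sentence.split():
        token = "_" + word            # mark word beginnings
        while token:
            # take the longest vocab entry that prefixes the remainder
            for end in range(len(token), 0, -1):
                if token[:end] in vocab:
                    pieces.append(token[:end])
                    token = token[end:]
                    break
            else:
                return None           # unsegmentable with this vocab
    return pieces
```

Real systems learn the inventory (and may use likelihood-based rather than greedy segmentation), but the marker convention and the sub-word decomposition shown here match the example in the text.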
Sequence-to-sequence models that predict wordpieces have been successful in both machine translation and speech [11, 15]. Because these models are trained to predict wordpieces rather than graphemes, the decoder learns a much stronger LM. We hypothesize that predicting wordpieces will likewise allow us to reduce the chunk size of NT.
3.4.2 Incorporating external LM
Language models have been successfully incorporated into sequence-to-sequence models to guide the beam search toward more likely candidates [16, 17]. In this work, we explore whether incorporating an external LM into the beam search can aid NT. Following an approach similar to [16, 18], we perform a log-linear interpolation, at each step of the beam search, between the seq2seq model and an FST-based LM trained to map graphemes to words, also known as shallow fusion:

$$\mathbf{y}^* = \arg\max_{\mathbf{y}} \; \log P(\mathbf{y} \mid \mathbf{x}) + \lambda \log P_{LM}(\mathbf{y}) + \gamma\, c(\mathbf{x}, \mathbf{y})$$

Here, $\log P(\mathbf{y} \mid \mathbf{x})$ is the score from the LAS (or NT) model, $\log P_{LM}(\mathbf{y})$ is the score from the external LM weighted by an LM weight $\lambda$, and $c(\mathbf{x}, \mathbf{y})$ is a coverage term that promotes longer transcripts, weighted by $\gamma$.
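The shallow-fusion scoring step can be sketched as a simple log-linear combination applied when ranking beam hypotheses. The hypothesis tuple layout, default weights, and scalar coverage value are illustrative assumptions (real coverage terms are computed from the attention weights):

```python
# Log-linear shallow-fusion score: model score + weighted external LM score
# + weighted coverage term, used to rank beam-search hypotheses.

def fusion_score(model_logprob, lm_logprob, coverage,
                 lm_weight=0.1, cov_weight=0.1):
    return model_logprob + lm_weight * lm_logprob + cov_weight * coverage

def rescore_beam(hyps, lm_weight=0.1, cov_weight=0.1):
    """hyps: list of (text, model_logprob, lm_logprob, coverage) tuples.
    Returns the hypothesis with the highest fused score."""
    return max(hyps, key=lambda h: fusion_score(h[1], h[2], h[3],
                                                lm_weight, cov_weight))
```

With `lm_weight = 0` this reduces to ordinary beam ranking by model score; increasing it shifts the choice toward hypotheses the external LM prefers.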
4 Experimental Details
Our experiments are conducted on a 12,500-hour training set consisting of 15 million English utterances. The training utterances are anonymized and hand-transcribed, and are representative of Google’s voice search traffic. The data set is created by artificially corrupting clean utterances using a room simulator, adding varying degrees of noise and reverberation such that the overall SNR is between 0dB and 30dB, with an average SNR of 12dB. The noise sources are drawn from YouTube and recordings of daily-life noisy environments. We report results on a set of 14,800 anonymized, hand-transcribed voice search utterances extracted from Google traffic.
The acoustic features are stacked with 2 frames to the left and downsampled to a 30ms frame rate. The encoder network consists of 5 unidirectional long short-term memory (LSTM) layers, with sizes specified in the results section. Additive attention is used for all experiments. The decoder network is a 2-layer LSTM with 1,024 hidden units per layer. All networks are trained to predict 74 graphemes unless otherwise noted.
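The stacking and downsampling step can be sketched as follows. The assumption of a 10ms input frame rate (so that a 30ms output rate means keeping every third stacked frame) is ours, since the input rate is not restated here:

```python
# Stack each feature frame with `left` frames of left context, then keep
# every `rate`-th frame to reduce the frame rate (e.g., 10ms -> 30ms).

def stack_and_downsample(frames, left=2, rate=3):
    stacked = []
    for i in range(0, len(frames), rate):
        # left context, clamped at the sequence start, oldest first
        ctx = [frames[max(0, i - k)] for k in range(left, -1, -1)]
        stacked.append([v for f in ctx for v in f])  # concatenate vectors
    return stacked
```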
5 Results
5.1 Getting NT To Work Online
5.1.1 Attention Window
Our first set of experiments analyzes the behavior of NT as we vary the window over which attention is computed. For these experiments, we use an encoder with five layers of 768 unidirectional LSTM cells and a decoder with two layers of 768 LSTM cells. As Table 1 shows, when the NT model computes attention only within a chunk of size 10, performance is roughly 25% worse in terms of WER than the LAS model, which differs only in the window over which attention is computed. Allowing the model to also compute attention over the previous 20 chunks slightly improves the NT system. Finally, allowing a 5-frame look-ahead² improves results further, but NT remains roughly 13% relative worse than LAS. Since the proposed changes improve performance, all subsequent NT results in this paper use a look-back of 20 chunks and a look-ahead of 5 frames.

²Note that a 5-frame look-ahead with a chunk size of 10 is not the same as a 15-frame window: the look-ahead is relative to the end of the chunk boundary, and all other frames used to compute attention occur before that boundary.
Table 1: WER for NT with varying attention windows.
| System | Chunk | WER |
| NT, attention within chunk | 10 | 14.6 |
| NT, + look back | 10 | 14.4 |
| NT, + look back + look ahead | 10 | 13.2 |
5.1.2 Initialization from LAS, Single-head attention
Next, we analyze the behavior of NT, for both a chunk size of 5 and 10, when we pretrain with LAS. For these experiments, we compare two different encoder/decoder sizes. Table 2 shows that when NT is pre-trained with LAS, at a chunk size of 10 (i.e., 450 ms latency) we can match the performance of LAS. However, a chunk size of 5 (300ms latency), which is our requirement for allowed streaming delay, still lags behind LAS by 3% relative for the larger model.
5.1.3 Initialization from LAS, Multi-head attention
Table 3: Example errors, NT vs. LAS.
| LAS, MHA | NT-Ch5, MHA | NT-Ch5, MHA, WPM |
| school closing in parma for tomorrow | what closing in parma for tomorrow | school closing in parma for tomorrow |
| how to multiply two numbers with decimals | how to multiply two numbers with this most | how to multiply two numbers with decimals |
| how far is it from albuquerque new mexico to fountain hills arizona | how far is it from albuquerque new mexico to to fountain hills arizona | how far is it from albuquerque new mexico to fountain hills arizona |
| is under the arm warmer or colder than in mouth temperature | is under the arm warmer or colder than a mouse temperature | is under the arm warmer or colder than in mouth temperature |
Next, we compare the behavior of LAS vs. NT when the systems use multi-head attention (MHA), which has been shown to give state-of-the-art ASR performance for LAS. The MHA model uses a 5x1400 encoder with 4 attention heads and a 2x1024 decoder. Table 4 shows that NT does not improve when moving from single- to multi-head attention, even though LAS does. One hypothesis is that multi-head attention attends to multiple points in the encoder space, including some that come after the current prediction, which streaming models such as NT must ignore.
Table 4: Single- vs. multi-head attention.
| System | Chunk | Single Attention WER | MHA WER |
To understand the performance gap between NT and LAS, we analyzed sentences that LAS recognized correctly and NT did not, shown in the first two columns of Table 3 (“LAS-MHA” and “NT-Ch5,MHA”). The table shows that many of the NT errors are language modeling errors. In the next section, we examine a few simple ways of incorporating an LM into the system.
5.2 Incorporating the LM
5.2.1 Wordpiece Models
Our next set of results examines incorporating wordpieces into the LAS and NT models, which strengthens the LM on the decoder side. For these experiments, we use 32,000 wordpieces. Table 5 shows that with wordpieces, the NT and LAS models are much closer in performance than with graphemes. In addition, there is very little difference between NT with a chunk size of 5 and 10. One hypothesis is that because wordpieces are longer units, each context vector attended to by the neural transducer corresponds to a much longer sub-word unit (potentially a full word) than in the NT grapheme MHA system; the MHA WPM model therefore feeds a much stronger set of context vectors to the decoder. This can also be observed in the attention plots for the grapheme vs. wordpiece systems in Figure 2: the attention vectors for wordpieces span a much longer left-context window than those for graphemes.
5.2.2 WPM + LM
Finally, we investigate incorporating an external LM into the MHA+WPM LAS and NT models. In these experiments, an n-gram FST LM trained on 32K wordpieces is used. This LM is trained on 1 billion text queries, a much larger corpus than the 15 million utterances seen by the LAS/NT models. Table 6 shows that the FST LM gives no additional improvement for either NT or LAS. It has been observed that the perplexity of a wordpiece RNN-LM is much lower than that of a wordpiece FST. Since the decoders of the LAS and NT models are themselves RNN-LMs, there may simply be nothing more to gain from the wordpiece FST. In the future, we will repeat this experiment with a wordpiece RNN-LM trained on text data.
Table 6: Effect of the external LM.
| System | Chunk | No LM | With LM |
Finally, note that after including both the wordpiece model and the external LM, the last column of Table 3 (“NT-Ch5,MHA,WPM”) shows that many of the earlier errors are fixed and now match the LAS hypotheses. With the proposed LM improvements, NT with a chunk size of 5 performs comparably to LAS while meeting the allowable delay of 300ms.
6 Conclusions
In this paper, we presented various improvements to NT. Specifically, we showed that performance improves by increasing the attention window and by pre-training NT with LAS. With these improvements, a single-head NT model comes very close to the performance of LAS, while a multi-head NT model still degrades relative to LAS. By incorporating a stronger LM through wordpieces, multi-head NT effectively matches the performance of LAS.
The authors would like to thank Navdeep Jaitly, Michiel Bacchiani and Gabor Simko for helpful discussions.
-  J. K. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio, “Attention-Based Models for Speech Recognition,” in Proc. NIPS, 2015.
-  W. Chan, N. Jaitly, Q. V. Le, and O. Vinyals, “Listen, attend and spell,” CoRR, vol. abs/1508.01211, 2015.
-  N. Jaitly, D. Sussillo, Q. V. Le, O. Vinyals, I. Sutskever, and S. Bengio, “An Online Sequence-to-sequence Model Using Partial Conditioning,” in Proc. NIPS, 2016.
-  R. Prabhavalkar, T. N. Sainath, B. Li, K. Rao, and N. Jaitly, “An Analysis of “Attention” in Sequence-to-Sequence Models,” in Proc. Interspeech, 2017.
-  R. Prabhavalkar, K. Rao, T. N. Sainath, B. Li, L. Johnson, and N. Jaitly, “A Comparison of Sequence-to-sequence Models for Speech Recognition,” in Proc. Interspeech, 2017.
-  A. Graves, S. Fernandez, F. Gomez, and J. Schmidhuber, “Connectionist Temporal Classification: Labeling Unsegmented Sequence Data with Recurrent Neural Networks,” in Proc. ICML, 2006.
-  A. Graves, “Sequence transduction with recurrent neural networks,” CoRR, vol. abs/1211.3711, 2012.
-  M. Shannon, G. Simko, S. Chan, and C. Parada, “Improved End-of-Query Detection for Streaming Speech Recognition.”
-  E. Battenberg, J. Chen, R. Child, A. Coates, et al., “Exploring Neural Transducers for End-to-End Speech Recognition,” in Proc. ASRU, 2017.
-  M. Schuster and K. Nakajima, “Japanese and Korean Voice Search,” in Proc. ICASSP, 2012.
-  K. Rao, R. Prabhavalkar, and H. Sak, “Exploring Architectures, Data and Units for Streaming End-to-End Speech Recognition with RNN-Transducer,” in Proc. ASRU, 2017.
-  J. K. Chorowski and N. Jaitly, “Towards Better Decoding and Language Model Integration in Sequence to Sequence Models,” in Proc. Interspeech, 2017.
-  G. Pundak and T. N. Sainath, “Lower Frame Rate Neural Network Acoustic Models,” in Proc. Interspeech, 2016.
-  Y. Wu, M. Schuster, et al., “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation,” CoRR, vol. abs/1609.08144, 2016.
-  W. Chan, Y. Zhang, Q. V. Le, and N. Jaitly, “Latent Sequence Decompositions,” in Proc. ICLR, 2017.
-  A. Sriram, H. Jun, S. Satheesh, and A. Coates, “Cold fusion: Training seq2seq models together with language models,” CoRR, vol. abs/1708.06426, 2017.
-  A. Kannan, Y. Wu, P. Nguyen, T. N. Sainath, Z. Chen, and R. Prabhavalkar, “An analysis of incorporating an external language model into a sequence-to-sequence model,” in submitted to Proc. ICASSP, 2018.
-  C. Kim, A. Misra, K. Chin, T. Hughes, A. Narayanan, T. N. Sainath, and M. Bacchiani, “Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep Neural Networks for Far-Field Speech Recognition in Google Home,” in Proc. Interspeech, 2017.
-  H. Sak, A. Senior, K. Rao, and F. Beaufays, “Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition,” in Proc. Interspeech, 2015.
-  S. Hochreiter and J. Schmidhuber, “Long Short-Term Memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, Nov 1997.
-  D. Bahdanau, K. Cho, and Y. Bengio, “Neural Machine Translation by Jointly Learning to Align and Translate,” CoRR, vol. abs/1409.0473, 2014.
-  J. Dean, G.S. Corrado, R. Monga, K. Chen, M. Devin, Q.V. Le, M.Z. Mao, M. Ranzato, A. Senior, P. Tucker, K. Yang, and A.Y. Ng, “Large Scale Distributed Deep Networks,” in Proc. NIPS, 2012.
-  D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. of ICLR, 2015.
-  M. Abadi et al., “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems,” available online: http://download.tensorflow.org/paper/whitepaper2015.pdf, 2015.
-  A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention Is All You Need,” CoRR, vol. abs/1706.03762, 2017.
-  C. Chen, T. N. Sainath, Y. Wu, R. Prabhavalkar, P. Nguyen, Z. Chen, A. Kannan, R. J. Weiss, K. Rao, N. Jaitly, B. Li, and J. Chorowski, “State-of-the-art speech recognition with sequence-to-sequence models,” in submitted to Proc. ICASSP, 2018.