Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300

01/20/2020
by Zoltán Tüske, et al.

It is generally believed that direct sequence-to-sequence (seq2seq) speech recognition models are competitive with hybrid models only when a large amount of data, at least a thousand hours, is available for training. In this paper, we show that state-of-the-art recognition performance can be achieved on the Switchboard-300 database using a single-headed attention, LSTM-based model. Using a cross-utterance language model, our single-pass speaker-independent system reaches 6.4% and 12.5% WER on the Switchboard and CallHome subsets of Hub5'00, without a pronunciation lexicon. While careful regularization and data augmentation are crucial in achieving this level of performance, experiments on Switchboard-2000 show that nothing is more useful than more data.
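The "single headed attention" of the title refers to an attention mechanism that computes one set of alignment weights per decoding step, in contrast to multi-head Transformer attention. The sketch below is a minimal PyTorch illustration of such a mechanism (additive, Bahdanau-style attention over LSTM encoder states); the class, the dimension names, and the additive scoring function are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of single-headed additive (Bahdanau-style) attention for an
# LSTM encoder-decoder. All names and sizes are illustrative assumptions,
# not the paper's actual code.
import torch
import torch.nn as nn

class SingleHeadAttention(nn.Module):
    def __init__(self, enc_dim: int, dec_dim: int, attn_dim: int):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, attn_dim)  # project encoder states
        self.dec_proj = nn.Linear(dec_dim, attn_dim)  # project decoder state
        self.score = nn.Linear(attn_dim, 1)           # scalar energy per frame

    def forward(self, enc_out: torch.Tensor, dec_state: torch.Tensor):
        # enc_out: (batch, time, enc_dim); dec_state: (batch, dec_dim)
        energy = self.score(torch.tanh(
            self.enc_proj(enc_out) + self.dec_proj(dec_state).unsqueeze(1)
        )).squeeze(-1)                                # (batch, time)
        weights = torch.softmax(energy, dim=-1)       # the single attention head
        context = torch.bmm(weights.unsqueeze(1), enc_out).squeeze(1)
        return context, weights

# Example usage with made-up sizes: 8 utterances, 100 encoder frames.
attn = SingleHeadAttention(enc_dim=512, dec_dim=512, attn_dim=256)
context, weights = attn(torch.randn(8, 100, 512), torch.randn(8, 512))
```

At each decoder step, the single softmax over encoder time frames yields the alignment, and the context vector, a weighted sum of encoder states, conditions the prediction of the next output token.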

research · 04/28/2018
Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese
Sequence-to-sequence attention-based models have recently shown very pro...

research · 10/29/2019
Improving sequence-to-sequence speech recognition training with on-the-fly data augmentation
Sequence-to-Sequence (S2S) models recently started to show state-of-the-...

research · 12/08/2016
Towards better decoding and language model integration in sequence to sequence models
The recently proposed Sequence-to-Sequence (seq2seq) framework advocates...

research · 09/28/2018
Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture
Recent works in speech recognition rely either on connectionist temporal...

research · 04/08/2021
CLVSA: A Convolutional LSTM Based Variational Sequence-to-Sequence Model with Attention for Predicting Trends of Financial Markets
Financial markets are a complex dynamical system. The complexity comes f...

research · 05/03/2021
On the limit of English conversational speech recognition
In our previous work we demonstrated that a single headed attention enco...

research · 02/05/2019
Model Unit Exploration for Sequence-to-Sequence Speech Recognition
We evaluate attention-based encoder-decoder models along two dimensions:...
