Integrating Source-channel and Attention-based Sequence-to-sequence Models for Speech Recognition

09/14/2019
by   Qiujia Li, et al.

This paper proposes a novel automatic speech recognition (ASR) framework called Integrated Source-Channel and Attention (ISCA), which combines the advantages of traditional systems based on the noisy source-channel (SC) model with end-to-end systems using attention-based sequence-to-sequence models. The traditional SC framework comprises hidden Markov model and connectionist temporal classification (CTC) based acoustic models, language models (LMs), and a lexicon-based decoding procedure, whereas the end-to-end attention-based system models the whole process jointly with a single model. By rescoring the hypotheses produced by the traditional system with the end-to-end system under an extended noisy source-channel model, ISCA allows structured knowledge to be incorporated easily via the SC-based model while exploiting the complementarity of the attention-based model. Experiments on the AMI meeting corpus show that ISCA achieves a relative word error rate reduction of up to 21% over an alternative method that also combines CTC and attention-based models.
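The rescoring step described above can be sketched as a log-linear combination of the two systems' scores over an N-best list. The sketch below is illustrative, not the paper's exact formulation: the field names, the interpolation weight `lam`, and the example scores are all assumptions for demonstration.

```python
def isca_rescore(nbest, lam=0.5):
    """Rescore N-best hypotheses from the source-channel (SC) system by
    log-linearly combining the SC log-probability with the attention
    model's log-probability, then re-ranking by the combined score."""
    rescored = [
        (hyp["sc_logp"] + lam * hyp["attn_logp"], hyp["text"])
        for hyp in nbest
    ]
    # Higher combined log-probability is better.
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return rescored

# Toy N-best list with made-up scores: the attention model disagrees
# with the SC system's top hypothesis and flips the ranking.
nbest = [
    {"text": "hollow world", "sc_logp": -4.0, "attn_logp": -6.5},
    {"text": "hello world", "sc_logp": -4.2, "attn_logp": -3.1},
]
best_score, best_text = isca_rescore(nbest)[0]  # → "hello world"
```

In practice the interpolation weight would be tuned on a development set, and the combination may also include LM and coverage terms; the sketch only shows the core idea of rescoring SC hypotheses with the attention model.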


Related research:

- Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition (10/22/2020): For various speech-related tasks, confidence scores from a speech recogn...
- Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models (12/05/2017): Sequence-to-sequence models, such as attention-based models in automatic...
- Streaming Attention-Based Models with Augmented Memory for End-to-End Speech Recognition (11/03/2020): Attention-based models have been gaining popularity recently for their s...
- Towards better decoding and language model integration in sequence to sequence models (12/08/2016): The recently proposed Sequence-to-Sequence (seq2seq) framework advocates...
- Supervised Attention in Sequence-to-Sequence Models for Speech Recognition (04/25/2022): Attention mechanism in sequence-to-sequence models is designed to model ...
- Model Unit Exploration for Sequence-to-Sequence Speech Recognition (02/05/2019): We evaluate attention-based encoder-decoder models along two dimensions:...
- Promising Accurate Prefix Boosting for sequence-to-sequence ASR (11/07/2018): In this paper, we present promising accurate prefix boosting (PAPB), a d...
