Sequence-to-Sequence Learning via Attention Transfer for Incremental Speech Recognition

11/04/2020
by Sashi Novitasari, et al.

Attention-based sequence-to-sequence automatic speech recognition (ASR) incurs a significant delay on long utterances because the output is generated only after the entire input sequence has been received. Although several studies have recently proposed sequence mechanisms for incremental speech recognition (ISR), they rely on different frameworks and learning algorithms that are more complicated than the standard ASR model. One main reason is that the model needs to decide the incremental steps and learn the transcription that aligns with the current short speech segment. In this work, we investigate whether it is possible to employ the original architecture of attention-based ASR for ISR tasks by treating the full-utterance ASR as the teacher model and the ISR as the student model. We design an alternative student network that, instead of using a thinner or shallower model, keeps the original architecture of the teacher model but operates on shorter sequences (a few encoder and decoder states). Using attention transfer, the student network learns to mimic the same alignment between the current short input speech segments and the transcription. Our experiments show that by delaying the start of the recognition process by about 1.7 seconds, we can achieve performance comparable to a model that must wait until the end of the utterance.
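The abstract describes the attention-transfer objective only in prose. As a rough, non-authoritative sketch (assuming a PyTorch-style setup; the function names, the models' return signatures, and the loss weighting below are illustrative assumptions, not the authors' code), the student's attention over a short segment could be trained to match the teacher's full-utterance attention restricted to that segment, alongside the usual cross-entropy transcription loss:

import torch
import torch.nn.functional as F

def attention_transfer_loss(student_attn, teacher_attn, eps=1e-8):
    # student_attn, teacher_attn: (batch, decoder_steps, encoder_frames)
    # Renormalize the teacher's attention slice so each decoder step
    # again sums to one over the segment's encoder frames.
    teacher_attn = teacher_attn / (teacher_attn.sum(dim=-1, keepdim=True) + eps)
    # KL(teacher || student): the student is pushed to reproduce the
    # teacher's alignment between speech frames and output tokens.
    return F.kl_div((student_attn + eps).log(), teacher_attn, reduction="batchmean")

def isr_training_step(student, teacher, segment_feats, segment_tokens, at_weight=1.0):
    # Hypothetical interface: both models return (logits, attention_weights).
    # The teacher runs without gradients; its attention is assumed to be
    # already restricted to the current short segment.
    with torch.no_grad():
        _, teacher_attn = teacher(segment_feats)
    logits, student_attn = student(segment_feats)  # (batch, steps, vocab), (batch, steps, frames)
    ce = F.cross_entropy(logits.transpose(1, 2), segment_tokens)
    at = attention_transfer_loss(student_attn, teacher_attn)
    return ce + at_weight * at

The paper may well use a different distance between attention distributions (e.g. a mean-squared error rather than KL divergence); the sketch only illustrates the teacher-student attention-transfer idea.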


