Transformer with Bidirectional Decoder for Speech Recognition

08/11/2020
by   Xi Chen, et al.

Attention-based models have recently made substantial progress on end-to-end automatic speech recognition (ASR). However, conventional transformer-based approaches usually generate the output sequence token by token from left to right, leaving the right-to-left context unexploited. In this work, we introduce a bidirectional speech transformer that uses both directional contexts simultaneously. Specifically, the outputs of our proposed transformer include a left-to-right target and a right-to-left target. At inference time, we use a bidirectional beam search method that generates both left-to-right and right-to-left candidates and selects the best hypothesis by score. To evaluate the proposed speech transformer with a bidirectional decoder (STBD), we conduct extensive experiments on the AISHELL-1 dataset. The results show that STBD achieves a 3.6% relative CER reduction (CERR) over the unidirectional speech transformer baseline. In addition, the strongest model in this paper, STBD-Big, achieves a 6.64% CER on the test set without language model rescoring or any extra data augmentation strategies.
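The final selection step of the bidirectional beam search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how hypotheses are scored, so the sketch assumes each candidate carries a single score where higher is better (for example a log-probability), and the names `Hypothesis` and `select_best_hypothesis` are hypothetical. Right-to-left candidates are reversed so that every candidate is compared in reading order.

```python
from typing import List, Tuple

# Assumption: a hypothesis is a token list plus one score, higher is better.
Hypothesis = Tuple[List[str], float]


def select_best_hypothesis(
    l2r: List[Hypothesis], r2l: List[Hypothesis]
) -> Tuple[List[str], float, str]:
    """Pick the highest-scoring candidate from both decoding directions.

    Returns (tokens in reading order, score, direction label).
    """
    # Left-to-right candidates are already in reading order.
    candidates = [(list(tokens), score, "l2r") for tokens, score in l2r]
    # Right-to-left candidates were generated backwards, so reverse them.
    candidates += [(list(reversed(tokens)), score, "r2l") for tokens, score in r2l]
    if not candidates:
        raise ValueError("no candidates from either direction")
    # Compare on the score only; ties keep the earlier (left-to-right) candidate.
    return max(candidates, key=lambda c: c[1])


if __name__ == "__main__":
    l2r_beam = [(["ni", "hao"], -1.2), (["ni", "hou"], -2.5)]
    r2l_beam = [(["hao", "ni"], -0.8)]
    print(select_best_hypothesis(l2r_beam, r2l_beam))
```

In this toy input the right-to-left candidate has the higher score, so it is returned after being reversed into reading order. A real system would likely also need a policy for comparing scores across directions, such as length normalization, which the abstract does not describe.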

research
09/14/2021

Non-autoregressive Transformer with Unified Bidirectional Decoder for Automatic Speech Recognition

Non-autoregressive (NAR) transformer models have been studied intensivel...
research
05/21/2023

A Framework for Bidirectional Decoding: Case Study in Morphological Inflection

Transformer-based encoder-decoder models that generate outputs in a left...
research
10/07/2021

Beam Search with Bidirectional Strategies for Neural Response Generation

Sequence-to-sequence neural networks have been widely used in language-b...
research
10/08/2020

Masked ELMo: An evolution of ELMo towards fully contextual RNN language models

This paper presents Masked ELMo, a new RNN-based model for language mode...
research
08/28/2019

Solving Math Word Problems with Double-Decoder Transformer

This paper proposes a Transformer-based model to generate equations for ...
research
01/06/2022

Compact Bidirectional Transformer for Image Captioning

Most current image captioning models typically generate captions from le...
research
08/09/2020

Distilling the Knowledge of BERT for Sequence-to-Sequence ASR

Attention-based sequence-to-sequence (seq2seq) models have achieved prom...
