A Conformer Based Acoustic Model for Robust Automatic Speech Recognition

03/01/2022
by Yufeng Yang, et al.

This study addresses robust automatic speech recognition (ASR) by introducing a Conformer-based acoustic model. The proposed model builds on a state-of-the-art recognition system that uses a bi-directional long short-term memory (BLSTM) model with utterance-wise dropout and iterative speaker adaptation, but replaces the BLSTM network with a Conformer encoder. The Conformer encoder uses a convolution-augmented attention mechanism for acoustic modeling. The proposed system is evaluated on the monaural ASR task of the CHiME-4 corpus. Coupled with utterance-wise normalization and speaker adaptation, our model achieves a word error rate of 6.25%, a relative improvement of 8.4% over the previous best system. In addition, the proposed Conformer-based model is 18.3% smaller in model size and reduces total training time by 79.6%.
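
To make the "relative" figure concrete, the following minimal sketch works through the arithmetic behind the reported numbers. The function names are hypothetical, and the baseline word error rate is back-computed from the two figures quoted in the abstract (6.25% and 8.4% relative); the abstract itself does not state it.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Relative reduction of `new` with respect to `baseline`, as a fraction."""
    return (baseline - new) / baseline


def implied_baseline(new: float, rel_reduction: float) -> float:
    """Baseline value implied by `new` and a stated relative reduction."""
    return new / (1.0 - rel_reduction)


# Figures quoted in the abstract.
proposed_wer = 6.25      # word error rate, in percent
stated_rel_gain = 0.084  # 8.4% relative improvement

# Back-computed assumption, NOT a number given in the abstract:
# the previous-best WER that these two figures imply.
baseline_wer = implied_baseline(proposed_wer, stated_rel_gain)

print(f"implied baseline WER: {baseline_wer:.2f}%")               # 6.82%
print(f"absolute reduction: {baseline_wer - proposed_wer:.2f} points")  # 0.57 points
```

Under these assumptions, an 8.4% relative improvement corresponds to roughly 0.57 percentage points of absolute word error rate.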

research
06/14/2019

Cumulative Adaptation for BLSTM Acoustic Models

This paper addresses the robust speech recognition problem as an adaptat...
research
04/21/2022

Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition

Accent variability has posed a huge challenge to automatic speech recogn...
research
01/01/2020

Attentive batch normalization for lstm-based acoustic modeling of speech recognition

Batch normalization (BN) is an effective method to accelerate model trai...
research
11/04/2020

Frustratingly Easy Noise-aware Training of Acoustic Models

Environmental noises and reverberation have a detrimental effect on the ...
research
01/02/2020

Attention based on-device streaming speech recognition with large speech corpus

In this paper, we present a new on-device automatic speech recognition (...
research
11/20/2020

Improving RNN-T ASR Accuracy Using Untranscribed Context Audio

We present a new training scheme for streaming automatic speech recognit...
research
08/19/2020

Cross-Utterance Language Models with Acoustic Error Sampling

The effective exploitation of richer contextual information in language ...
