RNN-T models are widely used in ASR, which rely on the RNN-T loss to ach...
The local and global features are both essential for automatic speech
re...
Attention-based encoder-decoder (AED) models have shown impressive
perfo...
Non-autoregressive (NAR) transformer models have been studied intensivel...