TSNAT: Two-Step Non-Autoregressive Transformer Models for Speech Recognition

04/04/2021
by Zhengkun Tian, et al.

Autoregressive (AR) models, such as attention-based encoder-decoder models and the RNN-Transducer, have achieved great success in speech recognition. They predict each output token conditioned on the previous tokens and the acoustic encoded states, which is inefficient on GPUs because decoding must proceed one token at a time. Non-autoregressive (NAR) models remove the temporal dependency between output tokens and can predict the entire output sequence in as few as one step. However, NAR models still face two major problems. First, a large performance gap remains between NAR models and advanced AR models. Second, most NAR models are difficult to train and to converge. To address these two problems, we propose a new model named the two-step non-autoregressive transformer (TSNAT), which improves the performance and accelerates the convergence of the NAR model by learning prior knowledge from a parameter-sharing AR model. Furthermore, we introduce a two-stage method into the inference process, which greatly improves model performance. All experiments are conducted on AISHELL-1, a public Mandarin Chinese dataset. The results show that TSNAT achieves performance competitive with the AR model and outperforms many complicated NAR models.
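
The difference in decoding cost between AR and NAR models can be illustrated with a minimal toy sketch. This is not the TSNAT model: `toy_decoder`, the token ids, and the arithmetic inside it are hypothetical stand-ins chosen only to show that AR decoding needs one decoder call per output token, while NAR decoding fills every position in a single call.

```python
VOCAB_SIZE = 5  # toy vocabulary size (assumption for illustration)
START = 1       # hypothetical start-of-sequence token id
MASK = 2        # hypothetical placeholder token id for NAR input


def toy_decoder(encoded, tokens):
    """Hypothetical stand-in for a transformer decoder.

    Returns one predicted token id per input position. The arithmetic is
    arbitrary; it only makes the output depend on the encoder states,
    the input token, and the position.
    """
    return [(sum(encoded) + 3 * tok + pos) % VOCAB_SIZE
            for pos, tok in enumerate(tokens)]


def ar_decode(encoded, length):
    """Autoregressive decoding: each new token depends on the previous ones,
    so the decoder is called once per output token."""
    tokens = [START]
    calls = 0
    for _ in range(length):
        next_token = toy_decoder(encoded, tokens)[-1]
        calls += 1
        tokens.append(next_token)
    return tokens[1:], calls


def nar_decode(encoded, length):
    """Non-autoregressive decoding: all positions are predicted in parallel
    from placeholder inputs, so the decoder is called once."""
    placeholders = [MASK] * length
    return toy_decoder(encoded, placeholders), 1


if __name__ == "__main__":
    encoded_states = [1, 2, 3]  # toy "acoustic encoded states"
    ar_tokens, ar_calls = ar_decode(encoded_states, 4)
    nar_tokens, nar_calls = nar_decode(encoded_states, 4)
    print("AR decoder calls:", ar_calls)
    print("NAR decoder calls:", nar_calls)
```

In this sketch the number of AR decoder calls grows with the output length, whereas the NAR path always uses one call; the sequential dependency in `ar_decode` is what limits parallelism on GPUs.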

research
04/07/2021

FSR: Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization

Transducer-based models, such as RNN-Transducer and transformer-transduc...
research
06/16/2022

Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition

Transformers have recently dominated the ASR field. Although able to yie...
research
10/28/2020

Non-Autoregressive Transformer ASR with CTC-Enhanced Decoder Input

Non-autoregressive (NAR) transformer models have achieved significantly ...
research
07/27/2023

MIM-OOD: Generative Masked Image Modelling for Out-of-Distribution Detection in Medical Images

Unsupervised Out-of-Distribution (OOD) detection consists in identifying...
research
06/06/2015

Data-Driven Learning of the Number of States in Multi-State Autoregressive Models

In this work, we consider the class of multi-state autoregressive proces...
research
03/18/2019

Autoregressive Models for Sequences of Graphs

This paper proposes an autoregressive (AR) model for sequences of graphs...
research
09/14/2023

AAS-VC: On the Generalization Ability of Automatic Alignment Search based Non-autoregressive Sequence-to-sequence Voice Conversion

Non-autoregressive (non-AR) sequence-to-sequence (seq2seq) models for vo...
