Non-Autoregressive ASR with Self-Conditioned Folded Encoders

02/17/2022
by Tatsuya Komatsu, et al.

This paper proposes CTC-based non-autoregressive ASR with self-conditioned folded encoders. The proposed method realizes non-autoregressive ASR with fewer parameters by folding the conventional stack of encoders into only two blocks: base encoders and folded encoders. The base encoders convert the input audio features into a neural representation suitable for recognition. This is followed by the folded encoders, which are applied repeatedly for further refinement. Applying the CTC loss to the outputs of all encoders enforces the consistency of the input-output relationship, so the folded encoders learn to perform the same operations as an encoder with deeper, distinct layers. In experiments, we investigate how to set the number of layers and the number of iterations for the base and folded encoders. The results show that the proposed method achieves performance comparable to that of the conventional method with only 38% as many parameters, and outperforms it when the number of iterations is increased.
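The mechanism described above can be made concrete with a short sketch. The snippet below is a minimal PyTorch illustration under assumed names and sizes (EncoderBlock, SelfConditionedFoldedEncoder, n_iterations, and the projection layers are placeholders, not the authors' implementation): a base block runs once, a single shared folded block is applied repeatedly, a CTC projection is taken after every pass, and the softmax of each intermediate prediction is fed back into the next pass as self-conditioning.

import torch.nn as nn


class EncoderBlock(nn.Module):
    # Stand-in for a Conformer/Transformer encoder stack (placeholder).
    def __init__(self, d_model: int, n_layers: int):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=4 * d_model,
            batch_first=True)
        self.layers = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x):
        return self.layers(x)


class SelfConditionedFoldedEncoder(nn.Module):
    # Hypothetical module: only two blocks (base + folded), folded block reused.
    def __init__(self, d_model=256, vocab_size=100,
                 n_base_layers=2, n_folded_layers=2, n_iterations=8):
        super().__init__()
        self.base = EncoderBlock(d_model, n_base_layers)       # base encoders
        self.folded = EncoderBlock(d_model, n_folded_layers)   # shared folded encoders
        self.ctc_out = nn.Linear(d_model, vocab_size)          # CTC projection
        self.cond_in = nn.Linear(vocab_size, d_model)          # self-conditioning
        self.n_iterations = n_iterations

    def forward(self, feats):
        # feats: (batch, time, d_model) acoustic features from a frontend.
        x = self.base(feats)
        logits_per_step = [self.ctc_out(x)]  # CTC loss on the base output too
        for _ in range(self.n_iterations):
            # Reuse the same folded block each iteration; applying the CTC loss
            # to every output keeps the input-output relationship consistent,
            # so the shared weights behave like a deeper stack of distinct layers.
            cond = self.cond_in(logits_per_step[-1].softmax(dim=-1))
            x = self.folded(x + cond)
            logits_per_step.append(self.ctc_out(x))
        return logits_per_step

In training, a CTC loss would be computed on every element of logits_per_step and averaged; at inference only the final output is decoded. The layer counts and the iteration count above are arbitrary placeholders for the configurations the paper actually studies.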

research 10/27/2020
Cascaded encoders for unifying streaming and non-streaming ASR
End-to-end (E2E) automatic speech recognition (ASR) models, by now, have...

research 03/29/2022
Streaming parallel transducer beam search with fast-slow cascaded encoders
Streaming ASR with strict latency constraints is required in many speech...

research 10/10/2021
Multi-Channel End-to-End Neural Diarization with Distributed Microphones
Recent progress on end-to-end neural diarization (EEND) has enabled over...

research 09/08/2022
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM
Connectionist temporal classification (CTC)-based models are attractive...

research 04/01/2022
InterAug: Augmenting Noisy Intermediate Predictions for CTC-based ASR
This paper proposes InterAug: a novel training method for CTC-based ASR ...

research 09/01/2020
Object Detection-Based Variable Quantization Processing
In this paper, we propose a preprocessing method for conventional image ...

research 07/17/2023
A benchmark of categorical encoders for binary classification
Categorical encoders transform categorical features into numerical repre...
