Intermediate-layer output Regularization for Attention-based Speech Recognition with Shared Decoder

07/09/2022
by Jicheng Zhang, et al.

Intermediate layer output (ILO) regularization via multitask training on the encoder side has been shown to be an effective way of improving results across a wide range of end-to-end ASR frameworks. In this paper, we propose a novel way to perform ILO-regularized training. Instead of using conventional multitask methods, which entail additional training overhead, we feed the intermediate layer output directly to the decoder: during training, the decoder accepts not only the output of the final encoder layer but also the encoder's intermediate layer output as input. With the proposed method, both the encoder and the decoder are simultaneously "regularized", so the network is trained more thoroughly, consistently yielding improved results over the ILO-based CTC method as well as over the original attention-based model trained without the proposed method.
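The training setup described in the abstract lends itself to a compact illustration. Below is a minimal PyTorch sketch of the idea, assuming a Transformer encoder-decoder: one shared decoder is run against both an intermediate encoder layer's output and the final encoder layer's output, and the two cross-entropy losses are interpolated. The class name SharedDecoderILOModel, the choice of ilo_layer, ilo_weight, and all dimensions are illustrative assumptions, not the authors' actual configuration.

```python
import torch
import torch.nn as nn


class SharedDecoderILOModel(nn.Module):
    def __init__(self, num_enc_layers=12, ilo_layer=6, d_model=256, nhead=4,
                 feat_dim=80, vocab_size=5000):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)      # acoustic feature projection
        self.enc_layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            for _ in range(num_enc_layers))
        self.ilo_layer = ilo_layer                      # encoder layer whose output is tapped
        self.tok_embed = nn.Embedding(vocab_size, d_model)
        self.decoder = nn.TransformerDecoder(           # the single, shared decoder
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), num_layers=6)
        self.out_proj = nn.Linear(d_model, vocab_size)
        self.ce = nn.CrossEntropyLoss(ignore_index=-1)

    def decode_loss(self, enc_out, ys_in, ys_out):
        # Run the shared decoder against one encoder representation and score it.
        tgt = self.tok_embed(ys_in)
        causal = torch.triu(torch.full((tgt.size(1), tgt.size(1)), float("-inf")), 1)
        dec = self.decoder(tgt, enc_out, tgt_mask=causal)
        return self.ce(self.out_proj(dec).transpose(1, 2), ys_out)

    def forward(self, feats, ys_in, ys_out, ilo_weight=0.3):
        x = self.embed(feats)
        ilo_out = None
        for i, layer in enumerate(self.enc_layers, start=1):
            x = layer(x)
            if i == self.ilo_layer:
                ilo_out = x                             # intermediate-layer output (ILO)
        # The same decoder sees both the final and the intermediate encoder output,
        # so encoder and decoder are "regularized" at the same time during training.
        return ((1 - ilo_weight) * self.decode_loss(x, ys_in, ys_out)
                + ilo_weight * self.decode_loss(ilo_out, ys_in, ys_out))
```

With dummy tensors, feats of shape (batch, frames, 80) and ys_in/ys_out of shape (batch, tokens), calling model(feats, ys_in, ys_out).backward() trains both branches at once; at inference time one would decode only from the final encoder output.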

Related research

04/05/2017  Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition
End-to-end training of deep learning-based models allows for implicit le...

07/25/2022  Effective and Interpretable Information Aggregation with Capacity Networks
How to aggregate information from multiple instances is a key question m...

04/22/2018  Multi-Head Decoder for End-to-End Speech Recognition
This paper presents a new network architecture called multi-head decoder...

06/15/2020  Regularized Forward-Backward Decoder for Attention Models
Nowadays, attention models are one of the popular candidates for speech ...

11/02/2020  Focus on the present: a regularization method for the ASR source-target attention layer
This paper introduces a novel method to diagnose the source-target atten...

11/09/2019  Speaker Adaptation for Attention-Based End-to-End Speech Recognition
We propose three regularization-based speaker adaptation approaches to a...

04/01/2022  InterAug: Augmenting Noisy Intermediate Predictions for CTC-based ASR
This paper proposes InterAug: a novel training method for CTC-based ASR ...
