Hierarchical Multi Task Learning With CTC

07/18/2018
by Ramon Sanabria, et al.

In Automatic Speech Recognition, it is still challenging to learn useful intermediate representations when using high-level (or abstract) target units such as words. Character- or phoneme-based systems tend to outperform word-based systems when only a few hundred hours of training data are available. In this paper, we show how hierarchical multi-task training can encourage the formation of useful intermediate representations. We achieve this by performing Connectionist Temporal Classification (CTC) at different levels of the network with targets of different granularity. Our model thus makes predictions at multiple scales of granularity for the same input. On the standard 300h Switchboard training setup, our hierarchical multi-task architecture exhibits improvements over single-task architectures with the same number of parameters. Our model obtains a 14.0% word error rate on the Eval2000 Switchboard subset without any decoder or language model, outperforming the current state of the art among acoustic-to-word models.
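To make the idea of applying CTC at several levels concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It computes the CTC negative log-likelihood with the standard forward algorithm and then combines the losses of several output heads, one per target granularity. The function names (`ctc_nll`, `hierarchical_ctc_loss`), the weighted-sum combination, and the toy inputs are all assumptions made for illustration; the abstract does not specify how the per-level losses are combined.

```python
import math

NEG_INF = float("-inf")

def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) for a few values."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))

def ctc_nll(log_probs, target, blank=0):
    """CTC negative log-likelihood via the forward algorithm.

    log_probs: list of T frames, each a list of per-label log-probabilities.
    target:    list of label ids without blanks.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S = len(ext)

    # Initialise: a path may start with the first blank or the first label.
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new = [NEG_INF] * S
        for s in range(S):
            terms = [alpha[s]]                 # stay on the same symbol
            if s >= 1:
                terms.append(alpha[s - 1])     # advance by one position
            # Skip a blank only between two different non-blank labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new[s] = logsumexp(*terms) + log_probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or the trailing blank.
    total = logsumexp(alpha[-1], alpha[-2]) if S > 1 else alpha[-1]
    return -total

def hierarchical_ctc_loss(head_log_probs, head_targets, weights=None):
    """Weighted sum of CTC losses, one per head (assumed combination rule).

    Each head is a hypothetical output layer attached at a different depth
    of the network and trained on targets of a different granularity.
    """
    weights = weights or [1.0] * len(head_targets)
    return sum(
        w * ctc_nll(lp, tgt)
        for w, lp, tgt in zip(weights, head_log_probs, head_targets)
    )

# Toy usage: two heads over the same 2-frame input, each with a 2-label
# vocabulary (blank=0, one unit=1) and uniform posteriors.
frames = [[math.log(0.5), math.log(0.5)] for _ in range(2)]
loss = hierarchical_ctc_loss([frames, frames], [[1], [1]], weights=[1.0, 0.5])
```

In a real system each head would have its own vocabulary (for example characters at a lower layer and larger units at a higher one), and the summed loss would be minimised jointly so that the lower-level targets shape the intermediate representations used by the higher-level head.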


research
12/28/2020

Enhancing Handwritten Text Recognition with N-gram sequence decomposition and Multitask Learning

Current state-of-the-art approaches in the field of Handwritten Text Rec...
research
04/01/2022

Multi-sequence Intermediate Conditioning for CTC-based ASR

End-to-end automatic speech recognition (ASR) directly maps input speech...
research
12/28/2019

Improved Multi-Stage Training of Online Attention-based Encoder-Decoder Models

In this paper, we propose a refined multi-stage multi-task training stra...
research
11/28/2018

On the Inductive Bias of Word-Character-Level Multi-Task Learning for Speech Recognition

End-to-end automatic speech recognition (ASR) commonly transcribes audio...
research
02/02/2019

Using multi-task learning to improve the performance of acoustic-to-word and conventional hybrid models

Acoustic-to-word (A2W) models that allow direct mapping from acoustic si...
research
11/15/2022

Hierarchical Pronunciation Assessment with Multi-Aspect Attention

Automatic pronunciation assessment is a major component of a computer-as...
research
08/24/2023

MultiPA: a multi-task speech pronunciation assessment system for a closed and open response scenario

The design of automatic speech pronunciation assessment can be categoriz...
