A Novel Topology for End-to-end Temporal Classification and Segmentation with Recurrent Neural Network

12/10/2019
by Taiyang Zhao, et al.

Connectionist temporal classification (CTC) has matured as an alignment-free approach to sequence transduction and is competitive for end-to-end speech recognition. In the CTC topology, the blank symbol occupies more than half of the state trellis, which results in the spike phenomenon of the non-blank symbols. For classification tasks the spikes work quite well, but for segmentation tasks they do not provide boundary information. In this paper, a novel topology is introduced to combine temporal classification and segmentation ability in one framework.
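To make the "more than half of the state trellis" remark concrete, here is a minimal sketch of the standard CTC state layout, in which a blank is placed before, between, and after the target labels. It is an illustration of conventional CTC only, not of the topology proposed in the paper; the function names and the `<b>` blank token are assumptions chosen for this example.

```python
def ctc_extended_labels(labels, blank="<b>"):
    """Interleave blanks around each label (standard CTC state sequence).

    For U labels this yields 2U + 1 states: U labels and U + 1 blanks.
    The names here are hypothetical and used for illustration only.
    """
    extended = [blank]
    for label in labels:
        extended.append(label)
        extended.append(blank)
    return extended


def blank_fraction(labels, blank="<b>"):
    """Fraction of states in the extended sequence that are blanks."""
    extended = ctc_extended_labels(labels, blank)
    return extended.count(blank) / len(extended)


if __name__ == "__main__":
    target = list("cat")
    print(ctc_extended_labels(target))  # ['<b>', 'c', '<b>', 'a', '<b>', 't', '<b>']
    print(blank_fraction(target))       # 4 of 7 states are blanks, about 0.571
```

Because the blank count is always U + 1 out of 2U + 1 states, the fraction stays above one half for any target length, which matches the abstract's description of the trellis.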

research
09/11/2016

Wav2Letter: an End-to-End ConvNet-based Speech Recognition System

This paper presents a simple end-to-end model for speech recognition, co...
research
12/04/2014

End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results

We replace the Hidden Markov Model (HMM) which is traditionally used in ...
research
02/21/2017

Multitask Learning with CTC and Segmental CRF for Speech Recognition

Segmental conditional random fields (SCRFs) and connectionist temporal c...
research
11/21/2016

Robust end-to-end deep audiovisual speech recognition

Speech is one of the most effective ways of communication among humans. ...
research
08/01/2016

Blind phoneme segmentation with temporal prediction errors

Phonemic segmentation of speech is a critical step of speech recognition...
research
04/18/2016

End-to-End Tracking and Semantic Segmentation Using Recurrent Neural Networks

In this work we present a novel end-to-end framework for tracking and cl...
research
08/01/2018

Recurrent neural networks for aortic image sequence segmentation with sparse annotations

Segmentation of image sequences is an important task in medical image an...
