Beam Search Decoding using Manner of Articulation Detection Knowledge Derived from Connectionist Temporal Classification
Manner of articulation detection using deep neural networks require a priori knowledge of the attribute discriminative features or the decent phoneme alignments. However generating an appropriate phoneme alignment is complex and its performance depends on the choice of optimal number of senones, Gaussians, etc. In the first part of our work, we exploit the manner of articulation detection using connectionist temporal classification (CTC) which doesn't need any phoneme alignment. Later we modify the state-of-the-art character based posteriors generated by CTC using the manner of articulation CTC detector. Beam search decoding is performed on the modified posteriors and it's impact on open source datasets such as AN4 and LibriSpeech is observed.
READ FULL TEXT