Deep Triphone Embedding Improves Phoneme Recognition

10/22/2017
by Mohit Yadav, et al.

In this paper, we present a novel Deep Triphone Embedding (DTE) representation, derived from a Deep Neural Network (DNN), that encapsulates the discriminative information present in adjoining speech frames. In the first stage, DTEs are generated using a DNN with four hidden layers of 3000 nodes each. This DNN is trained with tied-triphone classification accuracy as the optimization criterion. We then retain the 3000-dimensional activation vector of the last hidden layer for each MFCC speech frame and apply dimensionality reduction to obtain a 300-dimensional representation, which we term the DTE. The DTEs, along with the MFCC features, are fed into a second-stage DNN with four hidden layers, which is in turn trained for tied-triphone classification. Both DNNs are trained using triphone labels generated from a tied-state triphone HMM-GMM system by performing forced alignment between the transcriptions and the MFCC feature frames. We conduct the experiments on the publicly available TED-LIUM speech corpus. The results show that the proposed DTE method provides an absolute improvement of 2.11% in phoneme recognition compared with a competitive hybrid tied-state triphone HMM-DNN system.
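The two-stage data flow described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the activation function, the dimensionality-reduction method, the MFCC input dimension, or the number of tied-triphone states, so ReLU, PCA, and all sizes in the demo are assumptions, and the weights are random rather than trained on forced-alignment labels. The helper names (`init_mlp`, `forward`, `fit_pca`, `dte_pipeline`) are hypothetical.

```python
import numpy as np


def init_mlp(rng, in_dim, hidden_dim, n_hidden, out_dim):
    """Randomly initialised MLP weights (untrained; for shape illustration only)."""
    dims = [in_dim] + [hidden_dim] * n_hidden + [out_dim]
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]


def forward(params, x):
    """Return (output logits, activations of the last hidden layer)."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU is an assumption, not from the paper
    w, b = params[-1]
    return h @ w + b, h


def fit_pca(acts, k):
    """PCA via SVD; PCA stands in for the unspecified reduction method."""
    mean = acts.mean(axis=0)
    _, _, vt = np.linalg.svd(acts - mean, full_matrices=False)
    return mean, vt[:k].T  # projection matrix of shape (hidden_dim, k)


def dte_pipeline(mfcc, stage1, pca, stage2):
    """Stage 1 hidden activations -> reduced DTE -> concat with MFCC -> stage 2."""
    _, hidden = forward(stage1, mfcc)
    mean, proj = pca
    dte = (hidden - mean) @ proj
    logits, _ = forward(stage2, np.concatenate([mfcc, dte], axis=1))
    return dte, logits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Scaled-down, assumed sizes; the paper uses 3000 hidden nodes and 300-dim DTEs.
    n_frames, mfcc_dim, hidden_dim, dte_dim, n_states = 200, 39, 128, 16, 50
    mfcc = rng.standard_normal((n_frames, mfcc_dim))
    stage1 = init_mlp(rng, mfcc_dim, hidden_dim, 4, n_states)
    _, hidden = forward(stage1, mfcc)
    pca = fit_pca(hidden, dte_dim)
    stage2 = init_mlp(rng, mfcc_dim + dte_dim, hidden_dim, 4, n_states)
    dte, logits = dte_pipeline(mfcc, stage1, pca, stage2)
    print(dte.shape, logits.shape)
```

In a real system, both networks would first be trained on the tied-triphone labels obtained from forced alignment; with the paper's sizes, `hidden_dim` would be 3000 and `dte_dim` 300.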


