Context-Dependent Acoustic Modeling without Explicit Phone Clustering

05/15/2020
by   Tina Raissi, et al.
0

Phoneme-based acoustic modeling of large vocabulary automatic speech recognition takes advantage of phoneme context. The large number of context-dependent (CD) phonemes and their highly varying statistics require tying or smoothing to enable robust training. Usually, Classification and Regression Trees are used for phonetic clustering, which is standard in Hidden Markov Model (HMM)-based systems. However, this solution introduces a secondary training objective and does not allow for end-to-end training. In this work, we address a direct phonetic context modeling for the hybrid Deep Neural Network (DNN)/HMM, that does not build on any phone clustering algorithm for the determination of the HMM state inventory. By performing different decompositions of the joint probability of the center phoneme state and its left and right contexts, we obtain a factorized network consisting of different components, trained jointly. Moreover, the representation of the phonetic context for the network relies on phoneme embeddings. The recognition accuracy of our proposed models on the Switchboard task is comparable and outperforms slightly the hybrid model using the standard state-tying decision trees.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/24/2022

Improving Factored Hybrid HMM Acoustic Modeling without State Tying

In this work, we show that a factored hybrid hidden Markov model (FH-HMM...
research
01/14/2019

Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees

Deep neural acoustic models benefit from context dependent modeling of o...
research
10/02/2019

From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition

There is an implicit assumption that traditional hybrid approaches for a...
research
03/03/2018

On Modular Training of Neural Acoustics-to-Word Model for LVCSR

End-to-end (E2E) automatic speech recognition (ASR) systems directly map...
research
06/14/2016

Calibration of Phone Likelihoods in Automatic Speech Recognition

In this paper we study the probabilistic properties of the posteriors in...
research
01/22/2016

Exploiting Low-dimensional Structures to Enhance DNN Based Acoustic Modeling in Speech Recognition

We propose to model the acoustic space of deep neural network (DNN) clas...
research
06/15/2023

Competitive and Resource Efficient Factored Hybrid HMM Systems are Simpler Than You Think

Building competitive hybrid hidden Markov model (HMM) systems for automa...

Please sign up or login with your details

Forgot password? Click here to reset