A Bayesian Approach to Estimation of Speaker Normalization Parameters

10/19/2016
by Dhananjay Ram, et al.

In this work, a Bayesian approach to speaker normalization is proposed to compensate for the degradation in performance of a speaker-independent speech recognition system. The proposed method uses vocal tract length normalization (VTLN). The VTLN parameters are estimated with a novel Bayesian approach based on the Gibbs sampler, a Markov chain Monte Carlo method. Additionally, the hyperparameters are estimated using a maximum likelihood approach. The model assumes that the human vocal tract can be modeled as a tube of uniform cross section. It captures the variation in vocal tract length across speakers more effectively than the linear model used in the literature. The work also investigates other methods, such as minimization of mean square error (MSE) and mean absolute error (MAE), for estimating the VTLN parameters. Both single-pass and two-pass approaches are then used to build a VTLN-based speech recognizer. Experimental results on recognition of vowels and Hindi phrases from a medium vocabulary indicate that the Bayesian method improves performance by a considerable margin.
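
To make the uniform-tube assumption and the MSE and MAE criteria concrete, the following is a minimal sketch, not the paper's method. It assumes the textbook quarter-wavelength resonances of a uniform tube closed at one end, F_k = (2k - 1)c / (4L), and a single multiplicative warp factor fitted to noise-free formants. The function names, the tube lengths (17.0 cm and 14.5 cm), the speed of sound, and the search grid are all hypothetical choices made for illustration. The Bayesian Gibbs-sampling estimator described in the abstract is not reproduced here.

```python
import numpy as np

# Assumed value: approximate speed of sound in warm, moist air, in cm/s.
SPEED_OF_SOUND_CM_S = 35000.0


def tube_formants(length_cm, n_formants=3):
    """Resonances of a uniform tube closed at one end: F_k = (2k - 1) c / (4 L)."""
    k = np.arange(1, n_formants + 1)
    return (2 * k - 1) * SPEED_OF_SOUND_CM_S / (4.0 * length_cm)


def warp_factor_mse(speaker_f, reference_f):
    """Scale alpha minimizing sum((reference - alpha * speaker)^2), in closed form."""
    return float(np.dot(speaker_f, reference_f) / np.dot(speaker_f, speaker_f))


def warp_factor_mae(speaker_f, reference_f, grid=None):
    """Scale alpha minimizing sum(|reference - alpha * speaker|) by grid search."""
    if grid is None:
        # Hypothetical search range for the warp factor, step 0.001.
        grid = np.linspace(0.80, 1.20, 401)
    # Rows index candidate alphas, columns index formants.
    errors = np.abs(reference_f[None, :] - grid[:, None] * speaker_f[None, :]).sum(axis=1)
    return float(grid[np.argmin(errors)])


# Hypothetical vocal tract lengths for a reference and a test speaker.
reference = tube_formants(17.0)
speaker = tube_formants(14.5)

alpha_mse = warp_factor_mse(speaker, reference)
alpha_mae = warp_factor_mae(speaker, reference)
print(f"MSE warp factor: {alpha_mse:.4f}")
print(f"MAE warp factor: {alpha_mae:.4f}")
```

Under this simplified model the formants scale inversely with tube length, so both criteria recover a warp factor close to the length ratio 14.5 / 17.0. With noisy formant measurements the two estimates would generally differ, since the MAE criterion is less sensitive to outliers.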
