Nebula: F0 Estimation and Voicing Detection by Modeling the Statistical Properties of Feature Extractors

10/31/2017
by Kanru Hua, et al.

An F0 and voicing status estimation algorithm for speech analysis/synthesis is proposed. Instead of modeling speech signals directly, the proposed algorithm models the behavior of feature extractors under additive noise using a bank of Gaussian mixture models (GMMs) trained on artificial data generated from Monte Carlo simulations. The conditional distributions of F0 predicted by the GMMs are combined to generate a likelihood map, which is then smoothed by a Viterbi search to give the final F0 trajectory. The voicing decision is based on the peak F0 likelihood. The proposed method achieves an average F0 gross error of 0.30
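The last two stages described above (Viterbi smoothing of a likelihood map, then a voicing decision from the peak likelihood) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `viterbi_f0` and `voicing_from_peak`, the octave-distance transition cost, the `jump_penalty` parameter, the fixed voicing `threshold`, and the toy likelihood values are all assumptions made for illustration. The likelihood map is taken as given; the GMM stage that would produce it is not shown.

```python
import math

def viterbi_f0(log_lik, f0_grid, jump_penalty=1.0):
    """Choose one F0 candidate per frame by dynamic programming.

    log_lik[t][j] is the log-likelihood of candidate f0_grid[j] at frame t.
    The transition cost (an assumption, not taken from the paper) is
    jump_penalty times the frame-to-frame jump measured in octaves.
    """
    n_cand = len(f0_grid)
    log_f0 = [math.log2(f) for f in f0_grid]
    score = list(log_lik[0])   # best path score ending at each candidate
    back = []                  # back-pointers for frames 1..T-1
    for t in range(1, len(log_lik)):
        new_score, ptr = [], []
        for j in range(n_cand):
            # Best predecessor for candidate j at frame t.
            trans = [score[i] - jump_penalty * abs(log_f0[i] - log_f0[j])
                     for i in range(n_cand)]
            best_i = max(range(n_cand), key=lambda i: trans[i])
            ptr.append(best_i)
            new_score.append(trans[best_i] + log_lik[t][j])
        score = new_score
        back.append(ptr)
    # Trace the best path backwards from the best final candidate.
    j = max(range(n_cand), key=lambda i: score[i])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [f0_grid[j] for j in path]

def voicing_from_peak(log_lik, threshold):
    """Mark a frame voiced when its peak log-likelihood exceeds a threshold
    (the fixed threshold is an illustrative assumption)."""
    return [max(frame) > threshold for frame in log_lik]

# Toy likelihood map: frame 2 has a weak, spurious peak one octave up.
f0_grid = [100.0, 200.0, 400.0]
log_lik = [
    [-1.0, -5.0, -9.0],
    [-1.2, -5.0, -9.0],
    [-4.0, -3.5, -9.0],
    [-1.1, -5.0, -9.0],
]
smoothed = viterbi_f0(log_lik, f0_grid, jump_penalty=2.0)
framewise = viterbi_f0(log_lik, f0_grid, jump_penalty=0.0)
voiced = voicing_from_peak(log_lik, threshold=-2.0)
```

With `jump_penalty=0.0` the search reduces to a per-frame argmax and follows the octave jump at frame 2. With a positive penalty the trajectory stays at 100 Hz, which is the smoothing effect the Viterbi stage is meant to provide. In this toy data, frame 2 also has a low peak likelihood, so the threshold rule marks it unvoiced.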


