Maximum Voiced Frequency Estimation: Exploiting Amplitude and Phase Spectra

05/31/2020
by Thomas Drugman, et al.

Maximum Voiced Frequency (MVF) is used in various speech models as the spectral boundary separating periodic and aperiodic components during the production of voiced sounds. Recent studies have shown that its proper estimation and modeling enhance the quality of statistical parametric speech synthesizers. In contrast, these same MVF estimation methods have been reported to degrade the performance of singing voice synthesizers. This paper proposes a new approach to MVF estimation that exploits both amplitude and phase spectra. It is shown that phase conveys relevant information about the harmonicity of the voice signal, and that it can be used jointly with features derived from the amplitude spectrum. This information is then integrated into a maximum likelihood criterion, which yields the MVF estimate. The proposed technique is compared with two state-of-the-art methods and shows superior performance in both objective and subjective evaluations. Perceptual tests indicate a drastic improvement for high-pitched voices.
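To make the idea of a maximum likelihood decision on the MVF more concrete, the following is a minimal sketch, not the method from the paper. It assumes one hypothetical harmonicity score per harmonic (which could in principle combine amplitude and phase cues), and it assumes that scores below the MVF and above it follow two Gaussian models whose parameters are invented here purely for illustration. The function name `estimate_mvf` and all numeric values are assumptions.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a one-dimensional Gaussian."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - 0.5 * ((x - mean) / std) ** 2


def estimate_mvf(features, f0, voiced=(1.0, 0.3), noise=(0.0, 0.3)):
    """Pick the harmonic boundary that maximizes the total log-likelihood.

    features[k] is a hypothetical harmonicity score for harmonic k + 1.
    voiced and noise are assumed (mean, std) pairs for scores below and
    above the MVF; they are illustrative values, not taken from the paper.
    Returns (mvf_hz, n_voiced), where n_voiced harmonics lie below the MVF.
    """
    ll_voiced = [gaussian_logpdf(x, *voiced) for x in features]
    ll_noise = [gaussian_logpdf(x, *noise) for x in features]

    head = 0.0             # log-likelihood of harmonics treated as voiced
    tail = sum(ll_noise)   # log-likelihood of harmonics treated as noise
    best_k, best_ll = 0, -math.inf

    # Candidate k means: harmonics 1..k are voiced, the rest are noise.
    for k in range(len(features) + 1):
        total = head + tail
        if total > best_ll:
            best_k, best_ll = k, total
        if k < len(features):
            head += ll_voiced[k]
            tail -= ll_noise[k]

    return best_k * f0, best_k


if __name__ == "__main__":
    # Made-up scores: four clearly harmonic partials, then three noisy ones.
    scores = [0.9, 1.1, 0.95, 0.8, 0.1, -0.05, 0.2]
    mvf_hz, n_voiced = estimate_mvf(scores, f0=200.0)
    print(f"MVF estimate: {mvf_hz:.0f} Hz ({n_voiced} voiced harmonics)")
```

In this toy setting the decision reduces to a change-point search along the harmonic axis. A real system would need features and distributions estimated from data, which this sketch does not attempt.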


research
10/29/2018

STFT spectral loss for training a neural speech waveform model

This paper proposes a new loss using short-time Fourier transform (STFT)...
research
01/02/2020

Phase-based Information for Voice Pathology Detection

In most current approaches of speech processing, information is extracte...
research
04/10/2022

Inferring Pitch from Coarse Spectral Features

Fundamental frequency (F0) has long been treated as the physical definit...
research
09/03/2019

Quantifying and Correlating Rhythm Formants in Speech

The objective of the present study is exploratory: to introduce and appl...
research
05/24/2020

Glottal source estimation robustness: A comparison of sensitivity of voice source estimation techniques

This paper addresses the problem of estimating the voice source directly...
research
03/18/2019

CRAFT: A multifunction online platform for speech prosody visualisation

There are many research tools which are also used for teaching the acous...
