The Deterministic plus Stochastic Model of the Residual Signal and its Applications

12/29/2019
by Thomas Drugman, et al.
The modeling of speech production often relies on a source-filter approach. Although methods for parameterizing the filter have reached a certain maturity, several speech processing applications still have much to gain from an appropriate excitation model. This manuscript presents a Deterministic plus Stochastic Model (DSM) of the residual signal. The DSM consists of two contributions acting in two distinct spectral bands delimited by a maximum voiced frequency. Both components are extracted from an analysis performed on a speaker-dependent dataset of pitch-synchronous residual frames. The deterministic part models the low-frequency content and arises from an orthonormal decomposition of these frames. The stochastic component is a high-frequency noise modulated both in time and in frequency. Some interesting phonetic and computational properties of the DSM are also highlighted. The applicability of the DSM in two fields of speech processing is then studied. First, it is shown that incorporating the DSM vocoder in HMM-based speech synthesis enhances the delivered quality. The proposed approach significantly outperforms the traditional pulse excitation and provides a quality equivalent to STRAIGHT. In a second application, the potential of glottal signatures derived from the proposed DSM is investigated for speaker identification. Interestingly, these signatures are shown to lead to better recognition rates than other glottal-based methods.
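The two-band structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the orthonormal decomposition is a PCA computed via SVD, that residual frames have already been extracted pitch-synchronously and resampled to a common length, and that the time modulation is a simple Hann envelope. The function names (`deterministic_basis`, `split_bands`, `dsm_frame`) and the `noise_gain` parameter are hypothetical, and the frequency shaping of the noise is reduced here to a plain high-pass at the maximum voiced frequency.

```python
import numpy as np

def deterministic_basis(frames, n_components=1):
    """Return orthonormal basis vectors of the residual frames.

    Assumption: PCA via SVD stands in for the orthonormal decomposition.
    frames has shape (n_frames, frame_length).
    """
    centered = frames - frames.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components]  # rows are orthonormal

def split_bands(frame, fs, fm):
    """Split a frame into the parts below and above the frequency fm (Hz)."""
    n = len(frame)
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    low = np.fft.irfft(np.where(freqs <= fm, spectrum, 0), n=n)
    high = np.fft.irfft(np.where(freqs > fm, spectrum, 0), n=n)
    return low, high

def dsm_frame(basis_vector, fs, fm, rng, noise_gain=0.1):
    """Build one excitation frame: low-band deterministic part plus
    high-band noise shaped by a time envelope (hypothetical Hann window)."""
    n = len(basis_vector)
    deterministic, _ = split_bands(basis_vector, fs, fm)
    _, stochastic = split_bands(rng.standard_normal(n), fs, fm)
    envelope = np.hanning(n)  # assumed time modulation, not from the source
    return deterministic + noise_gain * envelope * stochastic

# Usage with synthetic data standing in for real residual frames.
rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 64))
basis = deterministic_basis(frames, n_components=3)
excitation = dsm_frame(basis[0], fs=16000, fm=4000, rng=rng)
```

In this sketch the maximum voiced frequency `fm` is passed in as a fixed value; how it is chosen, and how the noise envelope is estimated from data, are left out because the abstract does not specify them.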

research
12/29/2019

A Deterministic plus Stochastic Model of the Residual Signal for Improved Parametric Speech Synthesis

Speech generated by parametric synthesizers generally suffers from a typ...
research
03/05/2022

NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation

The traditional vocoders have the advantages of high synthesis efficienc...
research
12/30/2019

Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis

This paper proposes a method to improve the quality delivered by statist...
research
11/09/2018

ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems

This paper proposes a WaveNet-based neural excitation model (ExcitNet) f...
research
04/03/2018

Speech waveform synthesis from MFCC sequences with generative adversarial networks

This paper proposes a method for generating speech from filterbank mel f...
research
06/07/2020

Parametric Representation for Singing Voice Synthesis: a Comparative Evaluation

Various parametric representations have been proposed to model the speec...
research
06/07/2020

Maximum Phase Modeling for Sparse Linear Prediction of Speech

Linear prediction (LP) is an ubiquitous analysis method in speech proces...
