LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis

11/29/2018
by   Min-Jae Hwang, et al.

We propose a linear prediction (LP)-based waveform generation method built on WaveNet speech synthesis. The WaveNet vocoder, which uses speech parameters as the conditional input to WaveNet, has significantly improved the quality of statistical parametric speech synthesis systems. However, it is still challenging to train the neural vocoder effectively when the target database becomes larger and more expressive. As a solution, approaches that use the neural vocoder to generate only the vocal source signal have been proposed. However, they tend to generate synthetic noise because the vocal source is handled independently of the rest of the speech synthesis process, which inevitably causes a mismatch between the vocal source and the vocal tract filter. To address this problem, we propose LP-WaveNet, which structurally models the vocal source during both training and inference. The experimental results verify that the proposed system outperforms conventional WaveNet vocoders both objectively and subjectively.
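The abstract relies on the classical source-filter view, in which a speech sample is the sum of a linear prediction from past samples (the vocal tract filter) and an excitation term (the vocal source). The following is a minimal NumPy sketch of that decomposition only, not of LP-WaveNet itself: the function names, the toy AR(2) signal, and the chosen LP order are assumptions made for illustration, and no neural network is involved.

```python
import numpy as np

def lp_coefficients(x, order):
    """Estimate LP coefficients a[1..order] with the autocorrelation method."""
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Solve the Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def lp_residual(x, a):
    """Excitation e[n] = x[n] - sum_k a[k] * x[n-k], with zero history before n = 0."""
    order = len(a)
    padded = np.concatenate([np.zeros(order), x])
    e = np.empty(len(x))
    for n in range(len(x)):
        past = padded[n : n + order][::-1]  # x[n-1], ..., x[n-order]
        e[n] = x[n] - np.dot(a, past)
    return e

def lp_synthesize(e, a):
    """Rebuild the signal by feeding the excitation through the all-pole LP filter."""
    order = len(a)
    y = np.zeros(len(e) + order)
    for n in range(len(e)):
        past = y[n : n + order][::-1]  # y[n-1], ..., y[n-order]
        y[n + order] = e[n] + np.dot(a, past)
    return y[order:]

# Toy "speech": white noise shaped by a stable AR(2) filter (hypothetical values).
rng = np.random.default_rng(0)
x = lp_synthesize(rng.standard_normal(400), np.array([1.3, -0.6]))

a_hat = lp_coefficients(x, order=2)   # estimated vocal-tract-like filter
e = lp_residual(x, a_hat)             # estimated excitation (vocal source)
x_rec = lp_synthesize(e, a_hat)       # source + filter gives back the waveform

print("estimated LP coefficients:", a_hat)
print("residual energy / signal energy:", np.sum(e**2) / np.sum(x**2))
```

When the excitation and the filter come from the same analysis, the waveform is recovered up to floating-point error. If the excitation were instead produced by a separately trained model, any error in it would pass through the filter unchecked, which is the kind of source-filter mismatch the abstract describes.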

research
04/05/2019

WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation

WaveCycleGAN has recently been proposed to bridge the gap between natura...
research
06/07/2020

Maximum Phase Modeling for Sparse Linear Prediction of Speech

Linear prediction (LP) is a ubiquitous analysis method in speech proces...
research
12/30/2019

Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis

This paper proposes a method to improve the quality delivered by statist...
research
02/23/2022

End-to-end LPCNet: A Neural Vocoder With Fully-Differentiable LPC Estimation

Neural vocoders have recently demonstrated high quality speech synthesis...
research
11/09/2018

ExcitNet vocoder: A neural excitation model for parametric speech synthesis systems

This paper proposes a WaveNet-based neural excitation model (ExcitNet) f...
research
04/27/2019

Neural source-filter waveform models for statistical parametric speech synthesis

Neural waveform models such as WaveNet have demonstrated better performa...
research
11/15/2018

Comprehensive evaluation of statistical speech waveform synthesis

Statistical TTS systems that directly predict the speech waveform have r...
