Quasi-Periodic Parallel WaveGAN: A Non-autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network

07/25/2020
by Yi-Chiao Wu, et al.

In this paper, we propose a quasi-periodic parallel WaveGAN (QPPWG) waveform generative model, which applies a quasi-periodic (QP) structure to a parallel WaveGAN (PWG) model using pitch-dependent dilated convolution networks (PDCNNs). PWG is a small-footprint GAN-based raw waveform generative model that generates speech much faster than real time because of its compact model and its non-autoregressive (non-AR), non-causal mechanisms. Although PWG achieves high-fidelity speech generation, its generic and simple network architecture lacks pitch controllability when conditioned on an unseen auxiliary fundamental frequency (F_0) feature, such as a scaled F_0. To improve pitch controllability and speech modeling capability, we apply a QP structure with PDCNNs to PWG, which introduces pitch information into the network by dynamically changing the network architecture according to the auxiliary F_0 feature. Both objective and subjective experimental results show that QPPWG outperforms PWG when the auxiliary F_0 feature is scaled. Moreover, analyses of the intermediate outputs of QPPWG show that it offers better tractability and interpretability: the cascaded fixed and adaptive blocks of the QP structure model spectral and excitation-like signals, respectively.
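
The idea of a dilation size that follows the auxiliary F_0 can be illustrated with a small sketch. The abstract does not state the dilation formula, so the code below assumes that the per-sample dilation is the base dilation scaled by `sample_rate / (f0 * dense_factor)`, and that unvoiced samples (F_0 of zero) fall back to the base dilation. The names `pitch_dependent_dilations`, `adaptive_dilated_conv`, and `dense_factor`, as well as the three scalar weights, are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def pitch_dependent_dilations(f0_hz, sample_rate, dense_factor=4, base_dilation=1):
    """Return one dilation size (in samples) per time step.

    Assumption: dilation = base_dilation * sample_rate / (f0 * dense_factor),
    so each pitch period is covered by roughly `dense_factor` taps.
    Assumption: unvoiced steps (f0 <= 0) keep the fixed base dilation.
    """
    dilations = []
    for f0 in f0_hz:
        if f0 > 0:
            scale = sample_rate / (f0 * dense_factor)
            dilations.append(max(1, int(round(scale * base_dilation))))
        else:
            dilations.append(base_dilation)
    return dilations


def adaptive_dilated_conv(x, dilations, w_prev, w_cur, w_next):
    """Non-causal 3-tap convolution whose dilation changes at every sample.

    Taps that fall outside the signal are treated as zero (zero padding).
    """
    n = len(x)
    y = []
    for t in range(n):
        d = dilations[t]
        prev_sample = x[t - d] if t - d >= 0 else 0.0
        next_sample = x[t + d] if t + d < n else 0.0
        y.append(w_prev * prev_sample + w_cur * x[t] + w_next * next_sample)
    return y


# Doubling F_0 halves the dilation; the unvoiced step uses the base dilation.
dilations = pitch_dependent_dilations([200.0, 400.0, 0.0], sample_rate=16000)
```

Under these assumptions, scaling the auxiliary F_0 changes the spacing of the convolution taps directly, which is one way to read the abstract's statement that the network architecture changes dynamically with the F_0 feature.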


research
07/11/2020

Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network

In this paper, a pitch-adaptive waveform generative model named Quasi-Pe...
research
05/18/2020

Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation

In this paper, we propose a parallel WaveGAN (PWG)-like neural vocoder w...
research
07/01/2019

Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation

In this paper, we propose a quasi-periodic neural network (QPNet) vocode...
research
02/15/2021

PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

We propose PeriodNet, a non-autoregressive (non-AR) waveform generation ...
research
04/10/2021

Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN

We propose a unified approach to data-driven source-filter modeling usin...
research
07/21/2019

Statistical Voice Conversion with Quasi-Periodic WaveNet Vocoder

In this paper, we investigate the effectiveness of a quasi-periodic Wave...
research
01/19/2021

Improved parallel WaveGAN vocoder with perceptually weighted spectrogram loss

This paper proposes a spectral-domain perceptual weighting technique for...
