Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network

07/11/2020
by Yi-Chiao Wu, et al.

In this paper, a pitch-adaptive waveform generative model named Quasi-Periodic WaveNet (QPNet) is proposed to improve the pitch controllability of vanilla WaveNet (WN) using pitch-dependent dilated convolution neural networks (PDCNNs). As a probabilistic autoregressive generation model with stacked dilated convolution layers, WN achieves high-fidelity audio waveform generation. However, its purely data-driven nature and lack of prior knowledge of audio signals degrade its pitch controllability. For instance, it is difficult for WN to precisely generate the periodic components of audio signals when the given auxiliary fundamental frequency (F0) features are outside the F0 range observed in the training data. To address this problem, QPNet introduces two novel designs. First, the PDCNN component dynamically changes the network architecture of WN according to the given auxiliary F0 features. Second, a cascaded network structure simultaneously models the long- and short-term dependencies of quasi-periodic signals such as speech. Performance is evaluated on both single-tone sinusoid generation and speech generation. The experimental results show the effectiveness of the PDCNNs for unseen auxiliary F0 features and the effectiveness of the cascaded structure for speech generation.
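
The idea of a dilation that follows the auxiliary F0 can be illustrated with a minimal sketch. The abstract does not state the exact rule, so everything below is an assumption made for illustration: the function name `pitch_dependent_dilation`, the `dense_factor` parameter, and the scaling rule (samples per pitch period divided by `dense_factor`) are hypothetical and may differ from the formulation in the full paper.

```python
def pitch_dependent_dilation(base_dilation, f0_hz, sample_rate_hz, dense_factor=4):
    """Scale a fixed dilation size by a pitch-dependent factor.

    Assumed rule (hypothetical, not taken from the abstract): the factor is
    the number of samples in one pitch period divided by `dense_factor`,
    so a higher F0 yields a smaller dilation. Unvoiced frames (F0 <= 0)
    fall back to the fixed base dilation.
    """
    if f0_hz <= 0:
        return base_dilation
    scale = sample_rate_hz / (f0_hz * dense_factor)
    # Dilation must be a positive integer number of samples.
    return max(1, int(round(scale * base_dilation)))


def stack_dilations(base_dilations, f0_hz, sample_rate_hz, dense_factor=4):
    """Apply the pitch-dependent scaling to every layer of a dilation stack."""
    return [
        pitch_dependent_dilation(d, f0_hz, sample_rate_hz, dense_factor)
        for d in base_dilations
    ]


if __name__ == "__main__":
    base = [1, 2, 4, 8]  # fixed dilations of a small WaveNet-style stack
    for f0 in (100.0, 200.0, 400.0, 0.0):
        print(f0, stack_dilations(base, f0, sample_rate_hz=16000))
```

Under these assumptions, the effective receptive field of the stack stretches for low-pitched input and shrinks for high-pitched input, which is one way a network could cover a roughly constant number of pitch cycles even for F0 values not seen during training.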


research
07/25/2020

Quasi-Periodic Parallel WaveGAN: A Non-autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network

In this paper, we propose a quasi-periodic parallel WaveGAN (QPPWG) wave...
research
07/01/2019

Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation

In this paper, we propose a quasi-periodic neural network (QPNet) vocode...
research
05/18/2020

Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation

In this paper, we propose a parallel WaveGAN (PWG)-like neural vocoder w...
research
02/15/2021

PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

We propose PeriodNet, a non-autoregressive (non-AR) waveform generation ...
research
07/21/2019

Statistical Voice Conversion with Quasi-Periodic WaveNet Vocoder

In this paper, we investigate the effectiveness of a quasi-periodic Wave...
research
04/10/2021

Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN

We propose a unified approach to data-driven source-filter modeling usin...
research
06/11/2021

Catch-A-Waveform: Learning to Generate Audio from a Single Short Example

Models for audio generation are typically trained on hours of recordings...
