PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

02/15/2021
by   Yukiya Hono, et al.
0

We propose PeriodNet, a non-autoregressive (non-AR) waveform generation model with a new model structure for modeling periodic and aperiodic components in speech waveforms. The non-AR waveform generation models can generate speech waveforms parallelly and can be used as a speech vocoder by conditioning an acoustic feature. Since a speech waveform contains periodic and aperiodic components, both components should be appropriately modeled to generate a high-quality speech waveform. However, it is difficult to decompose the components from a natural speech waveform in advance. To address this issue, we propose a parallel model and a series model structure separating periodic and aperiodic components. The features of our proposed models are that explicit periodic and aperiodic signals are taken as input, and external periodic/aperiodic decomposition is not needed in training. Experiments using a singing voice corpus show that our proposed structure improves the naturalness of the generated waveform. We also show that the speech waveforms with a pitch outside of the training data range can be generated with more naturalness.

READ FULL TEXT

page 3

page 4

research
10/29/2018

Neural source-filter-based waveform model for statistical parametric speech synthesis

Neural waveform models such as the WaveNet are used in many recent text-...
research
08/27/2019

Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis

Neural source-filter (NSF) models are deep neural networks that produce ...
research
10/24/2021

Conditional Generation of Periodic Signals with Fourier-Based Decoder

Periodic signals play an important role in daily lives. Although convent...
research
04/12/2017

A Neural Parametric Singing Synthesizer

We present a new model for singing synthesis based on a modified version...
research
07/11/2020

Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network

In this paper, a pitch-adaptive waveform generative model named Quasi-Pe...
research
07/25/2020

Quasi-Periodic Parallel WaveGAN: A Non-autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network

In this paper, we propose a quasi-periodic parallel WaveGAN (QPPWG) wave...
research
10/26/2020

TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis

In this paper, we propose a text-to-speech (TTS)-driven data augmentatio...

Please sign up or login with your details

Forgot password? Click here to reset