Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder

10/27/2022
by Reo Yoneyama, et al.

Our previous work, the unified source-filter GAN (uSFGAN) vocoder, introduced a novel architecture based on source-filter theory into the parallel waveform generative adversarial network to achieve high voice quality and pitch controllability. However, its high-temporal-resolution inputs result in high computation costs. Although the HiFi-GAN vocoder achieves fast, high-fidelity voice generation thanks to its efficient upsampling-based generator architecture, its pitch controllability is severely limited. To realize a fast and pitch-controllable high-fidelity neural vocoder, we introduce source-filter theory into HiFi-GAN by hierarchically conditioning the resonance filtering network on well-estimated source excitation information. According to the experimental results, our proposed method outperforms HiFi-GAN and uSFGAN in singing voice generation, in terms of both voice quality and synthesis speed on a single CPU. Furthermore, unlike the uSFGAN vocoder, the proposed method can be readily integrated into real-time applications and end-to-end systems.
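To make the source-filter idea concrete, the following is a minimal NumPy sketch of why an explicit source excitation gives pitch control: scaling the F0 contour changes the pitch of the excitation, while the filter is left unchanged. It is not the network described in the abstract. The function names, the fixed decaying FIR kernel standing in for a learned resonance filter, and the 24 kHz sample rate and 120-sample hop size are all illustrative assumptions.

```python
import numpy as np


def sine_excitation(f0_frames, hop_size, sample_rate):
    """Turn a frame-level F0 contour (Hz) into a sample-level sine excitation.

    Hypothetical helper for illustration. Unvoiced frames (F0 == 0)
    produce silence in this sketch.
    """
    # Repeat each frame value hop_size times to reach sample resolution.
    f0 = np.repeat(np.asarray(f0_frames, dtype=np.float64), hop_size)
    # Integrate instantaneous frequency to obtain the phase.
    phase = 2.0 * np.pi * np.cumsum(f0 / sample_rate)
    return np.where(f0 > 0, np.sin(phase), 0.0)


def resonance_filter(excitation, decay=0.9, taps=32):
    """Stand-in for a learned filter (assumption): a fixed decaying FIR kernel."""
    kernel = decay ** np.arange(taps)
    return np.convolve(excitation, kernel)[: len(excitation)]


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the largest magnitude-spectrum peak."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]


sample_rate, hop_size = 24000, 120   # assumed values, chosen for the example
f0 = np.full(200, 220.0)             # 200 frames of a constant 220 Hz pitch

# Same filter, two different excitations: original F0 and F0 scaled by 2.
original = resonance_filter(sine_excitation(f0, hop_size, sample_rate))
shifted = resonance_filter(sine_excitation(2.0 * f0, hop_size, sample_rate))

print(dominant_frequency(original, sample_rate))
print(dominant_frequency(shifted, sample_rate))
```

Because the filter here is linear and fixed, doubling F0 moves the dominant frequency of the output up by one octave. In a neural vocoder the filter would be a learned network conditioned on acoustic features, so this sketch only illustrates the separation of pitch (source) from spectral shaping (filter).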

research
04/26/2023

Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis

This paper proposes a source-filter-based generative adversarial neural ...
research
04/22/2022

Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation

This paper presents a speaking-rate-controllable HiFi-GAN neural vocoder...
research
10/18/2021

KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke

An automatic pitch correction system typically includes several stages, ...
research
05/12/2022

Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation

This paper introduces a unified source-filter network with a harmonic-pl...
research
04/10/2021

Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN

We propose a unified approach to data-driven source-filter modeling usin...
research
06/24/2021

GAN-MDF: A Method for Multi-fidelity Data Fusion in Digital Twins

The Internet of Things (IoT) collects real-time data of physical systems...
research
06/20/2022

WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis

Recently, GAN-based neural vocoders such as Parallel WaveGAN, MelGAN, Hi...