Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis

04/26/2023
by   Ye-Xin Lu, et al.

This paper proposes SF-GAN, a source-filter-based generative adversarial neural vocoder that achieves high-fidelity waveform generation from input acoustic features by introducing F0-based source excitation signals into a neural filter framework. The SF-GAN vocoder consists of a source module and a resolution-wise conditional filter module and is trained with generative adversarial strategies. The source module produces an excitation signal from the F0 information; the resolution-wise conditional filter module then combines the excitation signal with processed acoustic features at various temporal resolutions and finally reconstructs the raw waveform. Experimental results show that the proposed SF-GAN vocoder outperforms the state-of-the-art HiFi-GAN and Fre-GAN in both analysis-synthesis (AS) and text-to-speech (TTS) tasks, and that the quality of its synthesized speech is comparable to that of the ground-truth audio.
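
The abstract does not say how the source module turns F0 into an excitation signal, so the following is only a minimal sketch of one common approach, assumed here for illustration: a sine wave that follows the F0 contour in voiced frames and low-level Gaussian noise in unvoiced frames. The function name `sine_excitation` and all parameter values (sample rate, hop size, amplitude, noise level) are hypothetical and are not taken from the paper.

```python
import math
import random


def sine_excitation(f0_frames, sample_rate=16000, hop_size=80,
                    amplitude=0.1, noise_std=0.003, seed=0):
    """Turn frame-level F0 values (Hz, 0 = unvoiced) into a sample-level signal.

    Illustrative assumption only: this is not the SF-GAN source module.
    """
    rng = random.Random(seed)
    excitation = []
    phase = 0.0
    for f0 in f0_frames:
        # Hold each frame's F0 constant for hop_size samples (simple upsampling).
        for _ in range(hop_size):
            if f0 > 0:
                # Voiced: advance the running phase by 2*pi*f0/sample_rate,
                # emit a sine sample, and add a small amount of noise.
                phase = (phase + 2.0 * math.pi * f0 / sample_rate) % (2.0 * math.pi)
                sample = amplitude * math.sin(phase) + rng.gauss(0.0, noise_std)
            else:
                # Unvoiced: emit Gaussian noise only.
                sample = rng.gauss(0.0, amplitude / 3.0)
            excitation.append(sample)
    return excitation


if __name__ == "__main__":
    # Three frames: voiced at 100 Hz, unvoiced, voiced at 200 Hz.
    signal = sine_excitation([100.0, 0.0, 200.0])
    print(len(signal))
```

Accumulating the phase across frames, rather than restarting the sine at each frame boundary, keeps the waveform continuous when F0 changes. In a vocoder of the kind described above, a signal like this would be passed to the filter module together with the acoustic features.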

research
10/27/2022

Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder

Our previous work, the unified source-filter GAN (uSFGAN) vocoder, intro...
research
11/02/2022

DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP

Recent development of neural vocoders based on the generative adversaria...
research
06/04/2021

Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

Although recent works on neural vocoder have improved the quality of syn...
research
11/03/2020

StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization

In recent years, neural vocoders have surpassed classical speech generat...
research
07/30/2020

VocGAN: A High-Fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network

We present a novel high-fidelity real-time neural vocoder called VocGAN....
research
08/09/2022

DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation

A vocoder is a conditional audio generation model that converts acoustic...
research
12/06/2018

Generative Adversarial Network based Speaker Adaptation for High Fidelity WaveNet Vocoder

Neural networks based vocoders, typically the WaveNet, have achieved spe...