SoftGAN: Learning generative models efficiently with application to CycleGAN Voice Conversion

10/22/2019
by   Rafael Ferro, et al.

Voice conversion (VC) with deep neural networks has become extremely popular over the last few years, with improvements over earlier VC architectures. In particular, GAN architectures such as the CycleGAN and the VAEGAN make it possible to learn voice conversion from non-parallel databases. However, GAN-based methods are highly unstable, often requiring careful tuning of hyper-parameters, and can lead to poor voice identity conversion and a substantially degraded converted speech signal. This paper discusses and tackles the stability issues of the GAN in the context of voice conversion. The proposed SoftGAN method aims at reducing the impact of the generator on the discriminator and vice versa during training, so that both can learn more gradually and efficiently, in particular avoiding training in which the two networks do not progress in tandem. A subjective experiment conducted on a voice conversion task using the Voice Conversion Challenge 2018 dataset shows that the proposed SoftGAN significantly improves the quality of the voice conversion while preserving the naturalness of the converted speech.

research
11/02/2020

CVC: Contrastive Learning for Non-parallel Voice Conversion

Cycle consistent generative adversarial network (CycleGAN) and variation...
research
08/10/2020

VAW-GAN for Singing Voice Conversion with Non-parallel Training Data

Singing voice conversion aims to convert singer's voice from source to t...
research
09/21/2022

Boosting Star-GANs for Voice Conversion with Contrastive Discriminator

Nonparallel multi-domain voice conversion methods such as the StarGAN-VC...
research
12/04/2022

Generative Models for Improved Naturalness, Intelligibility, and Voicing of Whispered Speech

This work adapts two recent architectures of generative models and evalu...
research
02/16/2021

Axial Residual Networks for CycleGAN-based Voice Conversion

We propose a novel architecture and improved training objectives for non...
research
04/15/2021

Towards end-to-end F0 voice conversion based on Dual-GAN with convolutional wavelet kernels

This paper presents an end-to-end framework for the F0 transformation in ...
research
07/26/2021

Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations

Voice conversion (VC) consists of digitally altering the voice of an ind...
