Boosting Star-GANs for Voice Conversion with Contrastive Discriminator

09/21/2022
by Shijing Si, et al.

Nonparallel multi-domain voice conversion methods such as the StarGAN-VC family have been widely applied in many scenarios. However, these models are often difficult to train because of their complicated adversarial network architectures. To address this, we leverage state-of-the-art contrastive learning techniques and incorporate an efficient Siamese network structure into the StarGAN discriminator. Our method, SimSiam-StarGAN-VC, improves training stability and effectively prevents discriminator overfitting during training. We conduct experiments on the Voice Conversion Challenge (VCC 2018) dataset and a user study to validate the performance of our framework. Our experimental results show that SimSiam-StarGAN-VC significantly outperforms existing StarGAN-VC methods on both objective and subjective metrics.
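
The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of the symmetric negative cosine similarity loss from the general SimSiam formulation, not the authors' implementation. The function names `negative_cosine` and `simsiam_loss` are hypothetical, the inputs are plain Python lists standing in for embeddings of two augmented views, and how such a term is combined with the StarGAN discriminator loss is not stated here.

```python
import math


def negative_cosine(p, z):
    """Negative cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def simsiam_loss(p1, z1, p2, z2):
    """Symmetric SimSiam-style loss for two augmented views of one input.

    p1, p2: predictor outputs for view 1 and view 2.
    z1, z2: projector outputs for view 1 and view 2. In a real training
    framework these would be treated as constants (stop-gradient); this
    sketch only computes the forward value, so that step is not modeled.
    """
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)


# Identical embeddings give the minimum value of the loss.
aligned = simsiam_loss([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0])
# Orthogonal embeddings give a loss of zero.
orthogonal = simsiam_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
```

The loss ranges from -1 (the two views' embeddings are perfectly aligned) to 1 (they point in opposite directions), so minimizing it pulls the representations of the two views together.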

research
11/02/2020

CVC: Contrastive Learning for Non-parallel Voice Conversion

Cycle consistent generative adversarial network (CycleGAN) and variation...
research
10/22/2019

SoftGAN: Learning generative models efficiently with application to CycleGAN Voice Conversion

Voice conversion with deep neural networks has become extremely popular ...
research
01/26/2022

Invertible Voice Conversion

In this paper, we propose an invertible deep learning framework called I...
research
10/20/2022

Robust One-Shot Singing Voice Conversion

Many existing works on singing voice conversion (SVC) require clean reco...
research
06/26/2023

The Singing Voice Conversion Challenge 2023

We present the latest iteration of the voice conversion challenge (VCC) ...
