Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks

02/15/2020
by Shindong Lee, et al.

Voice conversion (VC) refers to transforming the speaker characteristics of an utterance without altering its linguistic content. Many voice conversion methods require parallel training data, which is highly expensive to acquire. Recently, the cycle-consistent adversarial network (CycleGAN), which does not require parallel training data, has been applied to voice conversion and has shown state-of-the-art performance. CycleGAN-based voice conversion, however, can be used only for a single pair of speakers, i.e., one-to-one voice conversion between two speakers. In this paper, we extend the CycleGAN by conditioning the network on speakers. As a result, the proposed method can perform many-to-many voice conversion among multiple speakers using a single generative adversarial network (GAN). Compared to building a separate CycleGAN for each pair of speakers, the proposed method significantly reduces the computational and spatial cost without compromising the sound quality of the converted voice. Experimental results using the VCC2018 corpus confirm the efficiency of the proposed method.
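The cost argument in the abstract can be made concrete with a small sketch. The abstract does not describe the conditioning mechanism itself, so everything below is an assumption made for illustration: the function names are hypothetical, a standard CycleGAN is taken to have two generators and two discriminators, and the speaker conditioning is sketched as one-hot source and target codes appended to an acoustic feature frame, which may differ from what the paper actually does.

```python
from itertools import combinations


def pairwise_cyclegan_cost(num_speakers):
    """Count the models needed when one CycleGAN is trained per speaker pair.

    Assumes a standard CycleGAN with two generators and two discriminators.
    """
    pairs = len(list(combinations(range(num_speakers), 2)))
    return {
        "cyclegans": pairs,
        "generators": 2 * pairs,
        "discriminators": 2 * pairs,
    }


def one_hot(index, num_speakers):
    """Return a one-hot speaker code as a list of floats."""
    return [1.0 if i == index else 0.0 for i in range(num_speakers)]


def condition_frame(features, src, tgt, num_speakers):
    """Append source and target speaker codes to one feature frame.

    Hypothetical conditioning scheme, not taken from the paper.
    """
    return list(features) + one_hot(src, num_speakers) + one_hot(tgt, num_speakers)


if __name__ == "__main__":
    # With 4 speakers, the pairwise approach needs 6 separate CycleGANs,
    # whereas a speaker-conditioned model is a single network.
    print(pairwise_cyclegan_cost(4))
    print(condition_frame([0.5, 0.1], src=0, tgt=2, num_speakers=3))
```

The number of pairwise models grows quadratically with the number of speakers, while a single conditioned network only widens its input by the length of the speaker codes, which is the kind of saving the abstract refers to.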

research
09/15/2019

Voice Conversion Using Cycle-Consistent Variational Autoencoder

One of the most critical obstacles in voice conversion is the requiremen...
research
04/02/2018

High-quality nonparallel voice conversion based on cycle-consistent adversarial network

Although voice conversion (VC) algorithms have achieved remarkable succe...
research
08/09/2018

Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences

Speaking rate refers to the average number of phonemes within some unit ...
research
11/26/2020

Continuous Conversion of CT Kernel using Switchable CycleGAN with AdaIN

In X-ray computed tomography (CT) reconstruction, different filter kerne...
research
05/09/2019

Adversarially Trained Autoencoders for Parallel-Data-Free Voice Conversion

We present a method for converting the voices between a set of speakers....
research
10/22/2020

Towards Low-Resource StarGAN Voice Conversion using Weight Adaptive Instance Normalization

Many-to-many voice conversion with non-parallel training data has seen s...
research
10/20/2022

Robust One-Shot Singing Voice Conversion

Many existing works on singing voice conversion (SVC) require clean reco...
