Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks

06/10/2023
by   Dominik Wagner, et al.
0

Cycle-consistent generative adversarial networks have been widely used in non-parallel voice conversion (VC). Their ability to learn mappings between source and target features without relying on parallel training data eliminates the need for temporal alignments. However, most methods decouple the conversion of acoustic features from synthesizing the audio signal by using separate models for conversion and waveform synthesis. This work unifies conversion and synthesis into a single model, thereby eliminating the need for a separate vocoder. By leveraging cycle-consistent training and a self-supervised auxiliary training task, our model is able to efficiently generate converted high-quality raw audio waveforms. Subjective listening tests show that our method outperforms the baseline in whispered speech conversion (up to 6.7 relative improvement), and mean opinion score predictions yield competitive results in conventional VC (between 0.5

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/02/2018

High-quality nonparallel voice conversion based on cycle-consistent adversarial network

Although voice conversion (VC) algorithms have achieved remarkable succe...
research
03/27/2020

Mic2Mic: Using Cycle-Consistent Generative Adversarial Networks to Overcome Microphone Variability in Speech Systems

Mobile and embedded devices are increasingly using microphones and audio...
research
11/30/2017

Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks

We propose a parallel-data-free voice conversion (VC) method that can le...
research
08/27/2020

Non-Parallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks

We have previously proposed a method that allows for non-parallel voice ...
research
10/22/2020

CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion

Non-parallel voice conversion (VC) is a technique for learning mappings ...
research
02/25/2021

MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames

Non-parallel voice conversion (VC) is a technique for training voice con...
research
06/06/2018

StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks

This paper proposes a method that allows for non-parallel many-to-many v...

Please sign up or login with your details

Forgot password? Click here to reset