WGANSing: A Multi-Voice Singing Voice Synthesizer Based on the Wasserstein-GAN

03/26/2019
by   Pritish Chandna, et al.
0

We present a deep neural network based singing voice synthesizer, inspired by the Deep Convolutions Generative Adversarial Networks (DCGAN) architecture and optimized using the Wasserstein-GAN algorithm. We use vocoder parameters for acoustic modelling, to separate the influence of pitch and timbre. This facilitates the modelling of the large variability of pitch in the singing voice. Our network takes a block of consecutive frame-wise linguistic and fundamental frequency features, along with global singer identity as input and outputs vocoder features. For inference, sequential blocks are concatenated using an overlap-add procedure. We show that the performance of our model is comparable to the state-of-the-art and the original sample using objective metrics and a subjective listening test. We also present examples of the synthesis on a supplementary website and the source code via GitHub.

READ FULL TEXT

page 3

page 4

research
02/22/2021

Anyone GAN Sing

The problem of audio synthesis has been increasingly solved using deep n...
research
09/21/2022

Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN

Singing voice synthesis (SVS) is the computer production of a human-like...
research
08/05/2020

Recognition-Synthesis Based Non-Parallel Voice Conversion with Adversarial Learning

This paper presents an adversarial learning method for recognition-synth...
research
12/20/2021

Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus

High-fidelity multi-singer singing voice synthesis is challenging for ne...
research
02/19/2018

Voice Impersonation using Generative Adversarial Networks

Voice impersonation is not the same as voice transformation, although th...
research
10/31/2022

VoicePrivacy 2022 System Description: Speaker Anonymization with Feature-matched F0 Trajectories

We introduce a novel method to improve the performance of the VoicePriva...
research
12/29/2019

Glottal Source Processing: from Analysis to Applications

The great majority of current voice technology applications relies on ac...

Please sign up or login with your details

Forgot password? Click here to reset