Learning the Beauty in Songs: Neural Singing Voice Beautifier

02/27/2022
by Jinglin Liu, et al.

We are interested in a novel task, singing voice beautifying (SVB). Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice while preserving the content and vocal timbre. Current automatic pitch correction techniques are immature, and most of them address only intonation while ignoring overall aesthetic quality. Hence, we introduce the Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task, which adopts a conditional variational autoencoder as its backbone and learns latent representations of vocal tone. In NSVB, we propose a novel time-warping approach for pitch correction, Shape-Aware Dynamic Time Warping (SADTW), which improves on the robustness of existing time-warping approaches, to synchronize the amateur recording with the template pitch curve. Furthermore, we propose a latent-mapping algorithm that operates in the latent space to convert the amateur vocal tone to the professional one. To achieve this, we also introduce a new dataset containing parallel singing recordings of both amateur and professional versions. Extensive experiments on both Chinese and English songs demonstrate the effectiveness of our methods in terms of both objective and subjective metrics. Audio samples are available at <https://neuralsvb.github.io>. Code: <https://github.com/MoonInTheRiver/NeuralSVB>.
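
To make the time-warping idea concrete, here is a minimal sketch of classic dynamic time warping (DTW) between an amateur pitch curve and a template pitch curve. This is the generic baseline that time-warping approaches build on, not SADTW itself, whose shape-aware details the abstract does not give. The function names `dtw_path` and `warp_template`, the absolute-difference frame distance, and the use of MIDI-style semitone values are all assumptions made for illustration; both inputs are assumed non-empty.

```python
from typing import List, Tuple


def dtw_path(amateur: List[float], template: List[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Plain DTW (not SADTW): return the total cost and the alignment path."""
    n, m = len(amateur), len(template)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning amateur[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(amateur[i - 1] - template[j - 1])  # assumed frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def warp_template(amateur: List[float], template: List[float]) -> List[float]:
    """Resample the template onto the amateur timeline using the DTW path."""
    _, path = dtw_path(amateur, template)
    warped = [0.0] * len(amateur)
    for i, j in path:
        warped[i] = template[j]  # if a frame has several matches, the last one wins
    return warped


if __name__ == "__main__":
    amateur_f0 = [59.5, 59.7, 62.3, 64.2]   # slightly off-pitch, hypothetical values
    template_f0 = [60.0, 62.0, 64.0]        # hypothetical target curve
    print(warp_template(amateur_f0, template_f0))
```

The warped curve has one target pitch per amateur frame, so it could serve as the corrected pitch contour while the amateur's timing is kept. Plain DTW compares frames by value only, which is one plausible reason a shape-aware variant would be more robust when the amateur curve is consistently sharp or flat.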

research
03/12/2021

Latent Space Explorations of Singing Voice Synthesis using DDSP

Machine learning based singing voice models require large datasets and l...
research
10/28/2020

PPG-based singing voice conversion with adversarial representation learning

Singing voice conversion (SVC) aims to convert the voice of one singer t...
research
07/12/2023

Rhythm Modeling for Voice Conversion

Voice conversion aims to transform source speech into a different target...
research
05/18/2023

RMSSinger: Realistic-Music-Score based Singing Voice Synthesis

We are interested in a challenging task, Realistic-Music-Score based Sin...
research
10/09/2020

Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN

In this paper, we present a description of the baseline system of Voice ...
research
06/08/2021

NWT: Towards natural audio-to-video generation with representation learning

In this work we introduce NWT, an expressive speech-to-video model. Unli...
research
05/04/2023

Idiolect: A Reconfigurable Voice Coding Assistant

This paper presents Idiolect, an open source (https://github.com/OpenASR...
