Latent Space Explorations of Singing Voice Synthesis using DDSP

03/12/2021
by Juan Alonso, et al.

Machine-learning-based singing voice models require large datasets and lengthy training times. In this work we present a lightweight architecture, based on the Differentiable Digital Signal Processing (DDSP) library, that is able to output song-like utterances conditioned only on pitch and amplitude, after twelve hours of training using small datasets of unprocessed audio. The results are promising, as both the melody and the singer's voice are recognizable. In addition, we present two zero-configuration tools to train new models and experiment with them. Currently we are exploring the latent space representation, which is included in the DDSP library, but not in the original DDSP examples. Our results indicate that the latent space improves both the identification of the singer and the comprehension of the lyrics. Our code is available at https://github.com/juanalonso/DDSP-singing-experiments with links to the zero-configuration notebooks, and our sound examples are at https://juanalonso.github.io/DDSP-singing-experiments/.
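
To make "conditioned only on pitch and amplitude" concrete, the following is a minimal sketch of additive harmonic synthesis, the general kind of synthesizer that DDSP builds on. It is not the authors' code and does not use the DDSP library API: the function name `harmonic_synth`, the fixed 1/k harmonic weights, the number of harmonics, and the 16 kHz sample rate are all assumptions chosen for illustration. In a DDSP model, a neural decoder would predict the harmonic amplitudes (and a filtered-noise component, omitted here) from the conditioning signals; here the weights are simply fixed.

```python
import math


def harmonic_synth(f0_hz, amplitude, n_harmonics=8, sample_rate=16000):
    """Render audio from per-sample pitch (Hz) and amplitude curves.

    Illustrative only: harmonic k gets a fixed weight of 1/k, whereas a
    trained model would predict these weights over time.
    """
    if len(f0_hz) != len(amplitude):
        raise ValueError("f0_hz and amplitude must have the same length")

    nyquist = sample_rate / 2.0
    phase = 0.0  # running phase of the fundamental, in radians
    audio = []
    for f0, amp in zip(f0_hz, amplitude):
        # Integrate instantaneous frequency to get phase, so that pitch
        # glides do not cause discontinuities in the waveform.
        phase += 2.0 * math.pi * f0 / sample_rate

        total = 0.0
        weight_sum = 0.0
        for k in range(1, n_harmonics + 1):
            # Skip harmonics at or above Nyquist to avoid aliasing.
            if k * f0 < nyquist:
                weight = 1.0 / k
                total += weight * math.sin(k * phase)
                weight_sum += weight

        # Normalise by the active weights so |sample| never exceeds amp.
        sample = amp * total / weight_sum if weight_sum > 0.0 else 0.0
        audio.append(sample)
    return audio


# Usage: 0.1 s of a steady 220 Hz tone at half amplitude.
n = 1600
tone = harmonic_synth([220.0] * n, [0.5] * n)
```

The two input curves are the only controls: changing `f0_hz` changes the melody, and changing `amplitude` changes the loudness, which is why a model conditioned on these two signals alone can follow a sung line.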


