Differentiable WORLD Synthesizer-based Neural Vocoder With Application To End-To-End Audio Style Transfer

08/15/2022
by Shahan Nercessian, et al.

In this paper, we propose a differentiable WORLD synthesizer and demonstrate its use in end-to-end audio style transfer tasks such as (singing) voice conversion and the DDSP timbre transfer task. Our baseline differentiable synthesizer has no model parameters, yet it yields adequate synthesis quality. We can extend the baseline synthesizer by appending lightweight black-box postnets, which further process the baseline output to improve fidelity. An alternative differentiable approach extracts the source excitation spectrum directly, which can improve naturalness, albeit for a narrower class of style transfer applications. The acoustic feature parameterization used by our approaches has the added benefit that it naturally disentangles pitch and timbral information, so that they can be modeled separately. Moreover, because these acoustic features can be estimated robustly from monophonic audio sources, parameter loss terms can be added to an end-to-end objective function, which can help convergence and/or further stabilize (adversarial) training.
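
To make the idea of a parameter loss term concrete, the following is a minimal sketch, not the paper's formulation. It assumes the WORLD features are available as per-frame arrays for pitch (F0), spectral envelope, and aperiodicity; the dictionary keys `f0`, `sp`, and `ap`, the L1 and log-domain distances, the voiced-frame masking, and the weight `lam` are all illustrative assumptions. NumPy is used only to show the arithmetic; an end-to-end system would compute the same quantities in an autodiff framework.

```python
import numpy as np

EPS = 1e-8  # avoids log(0); an illustrative choice


def parameter_loss(pred, target):
    """L1 distance between predicted and reference WORLD-style features.

    `pred` and `target` are dicts with hypothetical keys:
      'f0': (frames,) pitch in Hz, with 0 assumed to mark unvoiced frames
      'sp': (frames, bins) spectral envelope
      'ap': (frames, bins) aperiodicity in [0, 1]
    """
    voiced = target["f0"] > 0
    if voiced.any():
        # Compare pitch in the log domain, on voiced frames only.
        f0_term = np.mean(np.abs(
            np.log(pred["f0"][voiced] + EPS) - np.log(target["f0"][voiced] + EPS)
        ))
    else:
        f0_term = 0.0
    # The spectral envelope spans a wide dynamic range, so compare logs.
    sp_term = np.mean(np.abs(np.log(pred["sp"] + EPS) - np.log(target["sp"] + EPS)))
    ap_term = np.mean(np.abs(pred["ap"] - target["ap"]))
    return float(f0_term + sp_term + ap_term)


def total_loss(audio_loss, pred, target, lam=1.0):
    """Add the weighted parameter term to an existing audio-domain loss."""
    return audio_loss + lam * parameter_loss(pred, target)
```

Because pitch and the timbral features enter as separate terms, each could be weighted or dropped independently, which mirrors the disentanglement described in the abstract.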

