Towards Universal Neural Vocoding with a Multi-band Excited WaveNet

10/07/2021
by   Axel Roebel, et al.

This paper introduces the Multi-Band Excited WaveNet, a neural vocoder for speaking and singing voices. It aims to advance the state of the art towards a universal neural vocoder, that is, a model that can generate voice signals from arbitrary mel spectrograms extracted from voice signals. Following the success of the DDSP model and the development of the recently proposed excitation vocoders, we propose a vocoder structure consisting of multiple specialized DNNs that are combined with dedicated signal processing components. All components are implemented as differentiable operators and therefore allow joint optimization of the model parameters. To demonstrate the capacity of the model to reproduce high-quality voice signals, we evaluate it on single- and multi-speaker/singer datasets. We conduct a subjective evaluation demonstrating that the models support a wide range of domain variations (unseen voices, languages, expressivity), achieving perceptual quality comparable to that of a state-of-the-art universal neural vocoder while using significantly smaller training datasets and significantly fewer parameters. We also demonstrate remaining limits of the universality of neural vocoders, e.g. the creation of saturated singing voices.
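
To make the vocoder input concrete, the following is a minimal NumPy sketch of how a log mel spectrogram can be extracted from a voice signal. It is not the authors' analysis pipeline: the abstract does not state the analysis parameters, so the HTK-style mel formula, the frame size, hop size, number of bands, and the function names (`mel_filterbank`, `log_mel_spectrogram`) are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale; one common convention, assumed here.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose edges are equally spaced on the mel scale.
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrogram(x, sr, n_fft=1024, hop=256, n_mels=80):
    # Hypothetical analysis settings; the paper's values may differ.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    mag = np.abs(np.fft.rfft(frames, axis=1))        # (n_frames, n_fft // 2 + 1)
    mel = mag @ mel_filterbank(sr, n_fft, n_mels).T  # (n_frames, n_mels)
    return np.log(np.maximum(mel, 1e-5))             # floor avoids log(0)

# Example input: one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
x = np.sin(2.0 * np.pi * 440.0 * np.arange(sr) / sr)
S = log_mel_spectrogram(x, sr)
```

A universal vocoder in the sense of the abstract would take a matrix like `S` (frames by mel bands) and reconstruct a waveform from it, regardless of which voice the spectrogram came from.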

research
05/24/2017

Deep Voice 2: Multi-Speaker Neural Text-to-Speech

We introduce a technique for augmenting neural text-to-speech (TTS) with...
research
06/29/2023

Singing Voice Synthesis Using Differentiable LPC and Glottal-Flow-Inspired Wavetables

This paper introduces GlOttal-flow LPC Filter (GOLF), a novel method for...
research
05/12/2020

FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction

In this paper, we propose the FeatherWave, yet another variant of WaveRN...
research
11/22/2022

TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation

We propose TF-GridNet for speech separation. The model is a novel multi-...
research
11/02/2016

The Intelligent Voice 2016 Speaker Recognition System

This paper presents the Intelligent Voice (IV) system submitted to the N...
research
02/01/2021

Universal Neural Vocoding with Parallel WaveNet

We present a universal neural vocoder based on Parallel WaveNet, with an...
research
09/01/2023

Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training

The single-speaker singing voice synthesis (SVS) usually underperforms a...