Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows

03/03/2022
by Kevin J. Shih, et al.

Despite recent advances in generative modeling for text-to-speech synthesis, these models do not yet offer the same fine-grained adjustability as pitch-conditioned deterministic models such as FastPitch and FastSpeech2. Pitch information is not only low-dimensional but also discontinuous, making it particularly difficult to model in a generative setting. Our work explores several techniques for handling these issues in the context of Normalizing Flow models. We also find this problem to be well suited to Neural Spline Flows, which are a highly expressive alternative to the more common affine-coupling mechanism in Normalizing Flows.
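To make the contrast with affine coupling concrete, the sketch below shows the core building block of a Neural Spline Flow: a monotonic piecewise rational-quadratic spline transform (Durkan et al., 2019). This is an illustrative NumPy implementation under simplified assumptions, not the authors' code; in a real coupling layer the knot positions and derivatives would be predicted by a neural network from the conditioning channels, and values outside the spline range are usually handled with identity tails.

```python
import numpy as np

def rq_spline(x, x_knots, y_knots, deriv):
    """Monotonic rational-quadratic spline transform (illustrative sketch).

    x_knots, y_knots : increasing knot positions in input/output space, shape (K+1,)
    deriv            : positive derivatives at the knots, shape (K+1,)
    Returns the transformed values and the log absolute derivative, which
    supplies the change-of-variables (log-determinant) term of the flow.
    """
    # Locate the bin that each input falls into.
    k = np.clip(np.searchsorted(x_knots, x) - 1, 0, len(x_knots) - 2)
    w = x_knots[k + 1] - x_knots[k]            # bin widths
    h = y_knots[k + 1] - y_knots[k]            # bin heights
    s = h / w                                  # bin slopes
    xi = (x - x_knots[k]) / w                  # position inside the bin, in [0, 1]
    denom = s + (deriv[k + 1] + deriv[k] - 2 * s) * xi * (1 - xi)
    # Rational-quadratic interpolant within the bin.
    y = y_knots[k] + h * (s * xi**2 + deriv[k] * xi * (1 - xi)) / denom
    # Analytic derivative of the spline, used for the flow's log-det term.
    dydx = s**2 * (deriv[k + 1] * xi**2 + 2 * s * xi * (1 - xi)
                   + deriv[k] * (1 - xi)**2) / denom**2
    return y, np.log(np.abs(dydx))

# Example (hypothetical parameters): 8 equal bins on [-3, 3] with unit
# derivatives reduces to the identity map, a useful sanity check.
knots = np.linspace(-3.0, 3.0, 9)
y, logdet = rq_spline(np.array([0.7, -1.2]), knots, knots, np.ones(9))
```

Unlike an affine coupling step, which can only scale and shift each channel, this spline is a flexible monotonic map that remains analytically invertible, which is what makes it attractive for low-dimensional, discontinuous attributes such as pitch.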


