WaveFlow: A Compact Flow-based Model for Raw Audio

12/03/2019
by Wei Ping, et al.

In this work, we present WaveFlow, a small-footprint generative flow for raw audio. It is trained directly with maximum likelihood, without the probability density distillation and auxiliary losses used in Parallel WaveNet and ClariNet. It provides a unified view of likelihood-based models for raw audio, including WaveNet and WaveGlow as special cases. We systematically study these likelihood-based generative models for raw waveforms in terms of test likelihood and speech fidelity. We demonstrate that WaveFlow can synthesize speech with fidelity as high as WaveNet's, while requiring only a few sequential steps to generate very long waveforms with hundreds of thousands of time-steps. Furthermore, WaveFlow closes the significant likelihood gap that has existed between autoregressive models and flow-based models for efficient synthesis. Finally, our small-footprint WaveFlow has 5.91M parameters and can generate 22.05 kHz high-fidelity speech 42.6 times faster than real-time on a GPU without engineered inference kernels.
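To put the reported synthesis speed in concrete terms, the following is a minimal arithmetic sketch, not code from the paper. It uses only the two figures quoted in the abstract (22.05 kHz sample rate, 42.6 times faster than real-time); the function names and the 10-second clip length are hypothetical, chosen for illustration.

```python
# Illustrative arithmetic only; the two constants come from the abstract.
SAMPLE_RATE_HZ = 22_050          # 22.05 kHz audio
SPEEDUP_OVER_REAL_TIME = 42.6    # reported GPU synthesis speed


def samples_per_second(sample_rate_hz: float, speedup: float) -> float:
    """Waveform samples generated per wall-clock second."""
    return sample_rate_hz * speedup


def synthesis_seconds(audio_seconds: float, speedup: float) -> float:
    """Wall-clock time needed to generate a clip of the given duration."""
    return audio_seconds / speedup


throughput = samples_per_second(SAMPLE_RATE_HZ, SPEEDUP_OVER_REAL_TIME)
# The 10-second clip length is an assumed example, not a figure from the paper.
ten_second_clip = synthesis_seconds(10.0, SPEEDUP_OVER_REAL_TIME)

print(f"throughput: {throughput:,.0f} samples/s")
print(f"10 s of audio takes about {ten_second_clip:.3f} s to generate")
```

Under these figures, the model produces roughly 939 thousand samples per second, so a 10-second utterance would take about a quarter of a second to synthesize.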

research
11/06/2018

FloWaveNet : A Generative Flow for Raw Audio

Most of modern text-to-speech architectures use a WaveNet vocoder for sy...
research
06/08/2020

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

In recent years, various flow-based generative models have been proposed...
research
11/28/2017

Parallel WaveNet: Fast High-Fidelity Speech Synthesis

The recently-developed WaveNet architecture is the current state of the ...
research
09/27/2021

FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis

Recently, non-autoregressive neural vocoders have provided remarkable pe...
research
09/02/2020

WaveGrad: Estimating Gradients for Waveform Generation

This paper introduces WaveGrad, a conditional model for waveform generat...
research
08/03/2020

A Spectral Energy Distance for Parallel Speech Synthesis

Speech synthesis is an important practical generative modeling problem t...
research
06/07/2021

Learning to Efficiently Sample from Diffusion Probabilistic Models

Denoising Diffusion Probabilistic Models (DDPMs) have emerged as a power...