MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution

12/03/2020
by   Zhen Zeng, et al.

Recent neural vocoders usually use a WaveNet-like network to capture the long-term dependencies of the waveform, but they require a large number of parameters to model it well. In this paper, an efficient network, named location-variable convolution, is proposed to model the dependencies of waveforms. Unlike WaveNet, which uses a single shared set of convolution kernels to capture the dependencies of arbitrary waveforms, location-variable convolution uses a kernel predictor to generate multiple sets of convolution kernels from the mel-spectrum, and each set of kernels is applied to its associated waveform interval. Combining WaveGlow with location-variable convolutions yields an efficient vocoder, named MelGlow. Experiments on the LJSpeech dataset show that MelGlow achieves better performance than WaveGlow at small model sizes, which verifies the effectiveness of location-variable convolutions and suggests room for further optimization.
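The core idea in the abstract, predicting a separate kernel for each waveform interval from the mel-spectrum, can be illustrated with a minimal single-channel sketch. Everything here is an assumption made for illustration: the function names `predict_kernels` and `location_variable_conv`, the linear form of the kernel predictor, the zero padding, and the fixed `hop` (samples per mel frame) are hypothetical and are not taken from the paper, whose actual layer may differ substantially.

```python
def predict_kernels(mel_frames, weight, bias):
    """Hypothetical linear kernel predictor (an assumption, not the paper's).

    mel_frames: list of frames, each a list of n_mels floats
    weight:     n_mels x kernel_size matrix (list of lists)
    bias:       list of kernel_size floats
    Returns one kernel (list of kernel_size floats) per mel frame.
    """
    kernels = []
    for frame in mel_frames:
        kernel = [
            bias[j] + sum(frame[m] * weight[m][j] for m in range(len(frame)))
            for j in range(len(bias))
        ]
        kernels.append(kernel)
    return kernels


def location_variable_conv(waveform, kernels, hop):
    """Apply kernel i to the i-th waveform interval of `hop` samples.

    Assumes an odd kernel size and len(waveform) == len(kernels) * hop.
    The waveform is zero-padded so the output has the same length.
    """
    kernel_size = len(kernels[0])
    assert kernel_size % 2 == 1, "sketch assumes an odd kernel size"
    assert len(waveform) == len(kernels) * hop, "one kernel per interval"

    pad = kernel_size // 2
    padded = [0.0] * pad + list(waveform) + [0.0] * pad

    output = []
    for i, kernel in enumerate(kernels):
        # Samples in interval i are all filtered with kernel i.
        for t in range(i * hop, (i + 1) * hop):
            window = padded[t:t + kernel_size]
            output.append(sum(w * k for w, k in zip(window, kernel)))
    return output


# Toy usage: two mel frames with one mel bin each, two samples per frame.
mel = [[1.0], [2.0]]
weight = [[0.0, 1.0, 0.0]]   # maps the single mel bin to a 3-tap kernel
bias = [0.0, 0.0, 0.0]
kernels = predict_kernels(mel, weight, bias)
filtered = location_variable_conv([1.0, 2.0, 3.0, 4.0], kernels, hop=2)
```

In this toy setup the kernel for each interval is a scaled identity tap, so the first interval passes through unchanged and the second is doubled. A standard convolution would instead apply the same kernel at every position, which is the contrast the abstract draws with WaveNet.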


research
03/11/2019

HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs

We present a novel deep learning architecture in which the convolution o...
research
02/27/2020

XSepConv: Extremely Separated Convolution

Depthwise convolution has gradually become an indispensable operation fo...
research
05/24/2019

Generative Flow via Invertible nxn Convolution

Flow-based generative models have recently become one of the most effici...
research
09/26/2021

Group Shift Pointwise Convolution for Volumetric Medical Image Segmentation

Recent studies have witnessed the effectiveness of 3D convolutions on se...
research
12/10/2018

Reliable Identification of Redundant Kernels for Convolutional Neural Network Compression

To compress deep convolutional neural networks (CNNs) with large memory ...
research
08/10/2023

Temporally-Adaptive Models for Efficient Video Understanding

Spatial convolutions are extensively used in numerous deep video models....
research
11/26/2020

Positive definiteness of real quadratic forms resulting from the variable-step approximation of convolution operators

The positive definiteness of real quadratic forms with convolution struc...
