Reverberation Modeling for Source-Filter-based Neural Vocoder

05/15/2020
by Yang Ai, et al.

This paper presents a reverberation module for source-filter-based neural vocoders that improves the modeling of reverberation effects. The module takes the output waveform of a neural vocoder as input and produces a reverberant waveform by convolving that input with a room impulse response (RIR). We propose two approaches to parameterizing and estimating the RIR. The first approach assumes a global time-invariant (GTI) RIR and learns the values of the RIR directly on a training dataset. The second approach assumes an utterance-level time-variant (UTV) RIR, which is invariant within one utterance but varies across utterances, and uses another neural network to predict the RIR values. We add the proposed reverberation module to the phase spectrum predictor (PSP) of a HiNet vocoder and train the model jointly. Experimental results show that the proposed module helps model the reverberation effect and improves the perceived quality of generated reverberant speech. The UTV-RIR proved more robust than the GTI-RIR to unknown reverberation conditions and achieved a perceptually better reverberation effect.
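
The core operation described in the abstract, convolving a vocoder's output with an RIR, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the sample rate, RIR length, feature shapes, and the `apply_rir` and `predict_rir` helpers are all assumptions made for illustration, and the linear `predict_rir` is only a placeholder for the neural network that predicts the UTV-RIR.

```python
import numpy as np

def apply_rir(dry, rir):
    """Convolve a dry waveform with an RIR and trim to the input length."""
    return np.convolve(dry, rir, mode="full")[: len(dry)]

rng = np.random.default_rng(0)

# Stand-in for a vocoder output waveform (assumed: 1 s at 16 kHz).
dry = rng.standard_normal(16000)

# Assumed RIR length in samples; the abstract does not state one.
rir_len = 4000

# GTI: a single RIR vector shared by every utterance. In training it would be
# a learnable parameter; here it is set to a unit impulse (direct path only).
gti_rir = np.zeros(rir_len)
gti_rir[0] = 1.0
wet_gti = apply_rir(dry, gti_rir)

# UTV: one RIR per utterance, predicted from that utterance's features.
def predict_rir(features, weights):
    """Hypothetical placeholder: mean-pool frames, then project linearly."""
    return features.mean(axis=0) @ weights

features = rng.standard_normal((200, 80))            # assumed (frames, feature dim)
weights = 0.01 * rng.standard_normal((80, rir_len))  # assumed projection matrix
utv_rir = predict_rir(features, weights)
wet_utv = apply_rir(dry, utv_rir)
```

With the unit-impulse RIR the GTI path returns the dry waveform unchanged, which makes a convenient sanity check. The two variants differ only in where the RIR comes from: a fixed vector in the GTI case, a per-utterance prediction in the UTV case.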

research
10/29/2018

Neural source-filter-based waveform model for statistical parametric speech synthesis

Neural waveform models such as the WaveNet are used in many recent text-...
research
04/10/2021

Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN

We propose a unified approach to data-driven source-filter modeling usin...
research
04/27/2019

Neural source-filter waveform models for statistical parametric speech synthesis

Neural waveform models such as WaveNet have demonstrated better performa...
research
06/23/2019

A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis

This paper presents a neural vocoder named HiNet which reconstructs spee...
research
09/07/2020

Toward the pre-cocktail party problem with TasTas+

Deep neural network with dual-path bi-directional long short-term memory...
research
11/08/2020

Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation

This paper presents a denoising and dereverberation hierarchical neural ...
