Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses

11/29/2022
by Yang Ai, et al.

This paper presents a novel speech phase prediction model that uses neural networks to predict wrapped phase spectra directly from amplitude spectra. The proposed model is a cascade of a residual convolutional network and a parallel estimation architecture. The parallel estimation architecture consists of two parallel linear convolutional layers and a phase calculation formula. It imitates the process of calculating phase spectra from the real and imaginary parts of complex spectra, and it strictly restricts the predicted phase values to the principal value interval. To avoid the error expansion issue caused by phase wrapping, we design anti-wrapping training losses defined between the predicted wrapped phase spectra and the natural ones. These losses apply an anti-wrapping function to the instantaneous phase error, the group delay error, and the instantaneous angular frequency error. Experimental results show that the proposed neural speech phase prediction model outperforms the iterative Griffin-Lim algorithm and other neural network-based methods in terms of both reconstructed speech quality and generation speed.
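The two ideas in the abstract, the phase calculation formula and the anti-wrapping losses, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that the phase calculation formula behaves like a two-argument arctangent, that the anti-wrapping function maps an error to its smallest equivalent magnitude modulo 2π, and that group delay and instantaneous angular frequency are approximated by first differences along the frequency and time axes. The function names `phase_from_parts`, `anti_wrap`, and `anti_wrapping_losses` are hypothetical.

```python
import numpy as np


def phase_from_parts(pseudo_real, pseudo_imag):
    """Combine two parallel outputs into a wrapped phase in [-pi, pi].

    Assumption: the paper's phase calculation formula is modeled here
    with the two-argument arctangent.
    """
    return np.arctan2(pseudo_imag, pseudo_real)


def anti_wrap(x):
    """Assumed anti-wrapping function: distance from x to the nearest
    multiple of 2*pi, so errors that differ by 2*pi count as equal."""
    return np.abs(x - 2.0 * np.pi * np.round(x / (2.0 * np.pi)))


def anti_wrapping_losses(pred, target):
    """Return (instantaneous phase, group delay, angular frequency) losses.

    pred, target: wrapped phase arrays of shape (frames, frequency_bins).
    Assumption: derivatives are approximated by first differences.
    """
    ip = anti_wrap(pred - target).mean()
    # Group delay error: difference along the frequency axis.
    gd = anti_wrap(np.diff(pred, axis=1) - np.diff(target, axis=1)).mean()
    # Instantaneous angular frequency error: difference along the time axis.
    iaf = anti_wrap(np.diff(pred, axis=0) - np.diff(target, axis=0)).mean()
    return ip, gd, iaf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.uniform(-np.pi, np.pi, size=(6, 5))
    # Stand-ins for the outputs of the two parallel layers.
    pred = phase_from_parts(rng.normal(size=(6, 5)), rng.normal(size=(6, 5)))
    print(anti_wrapping_losses(pred, target))
```

Under these assumptions, a predicted phase of 3.1 rad against a natural phase of -3.1 rad gives a raw error of 6.2 rad but an anti-wrapped error of about 0.083 rad, which is the kind of error expansion the losses are meant to avoid.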
research
05/13/2023

APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra

This paper presents a novel neural vocoder named APNet which reconstruct...
research
06/23/2019

A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis

This paper presents a neural vocoder named HiNet which reconstructs spee...
research
08/17/2023

Long-frame-shift Neural Speech Phase Prediction with Spectral Continuity Enhancement and Interpolation Error Compensation

Speech phase prediction, which is a significant research focus in the fi...
research
02/20/2019

A Comprehensive Theory and Variational Framework for Anti-aliasing Sampling Patterns

In this paper, we provide a comprehensive theory of anti-aliasing sampli...
research
08/17/2023

Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement

Phase information has a significant impact on speech perceptual quality ...
research
07/10/2018

Phase reconstruction from amplitude spectrograms based on von-Mises-distribution deep neural network

This paper presents a deep neural network (DNN)-based phase reconstructi...
research
04/16/2020

Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders

In our previous work, we have proposed a neural vocoder called HiNet whi...