SkipConvNet: Skip Convolutional Neural Network for Speech Dereverberation using Optimally Smoothed Spectral Mapping

by Vinay Kothapally, et al.

Recent studies have demonstrated the reliability of fully convolutional networks (FCNs) in many speech applications. One of the most popular FCN variants is the 'U-Net', an encoder-decoder network with skip connections. In this study, we propose 'SkipConvNet', in which each skip connection is replaced with multiple convolutional modules that provide the decoder with intuitive feature maps, rather than the encoder's output, to improve the learning capacity of the network. We also propose optimal smoothing of the power spectral density (PSD) as a pre-processing step, which further enhances the efficiency of the network. To evaluate the proposed system, we use the REVERB challenge corpus to assess the performance of various enhancement approaches under the same conditions. We focus solely on monitoring improvements in speech quality and their contribution to improving the efficiency of back-end speech systems, such as speech recognition and speaker verification, trained only on clean speech. Experimental findings show that the proposed system consistently outperforms other approaches.
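As a rough illustration of the PSD smoothing pre-processing idea, the sketch below applies first-order recursive smoothing along time to a sequence of per-frame power spectra. It is a simplification under stated assumptions: the smoothing factor `alpha` is fixed, whereas an optimal smoothing scheme would adapt it per frame and per frequency bin. The function name `smooth_psd`, the value of `alpha`, and the toy input are hypothetical and not taken from the paper.

```python
def smooth_psd(power_frames, alpha=0.8):
    """Recursively smooth per-frame power spectra along the time axis.

    power_frames: list of frames, each a list of |X(l, k)|^2 values per bin k.
    alpha: fixed smoothing factor in [0, 1). This is an assumption; an
           optimal smoother would vary alpha over time and frequency.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    smoothed = []
    previous = None
    for frame in power_frames:
        if previous is None:
            # Initialise the recursion with the first observed frame.
            current = list(frame)
        else:
            # P(l, k) = alpha * P(l-1, k) + (1 - alpha) * |X(l, k)|^2
            current = [alpha * p + (1.0 - alpha) * x
                       for p, x in zip(previous, frame)]
        smoothed.append(current)
        previous = current
    return smoothed


# Toy input: three frames, two frequency bins each (made-up values).
frames = [[1.0, 4.0], [3.0, 0.0], [1.0, 4.0]]
result = smooth_psd(frames, alpha=0.5)
```

A larger `alpha` yields a smoother but slower-tracking PSD estimate. A network operating on spectral features could take the smoothed frames as input in place of the raw power spectra.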




Speech Dereverberation Using Fully Convolutional Networks

Speech dereverberation using a single microphone is addressed in this pap...

Late reverberation suppression using U-nets

In real-world settings, speech signals are almost always affected by rev...

Complex Spectral Mapping With Attention Based Convolution Recurrent Neural Network for Speech Enhancement

Speech enhancement has benefited from the success of deep learning in te...

Deep Convolutional Neural Network-based Inverse Filtering Approach for Speech De-reverberation

In this paper, we introduce a spectral-domain inverse filtering approach...

Select, Attend, and Transfer: Light, Learnable Skip Connections

Skip connections in deep networks have improved both segmentation and cl...

Multifidelity data fusion in convolutional encoder/decoder networks

We analyze the regression accuracy of convolutional neural networks asse...

Image to Image Translation based on Convolutional Neural Network Approach for Speech Declipping

Clipping, as a current nonlinear distortion, often occurs due to the lim...