SkipConvGAN: Monaural Speech Dereverberation using Generative Adversarial Networks via Complex Time-Frequency Masking

11/22/2022
by   Vinay Kothapally, et al.

With advances in deep learning, the performance of speech enhancement systems in the presence of background noise has improved significantly. However, improving a system's robustness against reverberation is still a work in progress, as reverberation tends to cause loss of formant structure due to smearing effects in time and frequency. A wide range of deep learning-based systems either enhance the magnitude response and reuse the distorted phase, or enhance the complex spectrogram using a complex time-frequency mask. Though these approaches have demonstrated satisfactory performance, they do not directly address the formant structure lost to reverberation. We believe that retrieving the formant structure can help improve the efficiency of existing systems. In this study, we propose SkipConvGAN, an extension of our prior work SkipConvNet. The proposed system's generator network tries to estimate an efficient complex time-frequency mask, while the discriminator network aids in driving the generator to restore the lost formant structure. We evaluate the performance of our proposed system on simulated and real recordings of reverberant speech from the single-channel task of the REVERB challenge corpus. The proposed system shows a consistent improvement across multiple room configurations over other deep learning-based generative adversarial frameworks.
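To make the distinction between the two families of approaches concrete, the following is a minimal NumPy sketch, not the authors' implementation. It contrasts a magnitude-only mask, which reuses the reverberant phase, with a complex time-frequency mask, which can correct both magnitude and phase. Random complex arrays stand in for STFT spectrograms, the function names are hypothetical, and the mask here is an oracle computed from a known clean signal, whereas in SkipConvGAN the mask is estimated by the generator network.

```python
import numpy as np

def apply_complex_mask(reverb_stft, mask):
    """Element-wise complex multiplication: alters magnitude and phase."""
    return mask * reverb_stft

def apply_magnitude_mask(reverb_stft, mag_mask):
    """Scale the magnitude only; the reverberant phase is reused as-is."""
    return mag_mask * np.abs(reverb_stft) * np.exp(1j * np.angle(reverb_stft))

def oracle_complex_mask(clean_stft, reverb_stft, eps=1e-12):
    """Oracle complex ratio mask M such that M * reverb is (almost) clean."""
    return clean_stft * np.conj(reverb_stft) / (np.abs(reverb_stft) ** 2 + eps)

# Stand-in "spectrograms" (frequency bins x frames); assumption: random data.
rng = np.random.default_rng(0)
shape = (257, 100)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
reverb = clean + 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

# Complex masking recovers both magnitude and phase of the clean signal.
est_complex = apply_complex_mask(reverb, oracle_complex_mask(clean, reverb))

# Magnitude masking matches the clean magnitude but keeps the distorted phase.
oracle_mag_mask = np.abs(clean) / (np.abs(reverb) + 1e-12)
est_mag = apply_magnitude_mask(reverb, oracle_mag_mask)

err_complex = np.mean(np.abs(est_complex - clean) ** 2)
err_mag = np.mean(np.abs(est_mag - clean) ** 2)
print(f"complex-mask error: {err_complex:.2e}, magnitude-mask error: {err_mag:.2e}")
```

Even with a perfect magnitude estimate, the magnitude-only result retains a residual error that comes entirely from the reused phase, which is the limitation that complex masking is meant to address.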


research
12/19/2020

DCCRGAN: Deep Complex Convolution Recurrent Generator Adversarial Network for Speech Enhancement

Generative adversarial network (GAN) still exists some problems in deali...
research
04/17/2021

Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement

The intelligibility of speech severely degrades in the presence of envir...
research
08/19/2019

Propagation Channel Modeling by Deep learning Techniques

Channel, as the medium for the propagation of electromagnetic waves, is ...
research
10/02/2018

Phasebook and Friends: Leveraging Discrete Representations for Source Separation

Deep learning based speech enhancement and source separation systems hav...
research
09/18/2023

Refining DNN-based Mask Estimation using CGMM-based EM Algorithm for Multi-channel Noise Reduction

In this paper, we present a method that allows to further improve speech...
research
08/09/2021

Time-Frequency Localization Using Deep Convolutional Maxout Neural Network in Persian Speech Recognition

In this paper, a CNN-based structure for the time-frequency localization...
research
10/22/2019

Speech-VGG: A deep feature extractor for speech processing

A growing number of studies in the field of speech processing employ fea...
