
Frequency Gating: Improved Convolutional Neural Networks for Speech Enhancement in the Time-Frequency Domain

by Koen Oostermeijer, et al.

One of the strengths of traditional convolutional neural networks (CNNs) is their inherent translational invariance. For speech enhancement in the time-frequency domain, however, this property cannot be fully exploited, because spectrograms are not invariant in the frequency direction. In this paper we propose to remedy this inefficiency with a method, which we call Frequency Gating, that computes multiplicative weights for the kernels of the CNN in order to make them frequency dependent. Three mechanisms are explored: temporal gating, in which the weights depend on prior time frames; local gating, in which the weights are generated from a single time frame and the frames adjacent to it; and frequency-wise gating, in which each kernel is assigned a weight that is independent of the input data. Experiments with an autoencoder neural network with skip connections show that both local and frequency-wise gating outperform the baseline and are therefore viable ways to improve CNN-based speech enhancement networks. In addition, a loss function based on the extended short-time objective intelligibility score (ESTOI) is introduced, which we show to outperform the standard mean squared error (MSE) loss function.
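The frequency-wise variant can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`conv2d_same`, `frequency_wise_gated_conv`), the tensor shapes, and the use of a sigmoid to squash the gate values are all assumptions made for illustration. It relies only on the fact that convolution is linear, so scaling a kernel's output at frequency bin f by a gate g[c, f] is equivalent to scaling the kernel itself when that output row is computed.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(x, kernels):
    """Zero-padded 2-D cross-correlation.

    x:       spectrogram of shape (F, T), frequency by time
    kernels: array of shape (C, kf, kt) with odd kf and kt
    returns: feature maps of shape (C, F, T)
    """
    C, kf, kt = kernels.shape
    F, T = x.shape
    pf, pt = kf // 2, kt // 2
    xp = np.pad(x, ((pf, pf), (pt, pt)))
    out = np.zeros((C, F, T))
    for c in range(C):
        for i in range(kf):
            for j in range(kt):
                # Accumulate the shifted input scaled by one kernel tap.
                out[c] += kernels[c, i, j] * xp[i:i + F, j:j + T]
    return out

def frequency_wise_gated_conv(x, kernels, gate_logits):
    """Apply one gate per (kernel, frequency bin), independent of the input.

    gate_logits: trainable parameters of shape (C, F) (hypothetical name).
    The sigmoid is an assumed choice that keeps gates in (0, 1).
    """
    gates = sigmoid(gate_logits)                          # (C, F)
    return conv2d_same(x, kernels) * gates[:, :, None]    # broadcast over time

# Toy example: 8 frequency bins, 5 time frames, 3 kernels of size 3x3.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5))
kernels = rng.standard_normal((3, 3, 3))
gate_logits = np.zeros((3, 8))   # every gate starts at sigmoid(0) = 0.5
y = frequency_wise_gated_conv(x, kernels, gate_logits)
```

In a trained network the gate parameters would be learned alongside the kernels, so each kernel could be emphasized in some frequency bands and suppressed in others. Temporal and local gating would instead compute the gates from the input frames, which this sketch does not attempt.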




A Perceptual Weighting Filter Loss for DNN Training in Speech Enhancement

Single-channel speech enhancement with deep neural networks (DNNs) has s...

Spatial Frequency Loss for Learning Convolutional Autoencoders

This paper presents a learning method for convolutional autoencoders (CA...

AMRConvNet: AMR-Coded Speech Enhancement Using Convolutional Neural Networks

Speech is converted to digital signals using speech coding for efficient...

Data-driven design of perfect reconstruction filterbank for DNN-based sound source enhancement

We propose a data-driven design method of perfect-reconstruction filterb...

Extending DNN-based Multiplicative Masking to Deep Subband Filtering for Improved Dereverberation

In this paper, we present a scheme for extending deep neural network-bas...