Using recurrences in time and frequency within U-net architecture for speech enhancement

11/16/2018
by Tomasz Grzywalski, et al.

When designing a fully convolutional neural network, there is a trade-off between receptive field size, number of parameters, and the spatial resolution of features in the deeper layers of the network. In this work we present a novel network design, based on a combination of many convolutional and recurrent layers, that resolves this trade-off. We compare our solution with U-net-based models known from the literature and with other baseline models on a speech enhancement task. We test our solution on TIMIT speech utterances combined with noise segments extracted from the NOISEX-92 database and show a clear advantage of the proposed solution in terms of SDR (signal-to-distortion ratio), SIR (signal-to-interference ratio), and STOI (short-time objective intelligibility) compared to the current state of the art.
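
The trade-off named in the abstract can be made concrete with a small calculation. The sketch below is illustrative only: the layer configurations and channel counts are hypothetical and are not taken from the paper. It uses the standard receptive-field recurrence for stacked convolutions along one axis and a plain parameter count for a 2D convolution with bias.

```python
def receptive_field(layers):
    """Return (receptive_field, cumulative_stride) along one axis.

    layers: list of (kernel_size, stride) tuples, one per conv/pool layer.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k-1) input steps
        jump *= stride             # striding coarsens the feature-map resolution
    return rf, jump


def conv2d_params(kernel, c_in, c_out):
    """Weights plus biases of a square 2D convolution."""
    return kernel * kernel * c_in * c_out + c_out


# Hypothetical stacks: four 3x3 layers, without and with stride 2.
plain = [(3, 1)] * 4
strided = [(3, 2)] * 4

print(receptive_field(plain))    # full resolution, small receptive field
print(receptive_field(strided))  # larger receptive field, resolution reduced 16x
print(4 * conv2d_params(3, 16, 16))  # same parameter count for both stacks
```

With the same number of parameters, the strided stack sees a wider context but at a coarser resolution, while the unstrided stack keeps resolution and would need more layers (and parameters) to match that context.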

research
06/08/2023

Convolutional Recurrent Neural Network with Attention for 3D Speech Enhancement

3D speech enhancement can effectively improve the auditory experience an...
research
02/03/2021

Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

Deep complex U-Net structure and convolutional recurrent network (CRN) s...
research
04/15/2019

RHR-Net: A Residual Hourglass Recurrent Neural Network for Speech Enhancement

Most current speech enhancement models use spectrogram features that req...
research
02/05/2021

Real-time Denoising and Dereverberation with Tiny Recurrent U-Net

Modern deep learning-based models have seen outstanding performance impr...
research
06/01/2023

A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models

We propose a multi-dimensional structured state space (S4) approach to s...
research
04/27/2017

Complex spectrogram enhancement by convolutional neural network with multi-metrics learning

This paper aims to address two issues existing in the current speech enh...
research
08/18/2019

Efficient Context Aggregation for End-to-End Speech Enhancement Using a Densely Connected Convolutional and Recurrent Network

In speech enhancement, an end-to-end deep neural network converts a nois...
