Utterance Weighted Multi-Dilation Temporal Convolutional Networks for Monaural Speech Dereverberation

05/17/2022
by William Ravenscroft, et al.

Speech dereverberation is an important stage in many speech technology applications. Recent work in this area has been dominated by deep neural network models. Temporal convolutional networks (TCNs) are deep learning models that have been proposed for sequence modelling in speech dereverberation. In this work, a weighted multi-dilation depthwise-separable convolution is proposed to replace the standard depthwise-separable convolutions in TCN models. The proposed convolution enables the TCN to dynamically focus on more or less local information in its receptive field at each convolutional block in the network. The resulting weighted multi-dilation temporal convolutional network (WD-TCN) consistently outperforms the TCN across various model configurations, and using the WD-TCN is a more parameter-efficient way to improve performance than increasing the number of convolutional blocks. The best improvement over the baseline TCN is 0.55 dB scale-invariant signal-to-distortion ratio (SISDR), and the best-performing WD-TCN model attains 12.26 dB SISDR on the WHAMR dataset.
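For readers unfamiliar with the SISDR figures quoted above, the following is a minimal pure-Python sketch of the commonly used definition of the metric. It is not taken from the paper: the function name `si_sdr`, the `eps` stabiliser and the mean-removal step are assumptions chosen for illustration, and the paper's own evaluation code may differ in such details.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant signal-to-distortion ratio in dB.

    Hypothetical helper for illustration; `estimate` and `reference`
    are equal-length sequences of float samples.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")

    # Remove the mean of each signal (a common convention, assumed here).
    n = len(reference)
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Project the estimate onto the reference to get the scaled target.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]

    # Everything in the estimate that is not the scaled target is distortion.
    distortion = [e - t for e, t in zip(est, target)]

    target_energy = sum(t * t for t in target)
    distortion_energy = sum(d * d for d in distortion)
    return 10.0 * math.log10(target_energy / (distortion_energy + eps) + eps)


# Toy signals: a sinusoidal "clean" reference and two degraded estimates.
reference = [math.sin(0.1 * k) for k in range(200)]
mild = [r + 0.1 * math.cos(0.37 * k) for k, r in enumerate(reference)]
heavy = [r + 0.5 * math.cos(0.37 * k) for k, r in enumerate(reference)]

score_mild = si_sdr(mild, reference)
score_heavy = si_sdr(heavy, reference)
score_rescaled = si_sdr([2.0 * x for x in mild], reference)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is what makes the metric scale-invariant. An improvement such as the 0.55 dB quoted above is a difference between two such scores computed against the same references.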


research
10/27/2022

Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation

Speech separation models are used for isolating individual speakers in m...
research
05/21/2020

Formant Tracking Using Dilated Convolutional Networks Through Dense Connection with Gating Mechanism

Formant tracking is one of the most fundamental problems in speech proce...
research
04/13/2022

Receptive Field Analysis of Temporal Convolutional Networks for Monaural Speech Dereverberation

Speech dereverberation is often an important requirement in robust speec...
research
02/12/2019

FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks

Deep dilated temporal convolutional networks (TCN) have been proved to b...
research
04/04/2019

Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions

We propose a fully convolutional sequence-to-sequence encoder architectu...
research
06/09/2023

Domestic Activities Classification from Audio Recordings Using Multi-scale Dilated Depthwise Separable Convolutional Network

Domestic activities classification (DAC) from audio recordings aims at c...
research
09/04/2019

Deep Convolutional Networks in System Identification

Recent developments within deep learning are relevant for nonlinear syst...
