Receptive Field Analysis of Temporal Convolutional Networks for Monaural Speech Dereverberation

04/13/2022
by William Ravenscroft, et al.

Speech dereverberation is often an important requirement in robust speech processing tasks. Supervised deep learning (DL) models give state-of-the-art performance for single-channel speech dereverberation. Temporal convolutional networks (TCNs) are commonly used for sequence modelling in speech enhancement tasks. A feature of TCNs is that their receptive field (RF), which depends on the specific model configuration, determines the number of input frames that can be observed to produce an individual output frame. It has been shown that TCNs are capable of dereverberating simulated speech data; however, a thorough analysis, especially one focused on the RF, is still lacking in the literature. This paper analyses dereverberation performance as a function of the model size and the RF of TCNs. Experiments using the WHAMR corpus, extended to include room impulse responses (RIRs) with larger RT60 values, demonstrate that a larger RF can significantly improve performance when training smaller TCN models. It is also demonstrated that TCNs benefit from a wider RF when dereverberating speech with RIRs that have larger RT60 values.
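
To make the dependence of the RF on model configuration concrete, the sketch below counts the input frames seen by one output frame. It assumes a Conv-TasNet-style layout that the abstract does not spell out: each stack holds a number of dilated convolutional blocks whose dilation doubles from block to block (1, 2, 4, ...), and the stack is repeated several times. The function name, the parameter names and the example configuration are hypothetical and are not taken from the paper.

```python
def tcn_receptive_field(kernel_size: int, blocks_per_stack: int, num_stacks: int) -> int:
    """Receptive field, in frames, of a stacked dilated TCN.

    Assumed layout (not confirmed by the abstract): block i in every stack
    uses dilation 2**i, and each block contains one dilated convolution.
    """
    if min(kernel_size, blocks_per_stack, num_stacks) < 1:
        raise ValueError("all arguments must be positive integers")

    rf = 1  # a single output frame always sees at least one input frame
    for _ in range(num_stacks):
        for i in range(blocks_per_stack):
            dilation = 2 ** i
            # each dilated convolution widens the view by (kernel_size - 1) * dilation frames
            rf += (kernel_size - 1) * dilation
    return rf


# Hypothetical configuration: kernel size 3, 8 blocks per stack, 3 stacks.
print(tcn_receptive_field(kernel_size=3, blocks_per_stack=8, num_stacks=3))  # 1531
```

Under these assumptions, adding one block to each stack roughly doubles the RF, whereas adding another stack only increases it linearly, so the two settings trade model size against temporal context in different ways. Multiplying the frame count by the encoder hop duration gives the RF in seconds, which can then be compared with the RT60 of the RIRs.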


research
10/27/2022

Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation

Speech separation models are used for isolating individual speakers in m...
research
05/17/2022

Utterance Weighted Multi-Dilation Temporal Convolutional Networks for Monaural Speech Dereverberation

Speech dereverberation is an important stage in many speech technology a...
research
09/05/2019

Receptive-field-regularized CNN variants for acoustic scene classification

Acoustic scene classification and related tasks have been dominated by C...
research
07/03/2019

The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification

Convolutional Neural Networks (CNNs) have had great success in many mach...
research
04/06/2022

FFC-SE: Fast Fourier Convolution for Speech Enhancement

Fast Fourier convolution (FFC) is the recently proposed neural operator ...
research
02/02/2014

Collaborative Receptive Field Learning

The challenge of object categorization in images is largely due to arbit...
research
05/14/2021

Predicting speech intelligibility from EEG using a dilated convolutional network

Objective: Currently, only behavioral speech understanding tests are ava...
