Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge

02/24/2022
by   Yen-Ju Lu, et al.

This paper describes our submission to Task 1 of the L3DAS22 Challenge, which addresses speech enhancement with 3D Ambisonic microphones. The core of our approach combines deep neural network (DNN) driven complex spectral mapping with linear beamformers such as the multi-frame multi-channel Wiener filter. The proposed system consists of two DNNs with a linear beamformer in between. Both DNNs are trained to perform complex spectral mapping, using a combination of waveform and magnitude spectrum losses. The signal estimated by the first DNN drives the linear beamformer, and the beamforming result, together with this enhanced signal, is used as an extra input to the second DNN, which refines the estimate. The linear beamformer and the second DNN are then run iteratively, starting from this refined estimate. The proposed method ranked first in the challenge, achieving a ranking metric of 0.984 on the evaluation set, versus 0.833 for the challenge baseline.
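The data flow summarized in the abstract (first DNN, linear beamformer, second DNN, then iteration) can be illustrated with a minimal NumPy sketch. Everything named here is an assumption made for illustration and is not taken from the paper: `dnn1` and `dnn2` are hypothetical callables standing in for the trained networks, `stack_frames`, `mfmcwf`, and `enhance` are illustrative helper names, and the tap counts, diagonal loading, and number of iterations are arbitrary. The beamformer uses one common least-squares formulation of a multi-frame multi-channel Wiener filter, which may differ in detail from the submitted system.

```python
import numpy as np

def stack_frames(Y, left, right):
    """Stack context frames. Y: (C, T, F) complex STFT -> (T, F, C*(left+right+1)).
    Edges are zero-padded. (Illustrative helper, not from the paper.)"""
    C, T, F = Y.shape
    padded = np.pad(Y, ((0, 0), (left, right), (0, 0)))
    taps = [padded[:, k:k + T, :] for k in range(left + right + 1)]
    stacked = np.concatenate(taps, axis=0)          # (C*K, T, F)
    return stacked.transpose(1, 2, 0)               # (T, F, C*K)

def mfmcwf(Y, S_hat, left=1, right=1, eps=1e-6):
    """Per-frequency filter w minimizing sum_t |S_hat - w^H Y_stacked|^2.
    Y: (C, T, F) mixture, S_hat: (T, F) current target estimate."""
    Yt = stack_frames(Y, left, right)                       # (T, F, D)
    R = np.einsum('tfd,tfe->fde', Yt, Yt.conj())            # sum_t y y^H, (F, D, D)
    p = np.einsum('tfd,tf->fd', Yt, S_hat.conj())           # sum_t y s^*, (F, D)
    R = R + eps * np.eye(Yt.shape[-1])                      # assumed diagonal loading
    w = np.linalg.solve(R, p[..., None])[..., 0]            # (F, D)
    return np.einsum('fd,tfd->tf', w.conj(), Yt)            # w^H y, (T, F)

def enhance(Y, dnn1, dnn2, n_iters=2):
    """DNN1 -> (beamformer -> DNN2) repeated n_iters times.
    dnn1(Y) and dnn2(Y, bf, S_hat) are hypothetical stand-ins returning (T, F)."""
    S_hat = dnn1(Y)                      # initial estimate from the first DNN
    for _ in range(n_iters):
        bf = mfmcwf(Y, S_hat)            # beamformer driven by current estimate
        S_hat = dnn2(Y, bf, S_hat)       # second DNN refines the estimate
    return S_hat
```

With real models, `dnn1` and `dnn2` would wrap network inference on the complex spectrogram; here any functions with matching shapes can be plugged in to check the control flow.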


