Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement

07/27/2021
by Siyuan Zhang, et al.

This paper addresses the problem of microphone array generalization for deep-learning-based end-to-end multichannel speech enhancement. We aim to train a single deep neural network (DNN) that can perform well on unseen microphone arrays. When a network is trained on a fixed microphone array, the array geometry shapes the network's parameters and thus restricts how well the trained network generalizes to another array. To resolve this problem, a single network is trained using data recorded by various microphone arrays of different geometries. We design three variants of our recently proposed narrowband network to cope with a variable number of microphones. Overall, the goal is to make the network learn the universal information for speech enhancement that is available for any array geometry, rather than the characteristics of one particular array. Experiments on both simulated and real room impulse responses (RIRs) demonstrate the excellent cross-array generalization capability of the proposed networks, in the sense that their performance measures are very close to, or even exceed, those of the network trained on the test arrays. Moreover, they notably outperform various beamforming methods and other advanced deep-learning-based methods.
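The abstract does not describe the three network variants, so the following is only a minimal sketch of one generic way a narrowband front end could accept any number of microphones: a shared per-channel transform followed by mean pooling over channels. It is not the authors' architecture. The function names, the tensor layout, and the toy `weight`/`bias` parameters are all assumptions made for illustration.

```python
import numpy as np


def narrowband_features(stft):
    """Split a complex STFT into per-frequency real/imaginary features.

    stft: complex array of shape (channels, freqs, frames).
    Returns a real array of shape (freqs, frames, channels, 2), so that
    each frequency bin can be processed independently (narrowband view).
    """
    feats = np.stack([stft.real, stft.imag], axis=-1)  # (C, F, T, 2)
    return np.transpose(feats, (1, 2, 0, 3))           # (F, T, C, 2)


def channel_pooled_embedding(feats, weight, bias):
    """Hypothetical channel-count-agnostic layer (illustrative assumption).

    The same weights are applied to every microphone channel, and the
    result is averaged over channels. The output shape therefore does not
    depend on the number of microphones or on their order.

    feats:  (freqs, frames, channels, 2)
    weight: (2, dim)
    bias:   (dim,)
    Returns (freqs, frames, dim).
    """
    hidden = np.tanh(feats @ weight + bias)  # (F, T, C, dim)
    return hidden.mean(axis=2)               # pool over the channel axis


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weight = rng.standard_normal((2, 8))
    bias = rng.standard_normal(8)
    for n_mics in (2, 6):
        stft = (rng.standard_normal((n_mics, 5, 10))
                + 1j * rng.standard_normal((n_mics, 5, 10)))
        emb = channel_pooled_embedding(narrowband_features(stft), weight, bias)
        print(n_mics, emb.shape)
```

Because the pooling collapses the channel axis, the same parameters can be used on arrays with different numbers of microphones. Whether the paper's variants use pooling, pairwise processing, or another mechanism is not stated in the abstract.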

research
07/17/2022

Multi-channel target speech enhancement based on ERB-scaled spatial coherence features

Recently, speech enhancement technologies that are based on deep learnin...
research
10/20/2021

One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement

With the recent surge of video conferencing tools usage, providing high-...
research
10/22/2020

Position-Agnostic Multi-Microphone Speech Dereverberation

Neural networks (NNs) have been widely applied in speech processing task...
research
11/03/2018

Deep Ad-hoc Beamforming

Deep learning based speech enhancement methods face two problems. First,...
research
11/16/2022

Array Configuration-Agnostic Personalized Speech Enhancement using Long-Short-Term Spatial Coherence

Personalized speech enhancement has been a field of active research for ...
research
10/22/2021

TADRN: Triple-Attentive Dual-Recurrent Network for Ad-hoc Array Multichannel Speech Enhancement

Deep neural networks (DNNs) have been successfully used for multichannel...
research
08/28/2023

Data-driven 3D Room Geometry Inference with a Linear Loudspeaker Array and a Single Microphone

Knowing the room geometry may be very beneficial for many audio applicat...
