Efficient Multi-Channel Speech Enhancement with Spherical Harmonics Injection for Directional Encoding

09/19/2023
by Jiahui Pan, et al.

Multi-channel speech enhancement extracts speech using multiple microphones that capture spatial cues, so making effective use of directional information is key. Deep learning shows great potential for multi-channel speech enhancement, and such models often take short-time Fourier transform (STFT) coefficients directly as input. To fully leverage the spatial information, we introduce a method that uses spherical harmonics transform (SHT) coefficients as auxiliary model inputs. These coefficients concisely represent spatial distributions. Specifically, our model has two encoders, one for the STFT and another for the SHT. The decoder fuses the outputs of both encoders to estimate the enhanced STFT, which lets the model incorporate spatial context. Evaluations on TIMIT under varying noise and reverberation show that our model outperforms established benchmarks. Remarkably, this is achieved with fewer computations and parameters. By leveraging spherical harmonics to incorporate directional cues, our model efficiently improves multi-channel speech enhancement performance.
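To make the idea of SHT coefficients as an auxiliary input more concrete, here is a minimal sketch of one common way to obtain them from a multi-channel STFT: a least-squares projection onto real spherical harmonics up to order 1. The abstract does not say how the coefficients are computed, so everything here is an assumption made for illustration: the function names `real_sh_order1` and `sht_coefficients` are hypothetical, as are the order-1 truncation, the least-squares fit, and the requirement that the microphone directions are known.

```python
import numpy as np

def real_sh_order1(azimuth, colatitude):
    """Real spherical harmonics of orders 0 and 1 at the given directions.

    Returns an array of shape (num_mics, 4) with columns
    [Y_0^0, Y_1^-1, Y_1^0, Y_1^1]. Angles are in radians.
    """
    azimuth = np.asarray(azimuth, dtype=float)
    colatitude = np.asarray(colatitude, dtype=float)
    c0 = 0.5 / np.sqrt(np.pi)            # normalization of Y_0^0
    c1 = np.sqrt(3.0 / (4.0 * np.pi))    # normalization of the order-1 terms
    sin_t = np.sin(colatitude)
    return np.stack(
        [
            np.full_like(azimuth, c0),       # omnidirectional component
            c1 * sin_t * np.sin(azimuth),    # y-directed component
            c1 * np.cos(colatitude),         # z-directed component
            c1 * sin_t * np.cos(azimuth),    # x-directed component
        ],
        axis=-1,
    )

def sht_coefficients(stft, azimuth, colatitude):
    """Least-squares SHT of a multi-channel STFT (assumed formulation).

    stft: complex array of shape (num_mics, num_freqs, num_frames).
    Returns a complex array of shape (4, num_freqs, num_frames).
    """
    Y = real_sh_order1(azimuth, colatitude)   # (num_mics, 4)
    Y_pinv = np.linalg.pinv(Y)                # (4, num_mics)
    # Apply the pseudo-inverse across the microphone axis for every bin.
    return np.tensordot(Y_pinv, stft, axes=([1], [0]))
```

In a dual-encoder setup like the one described, an array such as the output of `sht_coefficients` would be fed to the SHT encoder while the raw STFT goes to the other encoder. With fewer than four microphones, or with directions that make the harmonic matrix rank-deficient, the pseudo-inverse returns a minimum-norm solution rather than an exact transform.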


research
09/19/2023

Hierarchical Modeling of Spatial Cues via Spherical Harmonics for Multi-Channel Speech Enhancement

Multi-channel speech enhancement utilizes spatial information from multi...
research
03/13/2023

Guided Speech Enhancement Network

High quality speech capture has been widely studied for both voice commu...
research
09/19/2023

PDPCRN: Parallel Dual-Path CRN with Bi-directional Inter-Branch Interactions for Multi-Channel Speech Enhancement

Multi-channel speech enhancement seeks to utilize spatial information to...
research
07/17/2022

Multi-channel target speech enhancement based on ERB-scaled spatial coherence features

Recently, speech enhancement technologies that are based on deep learnin...
research
12/09/2021

A Training Framework for Stereo-Aware Speech Enhancement using Deep Neural Networks

Deep learning-based speech enhancement has shown unprecedented performan...
research
04/22/2021

Nonlinear Spatial Filtering in Multichannel Speech Enhancement

The majority of multichannel speech enhancement algorithms are two-step ...
research
11/08/2021

Inter-channel Conv-TasNet for multichannel speech enhancement

Speech enhancement in multichannel settings has been realized by utilizi...
