Hierarchical Modeling of Spatial Cues via Spherical Harmonics for Multi-Channel Speech Enhancement

09/19/2023
by   Jiahui Pan, et al.

Multi-channel speech enhancement uses spatial information from multiple microphones to extract the target speech. However, most existing methods do not model spatial cues explicitly and instead rely on learning them implicitly from multi-channel spectra. To better leverage spatial information, we propose to model it explicitly by applying the spherical harmonic transform (SHT) to the multi-channel input. Specifically, we introduce a hierarchical framework in which lower-order harmonics, which capture broader spatial patterns, are estimated first and then combined with higher orders to recursively predict finer spatial details. Experiments on TIMIT demonstrate that the proposed method effectively recovers target spatial patterns and outperforms baseline models while using fewer parameters and less computation. Modeling spatial information explicitly and hierarchically thus enables more effective multi-channel speech enhancement.
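
To make the SHT step concrete, the following is a minimal sketch of how multi-channel spectra could be projected onto spherical harmonics of order 0 and 1. It is not the authors' implementation: the abstract does not say how the transform is computed, so the least-squares fit, the real-valued basis, the helper names (`real_sh_basis`, `sht`), and the six-microphone layout are all assumptions made for illustration.

```python
import numpy as np

def real_sh_basis(directions):
    """Real spherical harmonics up to order 1 at the given directions.

    directions: (M, 3) array of microphone direction vectors (assumed layout).
    Returns an (M, 4) matrix with columns [Y_0^0, Y_1^-1, Y_1^0, Y_1^1].
    """
    d = np.asarray(directions, dtype=float)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)  # project onto unit sphere
    x, y, z = d[:, 0], d[:, 1], d[:, 2]
    c0 = 0.5 / np.sqrt(np.pi)              # order-0 normalisation
    c1 = np.sqrt(3.0 / (4.0 * np.pi))      # order-1 normalisation
    return np.stack([np.full_like(x, c0), c1 * y, c1 * z, c1 * x], axis=1)

def sht(multichannel, directions):
    """Least-squares SHT (an assumed formulation, not taken from the paper).

    multichannel: (M, ...) array, e.g. complex spectra of shape (M, F, T).
    Returns coefficients of shape (4, ...): index 0 is order 0, 1..3 are order 1.
    """
    Y = real_sh_basis(directions)
    flat = multichannel.reshape(multichannel.shape[0], -1)
    coeffs = np.linalg.pinv(Y) @ flat      # solve Y @ coeffs ~= signals
    return coeffs.reshape((Y.shape[1],) + multichannel.shape[1:])

# Hypothetical six-microphone array pointing along +/-x, +/-y, +/-z.
mic_dirs = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                     [0, -1, 0], [0, 0, 1], [0, 0, -1]])

# Synthesise spectra from known coefficients, then recover them.
rng = np.random.default_rng(0)
true_coeffs = rng.standard_normal((4, 5, 3)) + 1j * rng.standard_normal((4, 5, 3))
spec = np.tensordot(real_sh_basis(mic_dirs), true_coeffs, axes=1)  # (6, 5, 3)
est = sht(spec, mic_dirs)

order0, order1 = est[:1], est[1:]  # coarse pattern first, finer detail second
```

In a hierarchical scheme like the one described above, a model would presumably estimate the target's `order0` coefficients first and then condition the `order1` prediction on them; that part is learned and is not sketched here.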

research
09/19/2023

Efficient Multi-Channel Speech Enhancement with Spherical Harmonics Injection for Directional Encoding

Multi-channel speech enhancement extracts speech using multiple micropho...
research
09/10/2023

Gray Jedi MVDR Post-filtering

Spatial filters can exploit deep-learning-based speech enhancement model...
research
03/13/2023

Guided Speech Enhancement Network

High quality speech capture has been widely studied for both voice commu...
research
10/17/2022

Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement

Recently, multi-channel speech enhancement has drawn much interest due t...
research
07/17/2022

Multi-channel target speech enhancement based on ERB-scaled spatial coherence features

Recently, speech enhancement technologies that are based on deep learnin...
research
09/19/2023

PDPCRN: Parallel Dual-Path CRN with Bi-directional Inter-Branch Interactions for Multi-Channel Speech Enhancement

Multi-channel speech enhancement seeks to utilize spatial information to...
