Array Configuration-Agnostic Personalized Speech Enhancement using Long-Short-Term Spatial Coherence

11/16/2022
by Yicheng Hsu, et al.

Personalized speech enhancement (PSE) is an active field of research on the suppression of speech-like interferers such as competing speakers or TV dialogue. Compared with single-channel approaches, multichannel PSE systems can be more effective in adverse acoustic conditions because they leverage the spatial information in the microphone signals. However, implementing multichannel PSE systems that accommodate the wide range of array topologies found in household applications can be challenging. To develop an array-configuration-agnostic PSE system, we define a spatial feature termed the long-short-term spatial coherence (LSTSC) and use it as the input feature to a convolutional recurrent network that monitors the voice activity of the target speaker. As a further refinement, an equivalent rectangular bandwidth (ERB)-scaled LSTSC feature can be used to reduce the computational cost. Experiments were conducted to compare the proposed PSE systems, in both the complete and the simplified versions, with two baselines, using unseen room responses and array configurations in the presence of TV noise and competing speakers. The results demonstrated that the proposed multichannel PSE network trained with the LSTSC feature achieved superior enhancement performance without precise knowledge of the array configurations and room responses.
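
The ERB scaling mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the LSTSC definition or the band layout, so the code below only shows the general idea of pooling a per-frequency-bin feature into ERB-spaced bands to shrink the network input. It assumes the Glasberg-Moore ERB-number scale, and the function names, the band count of 32, and the 257-bin feature size are all hypothetical choices made for illustration.

```python
import numpy as np

def hz_to_erb_number(f_hz):
    """Glasberg-Moore ERB-number scale (assumed here, not taken from the paper)."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))

def erb_number_to_hz(e):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (np.asarray(e, dtype=float) / 21.4) - 1.0) / 0.00437

def erb_pooling_matrix(n_bins, sample_rate, n_bands):
    """Return an (n_bands, n_bins) matrix that averages STFT bins within ERB-spaced bands."""
    nyquist = sample_rate / 2.0
    freqs = np.linspace(0.0, nyquist, n_bins)
    # Band edges are equally spaced on the ERB-number axis, then mapped back to Hz.
    edges = erb_number_to_hz(np.linspace(0.0, hz_to_erb_number(nyquist), n_bands + 1))
    # Assign every bin to exactly one band; clip so the Nyquist bin lands in the last band.
    band = np.clip(np.searchsorted(edges, freqs, side="right") - 1, 0, n_bands - 1)
    pool = np.zeros((n_bands, n_bins))
    pool[band, np.arange(n_bins)] = 1.0
    # Normalize each non-empty band so it averages its bins; empty bands stay all-zero.
    counts = pool.sum(axis=1, keepdims=True)
    return pool / np.maximum(counts, 1.0)

# Hypothetical usage: a per-bin spatial feature of shape (bins, frames).
pool = erb_pooling_matrix(n_bins=257, sample_rate=16000, n_bands=32)
feature = np.random.default_rng(0).random((257, 10))
pooled = pool @ feature  # shape (32, 10): fewer inputs per frame for the network
```

With these illustrative settings the feature dimension per frame drops from 257 to 32, which is the kind of reduction that would lower the cost of the network's input layers. Low-frequency bands contain few bins and high-frequency bands contain many, mirroring the frequency resolution of the ERB scale.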

research
07/17/2022

Multi-channel target speech enhancement based on ERB-scaled spatial coherence features

Recently, speech enhancement technologies that are based on deep learnin...
research
12/10/2021

Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features

Teleconferencing is becoming essential during the COVID-19 pandemic. How...
research
10/20/2021

One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement

With the recent surge of video conferencing tools usage, providing high-...
research
07/27/2021

Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement

This paper addresses the problem of microphone array generalization for ...
research
03/13/2023

Guided Speech Enhancement Network

High quality speech capture has been widely studied for both voice commu...
research
11/05/2022

Breaking the trade-off in personalized speech enhancement with cross-task knowledge distillation

Personalized speech enhancement (PSE) models achieve promising results c...
research
09/09/2021

BeamTransformer: Microphone Array-based Overlapping Speech Detection

We propose BeamTransformer, an efficient architecture to leverage beamfo...
