Modified Parametric Multichannel Wiener Filter for Low-latency Enhancement of Speech Mixtures with Unknown Number of Speakers

06/29/2023
by   Ning Guo, et al.

This paper introduces a novel low-latency online beamforming (BF) algorithm, named Modified Parametric Multichannel Wiener Filter (Mod-PMWF), for enhancing speech mixtures with an unknown and varying number of speakers. Conventional BFs, such as the linearly constrained minimum variance BF (LCMV BF), can enhance a speech mixture, but they typically require attributes of the mixture such as the number of speakers and the acoustic transfer functions (ATFs) from the speakers to the microphones. When these attributes are unavailable, estimating them with low-latency processing is challenging, which hinders the application of such BFs to this problem. In this paper, we overcome the problem by modifying a conventional Parametric Multichannel Wiener Filter (PMWF). The proposed Mod-PMWF adaptively forms a directivity pattern that enhances all the speakers in the mixture without explicitly estimating these attributes. Our experiments show the proposed BF's effectiveness in terms of interference reduction ratios and subjective listening tests.
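For orientation, the conventional PMWF that the proposed method starts from can be sketched as follows. This is not the Mod-PMWF: the abstract does not describe the modification, so only the standard textbook form w = (Φn⁻¹ Φs u) / (β + tr(Φn⁻¹ Φs)) is shown, for a single frequency bin. The function names, argument names, and the assumption that the speech and noise spatial covariance matrices are already available are illustrative choices, not taken from the paper.

```python
import numpy as np


def pmwf_weights(phi_s, phi_n, beta=1.0, ref_mic=0):
    """Conventional PMWF weights for one frequency bin (illustrative sketch).

    phi_s:   (M, M) spatial covariance of the target speech (assumed given)
    phi_n:   (M, M) spatial covariance of noise/interference (assumed given)
    beta:    trade-off parameter; 0 gives MVDR, 1 gives the multichannel
             Wiener filter
    ref_mic: index of the reference microphone
    """
    # Solve phi_n @ X = phi_s instead of forming the inverse explicitly.
    ratio = np.linalg.solve(phi_n, phi_s)
    # The trace is real for Hermitian inputs; drop numerical imaginary residue.
    lam = np.trace(ratio).real
    # Selecting column ref_mic is the same as multiplying by the unit vector u.
    return ratio[:, ref_mic] / (beta + lam)


def apply_beamformer(w, x):
    """Apply y = w^H x to STFT frames x of shape (M, T) for one bin."""
    return w.conj() @ x
```

With a single source and beta set to 0, the filter passes the source image at the reference microphone undistorted; larger beta values trade some speech distortion for more noise reduction. Note that this conventional form needs a speech covariance estimate, which is the kind of mixture attribute the abstract says is hard to obtain at low latency.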


