Speaker-Conditional Chain Model for Speech Separation and Extraction

06/25/2020
by Jing Shi, et al.

Speech separation has been extensively explored to tackle the cocktail party problem. However, existing studies still fall well short of the generalization needed for real scenarios. In this work, we propose a general strategy, the Speaker-Conditional Chain Model, for processing complex speech recordings. The model first infers the identities of a variable number of speakers from the observation using a sequence-to-sequence model. It then uses the information from the inferred speakers as conditions to extract their speech sources. Because the speaker information is predicted from the whole observation, the model helps address both conventional speech separation and speaker extraction in multi-round long recordings. Experiments on standard fully-overlapped speech separation benchmarks show results comparable to prior studies, while the proposed model adapts better to multi-round long recordings.
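The two-stage flow described above can be sketched as a toy control loop. This is only an illustration under stated assumptions, not the authors' implementation: the names `infer_speakers`, `extract_sources`, `toy_next_speaker`, and `toy_extract` are hypothetical, the mixture is a hand-made list of frames with oracle speaker labels, and the two `toy_*` functions stand in for the neural sequence-to-sequence and extraction modules.

```python
from typing import Callable, Dict, List, Optional

# Toy data model (assumption): each frame maps a speaker label to an amplitude.
Frame = Dict[str, float]
Mixture = List[Frame]


def infer_speakers(
    mixture: Mixture,
    next_speaker: Callable[[Mixture, List[str]], Optional[str]],
    max_speakers: int = 8,
) -> List[str]:
    """Stage 1: emit speakers one at a time, each conditioned on the whole
    observation and on the speakers already emitted, until a stop signal."""
    speakers: List[str] = []
    while len(speakers) < max_speakers:
        candidate = next_speaker(mixture, speakers)
        if candidate is None:  # stop signal: no further speaker inferred
            break
        speakers.append(candidate)
    return speakers


def extract_sources(
    mixture: Mixture,
    speakers: List[str],
    extract: Callable[[Mixture, str], List[float]],
) -> Dict[str, List[float]]:
    """Stage 2: extract one source per inferred speaker, using that speaker
    as the condition."""
    return {spk: extract(mixture, spk) for spk in speakers}


def toy_next_speaker(mixture: Mixture, found: List[str]) -> Optional[str]:
    """Hypothetical stand-in for the sequence-to-sequence speaker decoder."""
    for frame in mixture:
        for label in frame:
            if label not in found:
                return label
    return None


def toy_extract(mixture: Mixture, speaker: str) -> List[float]:
    """Hypothetical stand-in for the speaker-conditioned extraction network."""
    return [frame.get(speaker, 0.0) for frame in mixture]


# A short "multi-round" recording: A speaks, A and B overlap, B speaks, A returns.
mixture = [{"A": 0.5}, {"A": 0.2, "B": 0.7}, {"B": 0.1}, {"A": 0.3}]
speakers = infer_speakers(mixture, toy_next_speaker)
sources = extract_sources(mixture, speakers, toy_extract)
print(speakers)
print(sources)
```

In this sketch the number of speakers is not fixed in advance: the loop ends when the stand-in decoder returns `None`. Because each speaker is inferred from the whole recording, speaker A's later turn lands in the same output track as the earlier one.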

research
06/02/2020

Neural Speaker Diarization with Speaker-Wise Chain Rule

Speaker diarization is an essential step for processing multi-speaker au...
research
02/08/2021

Speaker and Direction Inferred Dual-channel Speech Separation

Most speech separation methods, trying to separate all channel sources s...
research
12/17/2020

Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording

Leveraging additional speaker information to facilitate speech separatio...
research
04/04/2022

An Initialization Scheme for Meeting Separation with Spatial Mixture Models

Spatial mixture model (SMM) supported acoustic beamforming has been exte...
research
05/14/2020

FaceFilter: Audio-visual speech separation using still images

The objective of this paper is to separate a target speaker's speech fro...
research
03/26/2014

Constrained speaker linking

In this paper we study speaker linking (a.k.a. partitioning) given const...
research
03/30/2022

Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks

Because the performance of speech separation is excellent for speech in ...
