DeepAI AI Chat
Log In Sign Up

Speaker Separation Using Speaker Inventories and Estimated Speech

by   Peidong Wang, et al.

We propose speaker separation using speaker inventories and estimated speech (SSUSIES), a framework leveraging speaker profiles and estimated speech for speaker separation. SSUSIES contains two methods, speaker separation using speaker inventories (SSUSI) and speaker separation using estimated speech (SSUES). SSUSI performs speaker separation with the help of speaker inventory. By combining the advantages of permutation invariant training (PIT) and speech extraction, SSUSI significantly outperforms conventional approaches. SSUES is a widely applicable technique that can substantially improve speaker separation performance using the output of first-pass separation. We evaluate the models on both speaker separation and speech recognition metrics.


page 1

page 2

page 3

page 4


Guided Training: A Simple Method for Single-channel Speaker Separation

Deep learning has shown a great potential for speech separation, especia...

Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network

Binaural speech separation in real-world scenarios often involves moving...

Multi-resolution location-based training for multi-channel continuous speech separation

The performance of automatic speech recognition (ASR) systems severely d...

Localization Based Sequential Grouping for Continuous Speech Separation

This study investigates robust speaker localization for con-tinuous spee...

Separation Guided Speaker Diarization in Realistic Mismatched Conditions

We propose a separation guided speaker diarization (SGSD) approach by fu...

What can predictive speech coders learn from speaker recognizers?

This paper compares the speech coder and speaker recognizer applications...

Improved Speaker-Dependent Separation for CHiME-5 Challenge

This paper summarizes several follow-up contributions for improving our ...