
Individualized Conditioning and Negative Distances for Speaker Separation

by Tao Sun, et al.

Speaker separation aims to extract multiple voices from a mixed signal. In this paper, we propose two speaker-aware designs to improve existing speaker separation solutions. The first design is a speaker conditioning network that integrates speech samples to generate individualized speaker conditions, which then provide informed guidance for a separation module to produce well-separated outputs. The second design aims to reduce non-target voices in the separated speech. To this end, we propose negative distances to penalize the appearance of any non-target voice in the channel outputs, and positive distances to drive the separated voices closer to the clean targets. We explore two different setups, weighted-sum and triplet-like, that combine these two distances into an auxiliary loss for the separation networks. Experiments conducted on LibriMix demonstrate the effectiveness of our proposed models.
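
The abstract does not give the exact form of the distances or of the two combination setups, so the following is only an illustrative sketch of how a positive distance and a negative distance could be combined. Everything specific here is an assumption and not the authors' formulation: the mean squared distance on raw signals, the function names `mean_sq_distance` and `auxiliary_losses`, the weight `alpha`, and the `margin` are all hypothetical.

```python
import numpy as np


def mean_sq_distance(a, b):
    """Mean squared difference between two equal-length signals (assumed metric)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))


def auxiliary_losses(estimate, target, non_targets, alpha=0.1, margin=1.0):
    """Combine a positive and a negative distance in two hypothetical ways.

    estimate    : separated output of one channel
    target      : clean signal of the speaker assigned to that channel
    non_targets : clean signals of the other speakers in the mixture
    alpha       : assumed weight on the negative distance (weighted-sum setup)
    margin      : assumed margin (triplet-like setup)
    """
    # Positive distance: should be small, pulls the output toward its target.
    pos = mean_sq_distance(estimate, target)
    # Negative distance: should be large, pushes the output away from other voices.
    neg = float(np.mean([mean_sq_distance(estimate, n) for n in non_targets]))

    # Weighted-sum setup (assumed form): reward a large negative distance.
    weighted_sum = pos - alpha * neg
    # Triplet-like setup (assumed form): hinge on the gap between the distances.
    triplet_like = max(0.0, pos - neg + margin)
    return pos, neg, weighted_sum, triplet_like


if __name__ == "__main__":
    target = np.zeros(4)
    other = np.full(4, 2.0)
    # A perfect estimate: zero positive distance, large negative distance.
    print(auxiliary_losses(target, target, [other]))
    # An estimate that leaked the wrong speaker: both losses grow.
    print(auxiliary_losses(other, target, [other]))
```

In a real system such a term would be added to the main separation loss, and the distances would more plausibly be computed on speaker embeddings or with a signal-level metric chosen by the authors; the sketch only shows the two combination patterns.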




Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism

In this paper, we present a novel multi-channel speech extraction system...

Localization Based Sequential Grouping for Continuous Speech Separation

This study investigates robust speaker localization for continuous spee...

Simultaneous Speech Extraction for Multiple Target Speakers under the Meeting Scenarios(V1)

Recently, the target speech separation or extraction techniques under th...

Multimodal Target Speech Separation with Voice and Face References

Target speech separation refers to isolating target speech from a multi-...

Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss

Deep neural network with dual-path bi-directional long short-term memory...

The sound of my voice: speaker representation loss for target voice separation

Research on content and style representations has been widely studied in...

Exploring Aligned Lyrics-Informed Singing Voice Separation

In this paper, we propose a method of utilizing aligned lyrics as additi...