Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer

10/23/2020
by Sanyuan Chen, et al.

With the strong modeling capacity of its multi-head, multi-layer structure, the Transformer is a powerful model for learning sequential representations and has recently been applied to speech separation with success. However, multi-channel speech separation does not always need such a heavy structure for every time frame, especially when cross-talk occurs only occasionally. In conversational scenarios, for example, most regions contain only a single active speaker, and the separation task reduces to single-speaker enhancement. Applying a very deep network to signals with a low overlap ratio not only slows inference but also hurts separation performance. To address this, we propose an early exit mechanism that lets the Transformer handle different cases with adaptive depth. Experimental results show that the early exit mechanism not only accelerates inference but also improves accuracy.
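
To make the adaptive-depth idea concrete, below is a minimal PyTorch-style sketch of an early-exit Transformer separator. The class name, layer sizes, and the exit rule (cosine similarity between consecutive layer outputs crossing a threshold) are illustrative assumptions rather than the paper's exact design; the sketch only shows how inference can stop at a shallow layer for easy, low-overlap frames and go deeper for overlapped ones.

```python
import torch
import torch.nn as nn


class EarlyExitSeparator(nn.Module):
    """Hypothetical sketch of a Transformer separator with adaptive depth.

    Assumption: exit when consecutive layer outputs stop changing much;
    this stands in for whatever exit criterion the paper actually uses.
    """

    def __init__(self, d_model=256, n_heads=4, n_layers=16, n_spk=2, threshold=0.99):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model, n_heads,
                                       dim_feedforward=1024, batch_first=True)
            for _ in range(n_layers)
        ])
        # one mask estimator per layer, so every exit depth can emit separation masks
        self.mask_heads = nn.ModuleList([
            nn.Linear(d_model, d_model * n_spk) for _ in range(n_layers)
        ])
        self.threshold = threshold
        self.n_spk = n_spk

    def forward(self, x):
        # x: (batch, frames, d_model) encoded feature sequence
        prev = x
        for layer, head in zip(self.layers, self.mask_heads):
            out = layer(prev)
            # stop when the representation has effectively stopped changing,
            # e.g. on low-overlap segments where extra depth adds little
            change = torch.cosine_similarity(out.flatten(1), prev.flatten(1), dim=-1)
            prev = out
            if change.min() > self.threshold:
                break
        # `head` is the mask estimator of the layer we exited at
        masks = torch.sigmoid(head(prev))        # (batch, frames, d_model * n_spk)
        return masks.chunk(self.n_spk, dim=-1)   # one mask per speaker


# Usage: a single model serves both easy (single-speaker) and hard (overlapped)
# segments, exiting earlier on the easy ones.
feats = torch.randn(1, 200, 256)                 # 200 frames of 256-dim features
mask_a, mask_b = EarlyExitSeparator()(feats)
```

In training, such a model would typically attach a separation loss to every exit head so that each depth produces usable masks; at inference time the depth then adapts per segment.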


Related research

04/27/2022 - Ultra Fast Speech Separation Model with Teacher Student Learning
Transformer has been successfully applied to speech separation recently ...

02/21/2023 - DasFormer: Deep Alternating Spectrogram Transformer for Multi/Single-Channel Speech Separation
For the task of speech separation, previous study usually treats multi-c...

07/29/2023 - Monaural Multi-Speaker Speech Separation Using Efficient Transformer Model
Cocktail party problem is the scenario where it is difficult to separate...

03/07/2023 - Multi-Dimensional and Multi-Scale Modeling for Speech Separation Optimized by Discriminative Learning
Transformer has shown advanced performance in speech separation, benefit...

05/23/2020 - Efficient Integration of Multi-channel Information for Speaker-independent Speech Separation
Although deep-learning-based methods have markedly improved the performa...

10/28/2021 - Continuous Speech Separation with Recurrent Selective Attention Network
While permutation invariant training (PIT) based continuous speech separ...
