Ultra Fast Speech Separation Model with Teacher Student Learning

04/27/2022
by Sanyuan Chen et al.

Transformer has recently been applied successfully to speech separation, thanks to the strong long-range dependency modeling capacity of its self-attention mechanism. However, Transformer models tend to incur heavy run-time costs due to their deep stacks of encoder layers, which hinders deployment on edge devices. A small Transformer model with fewer encoder layers is preferable for computational efficiency, but it is prone to performance degradation. In this paper, an ultra-fast speech separation Transformer model is proposed to achieve both better performance and efficiency through teacher-student learning (T-S learning). We introduce layer-wise T-S learning and objective-shifting mechanisms to guide the small student model to learn intermediate representations from the large teacher model. Compared with a small Transformer model trained from scratch, the proposed T-S learning method reduces the word error rate (WER) by more than 5% for both multi-channel and single-channel speech separation on the LibriCSS dataset. Utilizing more unlabeled speech data, our ultra-fast speech separation models achieve more than 10% …
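The two mechanisms named in the abstract can be sketched in a few lines: a layer-wise distillation loss that matches each student layer's hidden states to a corresponding teacher layer, and an objective-shifting weight that moves training emphasis from imitating the teacher toward the separation objective itself. This is a minimal illustrative sketch, not the paper's implementation; the uniform student-to-teacher layer mapping and the linear shifting schedule are assumptions.

```python
import numpy as np

def layerwise_distill_loss(student_hidden, teacher_hidden):
    """Mean-squared error between each student layer's hidden states and
    the hidden states of a corresponding teacher layer.

    student_hidden: list of S arrays, each of shape (T, D)
    teacher_hidden: list of L arrays (L >= S), each of shape (T, D)

    Student layer i is matched to teacher layer (i + 1) * L // S - 1,
    i.e. a uniform mapping across the deeper teacher stack (an assumed
    mapping, for illustration only).
    """
    S, L = len(student_hidden), len(teacher_hidden)
    losses = []
    for i, h_s in enumerate(student_hidden):
        h_t = teacher_hidden[(i + 1) * L // S - 1]
        losses.append(np.mean((h_s - h_t) ** 2))
    return float(np.mean(losses))

def objective_shifting_weight(step, total_steps):
    """Objective shifting: start with pure distillation (weight 1.0) and
    linearly hand over to the task objective (weight 0.0). The linear
    schedule here is a hypothetical choice."""
    return max(0.0, 1.0 - step / total_steps)

def combined_loss(step, total_steps, distill_loss, separation_loss):
    """Interpolate between the distillation and separation objectives
    according to the current shifting weight."""
    lam = objective_shifting_weight(step, total_steps)
    return lam * distill_loss + (1.0 - lam) * separation_loss
```

Early in training the student is driven mainly by the teacher's intermediate representations; by the end, the separation loss dominates, so the small model is not locked into merely imitating the teacher.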


