Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation

07/28/2020
by Jingjing Chen, et al.

The dominant speech separation models are based on complex recurrent or convolutional neural networks that model speech sequences only indirectly conditioned on context, for example by passing information through many intermediate states in a recurrent neural network, which leads to suboptimal separation performance. In this paper, we propose a dual-path transformer network (DPTNet) for end-to-end speech separation, which introduces direct context-awareness into the modeling of speech sequences. By introducing an improved transformer, elements in speech sequences can interact directly, which enables DPTNet to model speech sequences with direct context-awareness. The improved transformer in our approach learns the order information of the speech sequences without positional encodings by incorporating a recurrent neural network into the original transformer. In addition, the dual-path structure makes our model efficient for modeling extremely long speech sequences. Extensive experiments on benchmark datasets show that our approach outperforms the current state of the art (20.6 dB SDR on the public WSJ0-2mix corpus).
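To illustrate why a dual-path structure helps with very long sequences, the following is a minimal sketch and not the authors' implementation. It assumes the sequence is split into chunks with 50% overlap and zero-padding at the tail, that one path lets elements interact within each chunk, and that the other path lets elements at the same position interact across chunks. The function names (`segment`, `inter_chunk_view`, `interaction_counts`) and the chunk size are hypothetical choices made for this example; the abstract does not specify them.

```python
def segment(frames, chunk_size):
    """Split frames into chunks of chunk_size with 50% overlap (assumed),
    zero-padding the tail so that the last chunk is full."""
    if chunk_size < 2 or chunk_size % 2:
        raise ValueError("chunk_size must be an even integer >= 2")
    hop = chunk_size // 2
    # Ceiling division to find how many hops are needed after the first chunk.
    n_chunks = max(1, -(-(len(frames) - chunk_size) // hop) + 1)
    padded_len = (n_chunks - 1) * hop + chunk_size
    padded = list(frames) + [0] * (padded_len - len(frames))
    return [padded[i * hop : i * hop + chunk_size] for i in range(n_chunks)]


def inter_chunk_view(chunks):
    """Regroup so that each row holds the elements sharing one within-chunk
    position across all chunks (the second path)."""
    return [list(column) for column in zip(*chunks)]


def interaction_counts(n_frames, chunk_size):
    """Count pairwise interactions for full-sequence attention versus the
    two paths combined (a rough count, ignoring constant factors)."""
    n_chunks = len(segment(range(n_frames), chunk_size))
    full = n_frames ** 2                 # every frame attends to every frame
    intra = n_chunks * chunk_size ** 2   # attention inside each chunk
    inter = chunk_size * n_chunks ** 2   # attention across chunks, per position
    return full, intra + inter


if __name__ == "__main__":
    chunks = segment(list(range(1, 11)), chunk_size=4)
    print(chunks)                    # intra-chunk path: one row per chunk
    print(inter_chunk_view(chunks))  # inter-chunk path: one row per position
    print(interaction_counts(10_000, 100))
```

Under these assumptions, each path only ever sees a short sequence (one chunk, or one position across chunks), so the number of pairwise interactions grows far more slowly than the square of the full sequence length.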

research
01/23/2020

La Furca: Iterative Context-Aware End-to-End Monaural Speech Separation Based on Dual-Path Deep Parallel Inter-Intra Bi-LSTM with Attention

Deep neural network with dual-path bi-directional long short-term memory...
research
10/14/2019

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

Recent studies in deep learning-based speech separation have proven the ...
research
12/04/2021

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Recent advances in the design of neural network architectures, in partic...
research
06/28/2022

Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation

Time-domain Transformer neural networks have proven their superiority in...
research
08/30/2023

Dual-path Transformer Based Neural Beamformer for Target Speech Extraction

Neural beamformers, which integrate both pre-separation and beamforming ...
research
02/23/2021

Dual-Path Modeling for Long Recording Speech Separation in Meetings

The continuous speech separation (CSS) is a task to separate the speech ...
research
03/25/2022

Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation

Speaker-independent speech separation has achieved remarkable performanc...
