TRUNet: Transformer-Recurrent-U Network for Multi-channel Reverberant Sound Source Separation

10/08/2021
by Ali Aroudi, et al.

In recent years, many deep learning techniques for single-channel sound source separation have been proposed using recurrent, convolutional, and transformer networks. When multiple microphones are available, multi-channel filters can exploit the spatial diversity between speakers and background noise in addition to spectro-temporal diversity. Aiming at end-to-end multi-channel source separation, in this paper we propose a transformer-recurrent-U network (TRUNet), which directly estimates multi-channel filters from multi-channel input spectra. TRUNet consists of a spatial processing network with an attention mechanism across microphone channels, which captures spatial diversity, and a spectro-temporal processing network, which captures spectral and temporal diversity. In addition to multi-channel filters, we also consider estimating single-channel filters from multi-channel input spectra using TRUNet. We train the network on a large reverberant dataset using a combined compressed mean-squared error loss function, which further improves separation performance. We evaluate the network on a realistic and challenging reverberant dataset generated from room impulse responses measured with an actual microphone array. The experimental results on realistic reverberant sound source separation show that the proposed TRUNet outperforms state-of-the-art single-channel and multi-channel source separation methods.
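The spatial processing network's core idea, attention across microphone channels, can be illustrated with a minimal sketch. This is not the authors' implementation: the weight matrices `Wq`, `Wk`, `Wv`, the tensor layout `(channels, frames, bins)`, and the single-head scaled dot-product form are illustrative assumptions.

```python
import numpy as np

def channel_attention(spectra, Wq, Wk, Wv):
    """Scaled dot-product attention across microphone channels.

    spectra: (M, T, F) magnitude spectra for M microphones,
             T time frames, F frequency bins.
    Wq, Wk, Wv: (F, d) projection matrices (illustrative, single head).
    Returns an (M, T, d) tensor where each channel's features are a
    spatially weighted combination of all channels, per time frame.
    """
    x = spectra.transpose(1, 0, 2)                # (T, M, F): attend over channels
    q, k, v = x @ Wq, x @ Wk, x @ Wv              # (T, M, d)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(Wq.shape[1])  # (T, M, M)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)            # softmax over channels
    return (w @ v).transpose(1, 0, 2)             # back to (M, T, d)

# Toy usage: 4 microphones, 10 frames, 16 frequency bins.
rng = np.random.default_rng(0)
M, T, F, d = 4, 10, 16, 16
spectra = rng.standard_normal((M, T, F))
Wq, Wk, Wv = (0.1 * rng.standard_normal((F, d)) for _ in range(3))
out = channel_attention(spectra, Wq, Wk, Wv)
print(out.shape)  # (4, 10, 16)
```

Attending over the channel axis (rather than time) lets each time-frequency region weight microphones by their spatial relevance, which is the diversity the abstract says this sub-network targets.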
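The "combined compressed mean-squared error" loss can likewise be sketched. The exact form used in the paper is not given here; the sketch below assumes a common variant, a power-law compressed magnitude term blended with a compressed complex-spectrum term, and the exponent `c` and mixing weight `alpha` are assumed values, not the paper's.

```python
import numpy as np

def compressed_mse(est, ref, c=0.3, alpha=0.5):
    """Combined compressed MSE on complex STFT spectra (illustrative sketch).

    est, ref: complex arrays of the same shape (estimated / reference STFTs).
    c:        power-law compression exponent applied to magnitudes (assumed).
    alpha:    weight blending the magnitude and complex terms (assumed).
    """
    # Compress magnitudes while keeping the original phase.
    est_c = np.abs(est) ** c * np.exp(1j * np.angle(est))
    ref_c = np.abs(ref) ** c * np.exp(1j * np.angle(ref))
    mag_term = np.mean((np.abs(est_c) - np.abs(ref_c)) ** 2)   # magnitude-only MSE
    cplx_term = np.mean(np.abs(est_c - ref_c) ** 2)            # complex-domain MSE
    return alpha * mag_term + (1 - alpha) * cplx_term

# Toy usage on random complex spectra.
rng = np.random.default_rng(1)
ref = rng.standard_normal((2, 100, 257)) + 1j * rng.standard_normal((2, 100, 257))
est = ref + 0.1 * (rng.standard_normal(ref.shape) + 1j * rng.standard_normal(ref.shape))
loss = compressed_mse(est, ref)
```

Compressing magnitudes before the MSE reduces the dominance of high-energy bins, which is the usual motivation for such losses in speech separation and enhancement.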


