Convolutional vs. Recurrent Neural Networks for Audio Source Separation

03/23/2018
by Shariq Mobin, et al.
Recent work has shown that recurrent neural networks can be trained to separate individual speakers in a sound mixture with high fidelity. Here we explore convolutional neural network models as an alternative and show that they achieve state-of-the-art results with an order of magnitude fewer parameters. We also characterize and compare how robustly these approaches generalize under three test conditions: longer time sequences, the addition of intermittent noise, and datasets not seen during training. For the last condition, we create a new dataset, RealTalkLibri, to test source separation in real-world environments. We show that the acoustics of the environment have a significant impact on the structure of the waveform and on the overall performance of neural network models, with the convolutional model showing a superior ability to generalize to new environments. The code for our study is available at https://github.com/ShariqM/source_separation.
