Music Source Separation with Band-split RNN

09/30/2022
by Yi Luo, et al.

The performance of music source separation (MSS) models has improved greatly in recent years thanks to the development of novel neural network architectures and training pipelines. However, recent model designs for MSS were mainly motivated by other audio processing tasks or other research fields, while the intrinsic characteristics and patterns of music signals were not fully explored. In this paper, we propose band-split RNN (BSRNN), a frequency-domain model that explicitly splits the spectrogram of the mixture into subbands and performs interleaved band-level and sequence-level modeling. The bandwidths of the subbands can be chosen using a priori or expert knowledge of the characteristics of the target source in order to optimize performance on a certain type of target musical instrument. To make better use of unlabeled data, we also describe a semi-supervised model finetuning pipeline that can further improve the performance of the model. Experimental results show that BSRNN trained only on the MUSDB18-HQ dataset significantly outperforms several top-ranking models in the Music Demixing (MDX) Challenge 2021, and the semi-supervised finetuning stage further improves the performance on all four instrument tracks.
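
To make the band-split idea concrete, here is a minimal NumPy sketch of only the first step described in the abstract: cutting a mixture spectrogram into consecutive subbands along the frequency axis. The function name `split_into_subbands`, the spectrogram size, and the band widths are illustrative assumptions, not the configuration used in the paper; the band-level and sequence-level RNN modeling that follows the split is not shown.

```python
import numpy as np


def split_into_subbands(spec, band_widths):
    """Split a (freq_bins, frames) spectrogram into consecutive subbands.

    `band_widths` is a hypothetical list of bin counts, one per subband.
    The widths must sum to the number of frequency bins so that every
    bin belongs to exactly one subband.
    """
    if sum(band_widths) != spec.shape[0]:
        raise ValueError("band widths must cover every frequency bin exactly once")
    # Cumulative sums give the split points along the frequency axis.
    edges = np.cumsum(band_widths)[:-1]
    return np.split(spec, edges, axis=0)


# Toy complex spectrogram: 1025 frequency bins, 50 frames (assumed sizes).
rng = np.random.default_rng(0)
spec = rng.standard_normal((1025, 50)) + 1j * rng.standard_normal((1025, 50))

# Assumed widths: narrow bands at low frequencies, wider bands higher up.
band_widths = [16] * 16 + [64] * 8 + [257]
bands = split_into_subbands(spec, band_widths)
```

Choosing narrower bands where the target instrument carries most of its energy is one way the prior knowledge mentioned above could enter such a split; each subband would then be handled as its own feature stream by the later modeling stages.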

research
07/14/2021

Multi-Task Audio Source Separation

The audio source separation tasks, such as speech enhancement, speech se...
research
09/18/2019

Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity

Music source separation performance has greatly improved in recent years...
research
09/05/2023

Music Source Separation with Band-Split RoPE Transformer

Music source separation (MSS) aims to separate a music recording into mu...
research
03/26/2021

Modeling the Compatibility of Stem Tracks to Generate Music Mashups

A music mashup combines audio elements from two or more songs to create ...
research
09/15/2023

Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)

Music source separation (MSS) aims to extract 'vocals', 'drums', 'bass' ...
research
10/13/2021

Music Source Separation with Deep Equilibrium Models

While deep neural network-based music source separation (MSS) is very ef...
research
12/14/2018

Semi-Supervised Monaural Singing Voice Separation With a Masking Network Trained on Synthetic Mixtures

We study the problem of semi-supervised singing voice separation, in whi...
