Music Source Separation with Band-Split RoPE Transformer

09/05/2023
by Wei-Tsung Lu, et al.

Music source separation (MSS) aims to separate a music recording into multiple musically distinct stems, such as vocals, bass, and drums. Deep learning approaches such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have recently been applied to this task, but the improvement remains limited. In this paper, we propose a novel frequency-domain approach based on a Band-Split RoPE Transformer (called BS-RoFormer). BS-RoFormer uses a band-split module to project the input complex spectrogram into subband-level representations, and then arranges a stack of hierarchical Transformers to model the inner-band as well as inter-band sequences for multi-band mask estimation. To facilitate training the model for MSS, we propose to use the Rotary Position Embedding (RoPE). The BS-RoFormer system trained on MUSDB18HQ and 500 extra songs ranked first in the MSS track of the Sound Demixing Challenge (SDX23). Benchmarking a smaller version of BS-RoFormer on MUSDB18HQ, we achieve state-of-the-art results without extra training data, with an average SDR of 9.80 dB.
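
As a rough illustration of the Rotary Position Embedding mentioned above, the following NumPy sketch rotates each pair of feature channels by an angle proportional to the sequence position. It follows the commonly used RoPE formulation and is not taken from the BS-RoFormer implementation: the function name `rope`, the interleaved channel pairing, and the frequency base of 10000 are all assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np


def rope(x, base=10000.0):
    """Apply a rotary position embedding to x of shape (seq_len, dim).

    Hypothetical helper for illustration only; `base` and the
    interleaved (even, odd) channel pairing are assumptions.
    """
    x = np.asarray(x, dtype=np.float64)
    seq_len, dim = x.shape
    if dim % 2 != 0:
        raise ValueError("dim must be even so channels can be paired")

    # One rotation frequency per channel pair, decreasing geometrically.
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)            # (dim/2,)
    # Rotation angle for every (position, channel pair).
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]    # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)

    # Rotate each (even, odd) channel pair by its position-dependent angle.
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out


# Usage: rotate queries and keys before computing attention scores.
rng = np.random.default_rng(0)
queries = rng.standard_normal((16, 8))
keys = rng.standard_normal((16, 8))
scores = rope(queries) @ rope(keys).T   # (16, 16) attention logits
```

Because each channel pair is only rotated, vector norms are unchanged, and the dot product between a rotated query at position m and a rotated key at position n depends on the positions only through the offset m - n. In a band-split model, the same function could in principle be applied along either the time axis or the subband axis, although how BS-RoFormer does this is not stated in the abstract.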

research
11/15/2022

Hybrid Transformers for Music Source Separation

A natural question arising in Music Source Separation (MSS) is whether l...
research
09/30/2022

Music Source Separation with Band-split RNN

The performance of music source separation (MSS) models has been greatly...
research
07/06/2020

Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation

Recent approaches for music source separation are almost exclusively bas...
research
10/19/2020

Fast accuracy estimation of deep learning based multi-class musical source separation

Music source separation represents the task of extracting all the instru...
research
06/15/2023

Sound Demixing Challenge 2023 Music Demixing Track Technical Report: TFC-TDF-UNet v3

In this report, we present our award-winning solutions for the Music Dem...
research
03/23/2018

Convolutional vs. Recurrent Neural Networks for Audio Source Separation

Recent work has shown that recurrent neural networks can be trained to s...
research
08/14/2023

The Sound Demixing Challenge 2023 – Music Demixing Track

This paper summarizes the music demixing (MDX) track of the Sound Demixi...
