
Demucs: Deep Extractor for Music Sources with extra unlabeled data remixed

by Alexandre Défossez et al.

We study the problem of source separation for music using deep learning with four known sources: drums, bass, vocals, and other accompaniment. State-of-the-art approaches predict soft masks over mixture spectrograms, while methods working directly on the waveform lag behind, as measured on the standard MusDB benchmark. Our contribution is twofold. (i) We introduce a simple convolutional and recurrent model that outperforms the state-of-the-art waveform model, Wave-U-Net, by 1.6 points of signal-to-distortion ratio (SDR). (ii) We propose a new scheme to leverage unlabeled music. We train a first model to extract segments of unlabeled tracks in which at least one source is silent, for instance segments without bass. We remix such a segment with a bass line taken from the supervised dataset to form a new weakly supervised training example. Combining our architecture and this scheme, we show that waveform methods can play in the same ballpark as spectrogram ones.
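The SDR metric mentioned above can be illustrated with a minimal sketch. Note that the official MusDB evaluation (BSS Eval) additionally allows a short distortion filter before computing the ratio; the plain energy ratio below is a common simplification, not the exact benchmark metric.

```python
import math

def sdr(reference, estimate):
    """Signal-to-distortion ratio in dB (plain definition, no allowed filter).

    A 1.6-point gain means the ratio of target energy to residual error
    energy improved by 1.6 dB on average across sources and tracks.
    """
    signal_energy = sum(r ** 2 for r in reference)
    noise_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_energy / noise_energy)

# Example: an estimate at half the reference amplitude leaves an error
# with a quarter of the signal energy, i.e. 10*log10(4) ~ 6.02 dB.
print(sdr([1.0, -1.0, 1.0, -1.0], [0.5, -0.5, 0.5, -0.5]))
```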
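The unlabeled-data scheme can be sketched as follows. This is a toy illustration, not the paper's implementation: the silence test here is a hypothetical RMS threshold applied to a first model's bass estimate, and the function names are our own. The key idea it shows is that once a segment is known to contain no bass, adding a labeled bass stem yields a mixture whose bass target is known exactly.

```python
import numpy as np

SILENCE_DB = -40.0  # hypothetical threshold: below this, treat the source as silent

def rms_db(x: np.ndarray) -> float:
    """Mean power in dB, with a small floor to avoid log(0)."""
    return 10.0 * np.log10(np.mean(x ** 2) + 1e-12)

def make_weak_example(unlabeled_mix: np.ndarray,
                      estimated_bass: np.ndarray,
                      labeled_bass_stem: np.ndarray):
    """If the first model's bass estimate is (near-)silent, remix the
    segment with a known bass stem. The new mixture then has an exact
    bass target (the added stem) even though the track was unlabeled."""
    if rms_db(estimated_bass) > SILENCE_DB:
        return None  # bass already present: segment unusable for this scheme
    new_mix = unlabeled_mix + labeled_bass_stem
    return new_mix, labeled_bass_stem  # (training input, bass target)
```

In training, such pairs supervise only the bass output of the separator, which is why the resulting examples are weakly rather than fully labeled.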




Related articles:
- Music Source Separation in the Waveform Domain
- Danna-Sep: Unite to separate them all
- Hybrid Spectrogram and Waveform Source Separation
- CatNet: music source separation system with mix-audio augmentation
- Modeling the Compatibility of Stem Tracks to Generate Music Mashups
- Music Separation Enhancement with Generative Modeling
- Multi-task U-Net for Music Source Separation