A Study of Transfer Learning in Music Source Separation

by Andreas Bugler, et al.

Supervised deep learning methods for audio source separation can be very effective in domains where a large amount of training data is available. While some musical domains, such as rock and pop, have enough data to train a separation system, many others do not, including classical music, choral music, and non-Western music traditions. It is well known that transfer learning from related domains can boost the performance of deep learning systems, but it is not always clear how best to do pretraining. In this work we investigate the effectiveness of data augmentation during pretraining, the effect on performance when the pretraining and downstream datasets share similar content domains, and how much of a pretrained model must be retrained on the final target task.


Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity

Music source separation performance has greatly improved in recent years...

Towards robust music source separation on loud commercial music

Nowadays, commercial music has extreme loudness and heavily compressed d...

PodcastMix: A dataset for separating music and speech in podcasts

We introduce PodcastMix, a dataset formalizing the task of separating ba...

A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation

This paper describes a hands-on comparison on using state-of-the-art mus...

Singer separation for karaoke content generation

Due to the rapid development of deep learning, we can now successfully s...

Spectrogram Feature Losses for Music Source Separation

In this paper we study deep learning-based music source separation, and ...