Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction

10/31/2017
by Daniel Stoller, et al.

The state of the art in music source separation employs neural networks trained in a supervised fashion on multi-track databases to estimate the sources from a given mixture. With only a few datasets available, extensive data augmentation is often used to combat overfitting. Mixing random tracks, however, can even reduce separation performance, as instruments in real music are strongly correlated. The key concept in our approach is that source estimates of an optimal separator should be indistinguishable from real source signals. Based on this idea, we drive the separator towards outputs deemed realistic by discriminator networks that are trained to tell real source samples apart from separator outputs. This way, we can also use unpaired source and mixture recordings without the drawbacks of creating unrealistic music mixtures. Our framework is widely applicable, as it does not assume a specific network architecture or number of sources. To our knowledge, this is the first adoption of adversarial training for music source separation. In a prototype experiment for singing voice separation, separation performance increases with our approach compared to purely supervised training.
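
The idea of combining a supervised loss on paired data with a discriminator-based term on unpaired mixtures can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy energy-based discriminator, the cross-entropy form of the adversarial term, the weight `alpha`, and all function names are hypothetical.

```python
import math


def mse(estimate, target):
    """Mean squared error between two equal-length signals."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_discriminator(signal, weight=1.0, bias=0.0):
    """Hypothetical discriminator: probability that `signal` is a real source.

    A real discriminator would be a trained network; this stand-in is a
    logistic function of the mean signal energy.
    """
    energy = sum(s * s for s in signal) / len(signal)
    return sigmoid(weight * energy + bias)


def discriminator_loss(real_source, estimated_source):
    """Cross-entropy loss: real sources are labelled 1, separator outputs 0."""
    p_real = toy_discriminator(real_source)
    p_fake = toy_discriminator(estimated_source)
    return -(math.log(p_real) + math.log(1.0 - p_fake))


def separator_loss(paired, unpaired_estimates, alpha=0.1):
    """Supervised MSE on paired data plus an adversarial term on unpaired data.

    paired: list of (estimate, target) tuples from a multi-track dataset.
    unpaired_estimates: separator outputs for mixtures without ground truth.
    alpha: assumed weight balancing the two terms.
    """
    supervised = sum(mse(e, t) for e, t in paired) / len(paired)
    # The separator is rewarded when the discriminator rates its output as real.
    adversarial = sum(
        -math.log(toy_discriminator(e)) for e in unpaired_estimates
    ) / len(unpaired_estimates)
    return supervised + alpha * adversarial


paired = [([0.5, -0.5], [0.4, -0.6])]
unpaired = [[0.1, 0.2]]
print(separator_loss(paired, unpaired, alpha=0.1))
```

In this sketch the unpaired mixtures contribute only through the discriminator term, which is how unpaired recordings could be used without constructing artificial mixtures.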


research
04/19/2022

Music Source Separation with Generative Flow

Full supervision models for source separation are trained on mixture-sou...
research
08/06/2020

Mixing-Specific Data Augmentation Techniques for Improved Blind Violin/Piano Source Separation

Blind music source separation has been a popular and active subject of r...
research
03/26/2021

Modeling the Compatibility of Stem Tracks to Generate Music Mashups

A music mashup combines audio elements from two or more songs to create ...
research
12/14/2018

Semi-Supervised Monaural Singing Voice Separation With a Masking Network Trained on Synthetic Mixtures

We study the problem of semi-supervised singing voice separation, in whi...
research
07/28/2021

Neural Remixer: Learning to Remix Music with Interactive Control

The task of manipulating the level and/or effects of individual instrume...
research
10/31/2017

SVSGAN: Singing Voice Separation via Generative Adversarial Network

Separating two sources from an audio mixture is an important task with m...
research
07/09/2022

Learning to Separate Voices by Spatial Regions

We consider the problem of audio voice separation for binaural applicati...
