Spectrogram Feature Losses for Music Source Separation

01/15/2019
by Abhimanyu Sahai, et al.

In this paper we study deep learning-based music source separation and explore an alternative to the standard spectrogram pixel-level L2 loss for model training. Our main contribution is to demonstrate that adding a high-level feature loss term, computed on features extracted from the spectrograms by a VGG network, can improve separation quality compared with a pure pixel-level loss. We show this improvement for MMDenseNet, a state-of-the-art deep learning model for this task, on the extraction of drums and vocals from songs in the musdb18 database, which covers a broad range of Western music genres. We believe that this finding can be generalized and applied to broader machine learning-based systems in the audio domain.
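
The combined loss described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the weighting factor `feature_weight`, and the tiny random convolution used as a feature extractor are all assumptions made for illustration. In particular, `toy_features` is only a stand-in for the VGG feature maps mentioned in the abstract, and the abstract does not say how the two terms are weighted.

```python
import numpy as np


def pixel_l2_loss(pred, target):
    """Mean squared error between two magnitude spectrograms."""
    return float(np.mean((pred - target) ** 2))


def toy_features(spec, kernels):
    """Hypothetical stand-in for a VGG feature map.

    Applies a bank of 2-D filters (valid cross-correlation) followed by ReLU.
    spec:    (freq_bins, frames) spectrogram
    kernels: (n_filters, kh, kw) filter bank
    """
    kh, kw = kernels.shape[1:]
    # windows has shape (freq_bins - kh + 1, frames - kw + 1, kh, kw)
    windows = np.lib.stride_tricks.sliding_window_view(spec, (kh, kw))
    maps = np.einsum("ijkl,fkl->fij", windows, kernels)
    return np.maximum(maps, 0.0)


def combined_loss(pred, target, kernels, feature_weight=0.1):
    """Pixel-level L2 loss plus a weighted feature-level L2 loss.

    feature_weight is an assumed hyperparameter, not a value from the paper.
    """
    pixel = pixel_l2_loss(pred, target)
    diff = toy_features(pred, kernels) - toy_features(target, kernels)
    feature = float(np.mean(diff ** 2))
    return pixel + feature_weight * feature


# Synthetic data standing in for a target spectrogram and a model estimate.
rng = np.random.default_rng(0)
target = np.abs(rng.standard_normal((64, 32)))
pred = target + 0.1 * rng.standard_normal((64, 32))
kernels = rng.standard_normal((4, 3, 3))

loss = combined_loss(pred, target, kernels)
```

In a real training setup the filter bank would be replaced by activations from a pretrained VGG network, and the loss would be written in an automatic differentiation framework so that gradients flow back to the separation model.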


