
End-to-end music source separation: is it possible in the waveform domain?

by Francesc Lluís, et al.

Most currently successful source separation techniques use the magnitude spectrogram as input and therefore, by design, discard part of the signal: the phase. To avoid omitting potentially useful information, we study the viability of end-to-end models for music source separation. By operating directly on the waveform, these models take into account all the information available in the raw audio signal, including the phase. Our results show that waveform-based models can outperform a recent spectrogram-based deep learning model: both a novel Wavenet-based model we propose and Wave-U-Net outperform DeepConvSep, a spectrogram-based deep learning model. This suggests that end-to-end learning has great potential for music source separation.
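The abstract's central point is that the magnitude spectrogram discards phase. A minimal, self-contained sketch of that fact (using a naive DFT written for illustration, not the paper's models): a cosine and a sine at the same frequency are different waveforms, yet their magnitude spectra are identical, so any magnitude-only representation cannot tell them apart.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, for illustration only."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

n = 64
# Two waveforms at the same frequency, differing only by a phase shift:
cos_wave = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
sin_wave = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]

mag_cos = [abs(c) for c in dft(cos_wave)]
mag_sin = [abs(c) for c in dft(sin_wave)]

# Magnitude spectra are identical; only the (discarded) phase differs.
assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(mag_cos, mag_sin))
```

A waveform-domain model sees the raw samples (where these two signals differ), whereas a magnitude-spectrogram model sees only the identical magnitude bins.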




Hybrid Y-Net Architecture for Singing Voice Separation

This research paper presents a novel deep learning-based neural network ...

Music Source Separation in the Waveform Domain

Source separation for music is the task of isolating contributions, or s...

End-to-end Networks for Supervised Single-channel Speech Separation

The performance of single channel source separation algorithms has impro...

End-to-End Sound Source Separation Conditioned On Instrument Labels

Can we perform an end-to-end sound source separation (SSS) with a variab...

Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation

Models for audio source separation usually operate on the magnitude spec...

Spectrogram Feature Losses for Music Source Separation

In this paper we study deep learning-based music source separation, and ...

Does Phase Matter For Monaural Source Separation?

The "cocktail party" problem of fully separating multiple sources from a...