Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation

09/12/2021
by Qiuqiang Kong, et al.

Deep neural network based methods have been successfully applied to music source separation (MSS). They typically learn a mapping from a mixture spectrogram to a set of source spectrograms, all with magnitudes only. This approach has several limitations: 1) its incorrect phase reconstruction degrades performance; 2) it limits the magnitude of masks to between 0 and 1, while we observe that 22% of time-frequency bins have ideal ratio mask values over 1 in a popular dataset, MUSDB18; 3) its potential on very deep architectures is under-explored. Our proposed system is designed to overcome these limitations. First, we propose to estimate phases by estimating complex ideal ratio masks (cIRMs), where we decouple the estimation of cIRMs into magnitude and phase estimations. Second, we extend the separation method to effectively allow the magnitude of the mask to be larger than 1. Finally, we propose a residual UNet architecture with up to 143 layers. Our proposed system achieves a state-of-the-art MSS result on the MUSDB18 dataset, in particular an SDR of 8.98 dB on vocals, outperforming the previous best performance of 7.24 dB. The source code is available at: https://github.com/bytedance/music_source_separation
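
The idea of splitting a complex ratio mask into a magnitude part and a phase part can be illustrated with a small NumPy sketch. This is not the authors' implementation and involves no neural network: it computes the ideal mask directly from known sources, and the function names `decoupled_cirm` and `apply_decoupled_mask`, the `EPS` guard, and the random arrays standing in for STFTs are all assumptions made for illustration. It shows that the mask magnitude can exceed 1 and that the two parts together recover the source.

```python
import numpy as np

EPS = 1e-8  # assumed guard against division by near-zero mixture bins


def decoupled_cirm(mixture, source):
    """Split the ideal complex ratio mask source / mixture into magnitude and phase."""
    safe_mixture = np.where(np.abs(mixture) < EPS, EPS, mixture)
    mask = source / safe_mixture
    return np.abs(mask), np.angle(mask)


def apply_decoupled_mask(mixture, mask_mag, mask_phase):
    """Scale the mixture magnitude and rotate the mixture phase by the mask."""
    out_mag = mask_mag * np.abs(mixture)
    out_phase = np.angle(mixture) + mask_phase
    return out_mag * np.exp(1j * out_phase)


# Random complex arrays stand in for STFTs (frequency bins x frames); not real audio.
rng = np.random.default_rng(0)
shape = (64, 32)
vocals = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
accompaniment = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
mixture = vocals + accompaniment

mask_mag, mask_phase = decoupled_cirm(mixture, vocals)
fraction_above_one = float(np.mean(mask_mag > 1.0))
reconstructed = apply_decoupled_mask(mixture, mask_mag, mask_phase)

print(f"bins with mask magnitude > 1: {fraction_above_one:.2%}")
print("exact reconstruction:", np.allclose(reconstructed, vocals))
```

A mask clipped to the range 0 to 1 cannot represent the bins counted by `fraction_above_one`, which is the limitation the abstract describes; in a learned system the two mask parts would be predicted by a model rather than computed from the ground-truth source.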
