Musical Instrument Separation on Shift-Invariant Spectrograms via Stochastic Dictionary Learning

06/01/2018
by Sören Schulze, et al.

We propose a method for the blind separation of audio signals from musical instruments. Although non-negative matrix factorization (NMF) has been studied extensively for this task, it does not exploit the pitch-invariance that instruments exhibit. Tensor factorization overcomes this limitation, and log-frequency spectrograms were first used in that context, but the approach still requires the specific tuning of the instruments to be hard-coded into the algorithm. We develop a time-frequency representation that is both shift-invariant and frequency-aligned, with a variant that can also be used for wideband signals. Our separation algorithm exploits this shift-invariance to find patterns of peaks related to specific instruments, while non-linear optimization enables it to represent arbitrary frequencies and to incorporate inharmonicity; a sparsity condition ensures that the representation remains reasonable. The relative amplitudes of the harmonics are stored in a dictionary, which is trained via a modified version of ADAM. For a realistic monaural piece with acoustic recorder and violin, we achieve qualitatively good separation with, on average, a signal-to-distortion ratio (SDR) of 12.7 dB, a signal-to-interference ratio (SIR) of 27.0 dB, and a signal-to-artifacts ratio (SAR) of 12.9 dB.
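
The pitch-invariance that log-frequency spectrograms expose can be illustrated with a small sketch. It assumes ideal harmonic tones (partials at exact integer multiples of the fundamental, so the inharmonicity mentioned above is ignored) and an arbitrary resolution of 12 bins per octave. The function names are hypothetical and this is not the representation developed in the paper; it only shows why a change of pitch becomes a pure shift on a logarithmic frequency axis.

```python
import math


def log_harmonic_positions(f0, n_harmonics=5, bins_per_octave=12):
    """Positions (in bins) of the first harmonics of f0 on a log-frequency axis.

    Assumes ideal harmonics at h * f0; real instruments may deviate from this.
    """
    return [bins_per_octave * math.log2(h * f0) for h in range(1, n_harmonics + 1)]


def harmonic_pattern(f0, n_harmonics=5, bins_per_octave=12):
    """Harmonic positions measured relative to the fundamental."""
    positions = log_harmonic_positions(f0, n_harmonics, bins_per_octave)
    return [p - positions[0] for p in positions]


# Two different pitches (A4 and roughly C5).
pattern_a4 = harmonic_pattern(440.0)
pattern_c5 = harmonic_pattern(523.25)

# On the log axis both tones have the same peak pattern, merely shifted.
same_pattern = all(abs(a - c) < 1e-9 for a, c in zip(pattern_a4, pattern_c5))

# On a linear axis the spacing between harmonics equals f0 and so depends on pitch.
linear_spacing_a4 = 2 * 440.0 - 440.0
linear_spacing_c5 = 2 * 523.25 - 523.25
```

Because log2(h * f0) = log2(f0) + log2(h), the offsets between harmonics depend only on the harmonic number h, which is what allows a single per-instrument pattern to be matched at any pitch by translation.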

