Music source separation conditioned on 3D point clouds

02/03/2021
by Francesc Lluís, et al.

Recently, significant progress has been made in audio source separation through the application of deep learning techniques. Current methods that combine audio and visual information use 2D representations such as images to guide the separation process. However, (re-)creating acoustically correct scenes for 3D virtual/augmented reality applications from recordings of real music ensembles requires detailed information about each sound source in the 3D environment. This demand, together with the proliferation of 3D visual acquisition systems such as LiDAR or RGB-depth cameras, stimulates the creation of models that can guide audio separation using 3D visual information. This paper proposes a multi-modal deep learning model that performs music source separation conditioned on 3D point clouds of music performance recordings. The model extracts visual features using 3D sparse convolutions, while audio features are extracted using dense convolutions. A fusion module combines the extracted features to perform the audio source separation. It is shown that the presented model can distinguish the musical instruments from a single 3D point cloud frame and perform source separation qualitatively similar to a reference case in which manually assigned instrument labels are provided.

research
10/27/2020

Remixing Music with Visual Conditioning

We propose a visually conditioned music remixing system by incorporating...
research
04/26/2021

Points2Sound: From mono to binaural audio using 3D point cloud scenes

Binaural sound that matches the visual counterpart is crucial to bring m...
research
06/14/2020

Solos: A Dataset for Audio-Visual Music Analysis

In this paper, we present a new dataset of music performance videos whic...
research
06/04/2019

Dilated Convolution with Dilated GRU for Music Source Separation

Stacked dilated convolutions used in Wavenet have been shown effective f...
research
01/15/2019

Spectrogram Feature Losses for Music Source Separation

In this paper we study deep learning-based music source separation, and ...
research
03/01/2019

A Unified Neural Architecture for Instrumental Audio Tasks

Within Music Information Retrieval (MIR), prominent tasks -- including p...
research
05/15/2021

Move2Hear: Active Audio-Visual Source Separation

We introduce the active audio-visual source separation problem, where an...