A Unified Model for Zero-shot Music Source Separation, Transcription and Synthesis

08/07/2021
by   Liwei Lin, et al.
0

We propose a unified model for three inter-related tasks: 1) to separate individual sound sources from a mixed music audio, 2) to transcribe each sound source to MIDI notes, and 3) to synthesize new pieces based on the timbre of separated sources. The model is inspired by the fact that when humans listen to music, our minds can not only separate the sounds of different instruments, but also at the same time perceive high-level representations such as score and timbre. To mirror such capability computationally, we designed a pitch-timbre disentanglement module based on a popular encoder-decoder neural architecture for source separation. The key inductive biases are vector-quantization for pitch representation and pitch-transformation invariant for timbre representation. In addition, we adopted a query-by-example method to achieve zero-shot learning, i.e., the model is capable of doing source separation, transcription, and synthesis for unseen instruments. The current design focuses on audio mixtures of two monophonic instruments. Experimental results show that our model outperforms existing multi-task baselines, and the transcribed score serves as a powerful auxiliary for separation tasks.

READ FULL TEXT
research
12/15/2021

Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data

Deep learning techniques for separating audio into different sound sourc...
research
04/08/2020

Conditioned Source Separation for Music Instrument Performances

Separating different music instruments playing the same piece is a chall...
research
05/30/2023

A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition

The ability to accurately recognize, localize and separate sound sources...
research
09/29/2020

Bespoke Neural Networks for Score-Informed Source Separation

In this paper, we introduce a simple method that can separate arbitrary ...
research
03/01/2019

A Unified Neural Architecture for Instrumental Audio Tasks

Within Music Information Retrieval (MIR), prominent tasks -- including p...
research
07/13/2021

Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music

Modern keyboards allow a musician to play multiple instruments at the sa...
research
07/28/2021

Neural Remixer: Learning to Remix Music with Interactive Control

The task of manipulating the level and/or effects of individual instrume...

Please sign up or login with your details

Forgot password? Click here to reset