Is Disentanglement enough? On Latent Representations for Controllable Music Generation

08/01/2021
by   Ashis Pati, et al.

Improving controllability, i.e., the ability to manipulate one or more attributes of the generated data, has become a topic of interest in the context of deep generative models of music. Recent attempts in this direction have relied on learning disentangled representations from data such that the underlying factors of variation are well separated. In this paper, we focus on the relationship between disentanglement and controllability by conducting a systematic study using different supervised disentanglement learning algorithms based on the Variational Auto-Encoder (VAE) architecture. Our experiments show that a high degree of disentanglement can be achieved by using different forms of supervision to train a strong discriminative encoder. However, in the absence of a strong generative decoder, disentanglement does not necessarily imply controllability. The structure of the latent space with respect to the VAE decoder plays an important role in boosting the ability of a generative model to manipulate different attributes. To this end, we also propose methods and metrics to help evaluate the quality of a latent space with respect to the afforded degree of controllability.
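The distinction between encoder-side disentanglement and decoder-side controllability can be illustrated with a toy sketch. This is not the paper's method or any of its metrics: the encoder, both decoders, and the Pearson-correlation scores below are hypothetical stand-ins chosen only to show how a latent dimension can track an attribute perfectly well on the encoding side while a weak decoder ignores it. For brevity, each "decoder" returns the attribute value of its output directly, rather than generating data that is then measured.

```python
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def encode(attribute, rng):
    # Hypothetical encoder: dimension 0 tracks the attribute (plus small noise),
    # dimension 1 is unrelated noise.
    return [attribute + rng.gauss(0, 0.05), rng.gauss(0, 1)]


def weak_decode(z):
    # Hypothetical weak decoder: the output attribute ignores z[0] entirely.
    return 0.5 + 0.1 * z[1]


def strong_decode(z):
    # Hypothetical strong decoder: the output attribute follows z[0].
    return z[0]


rng = random.Random(0)
attributes = [rng.random() for _ in range(500)]
codes = [encode(a, rng) for a in attributes]

# Encoder side: does latent dimension 0 follow the attribute of the input?
encoder_score = pearson(attributes, [z[0] for z in codes])

# Decoder side: sweep dimension 0 with dimension 1 held fixed, then check
# whether the attribute of the decoded output follows the sweep.
sweep = [i / 10 for i in range(11)]
weak_control = pearson(sweep, [weak_decode([v, 0.0]) for v in sweep])
strong_control = pearson(sweep, [strong_decode([v, 0.0]) for v in sweep])

print("encoder-side score:", round(encoder_score, 3))
print("decoder-side score (weak decoder):", round(weak_control, 3))
print("decoder-side score (strong decoder):", round(strong_control, 3))
```

In this toy setting the encoder-side score is the same for both decoders, so it cannot tell them apart. Only the traversal through the decoder reveals whether changing the latent code actually changes the attribute of the output.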


