A Contextual Latent Space Model: Subsequence Modulation in Melodic Sequence

11/23/2021
by Taketo Akama, et al.

Some generative models for sequences such as music and text allow users to edit only a subsequence, given the surrounding context sequences, which plays an important part in steering generation interactively. However, editing a subsequence mainly amounts to randomly resampling it from the space of possible generations. We propose a contextual latent space model (CLSM) so that users can explore subsequence generation with a sense of direction in the generation space, e.g., interpolation, as well as explore variations, that is, semantically similar possible subsequences. A context-informed prior and decoder constitute the generative model of CLSM, and a context position-informed encoder is the inference model. In experiments, we use a monophonic symbolic music dataset, demonstrating that our contextual latent space is smoother in interpolation than those of the baselines, and that the quality of generated samples is superior to that of baseline models. The generation examples are available online.
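
To make the idea of exploring "with a sense of direction" concrete, here is a minimal sketch of interpolating between two latent codes while the surrounding context stays fixed. It assumes latent codes are plain real-valued vectors and that interpolation is linear; the abstract specifies neither, and the function name `interpolate_latents` is hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def interpolate_latents(
    z_start: Sequence[float], z_end: Sequence[float], steps: int
) -> List[List[float]]:
    """Return `steps` latent vectors evenly spaced on the line from z_start to z_end.

    Assumption: linear interpolation in a real-valued latent space.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimension")

    path = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 (start) to 1.0 (end)
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Three points between two 2-dimensional latent codes.
path = interpolate_latents([0.0, 2.0], [1.0, 0.0], steps=3)
```

In a model of this kind, each vector in `path` would then be decoded together with the same fixed context, so that only the edited subsequence changes from step to step. The decoder is not sketched here because the abstract does not describe its interface.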


