Translating Visual Art into Music

The Synesthetic Variational Autoencoder (SynVAE) introduced in this research learns a consistent mapping between the visual and auditory sensory modalities in the absence of paired datasets. A quantitative evaluation on MNIST as well as the Behance Artistic Media dataset (BAM) shows that SynVAE retains sufficient information content during the translation while maintaining cross-modal latent space consistency. In a qualitative evaluation trial, human evaluators were furthermore able to match musical samples with the images from which they were generated with accuracies of up to 73%.
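The translation pipeline described above can be pictured as a visual encoder that maps an image to a Gaussian latent distribution, followed by a music decoder that turns a latent sample into a note sequence. The sketch below is only an illustration of that idea with toy dimensions and random stand-in weights; it is not the authors' architecture, and all names (`encode_image`, `decode_to_music`, the dimension constants) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions for illustration, not from the paper).
IMG_DIM, LATENT_DIM, SEQ_LEN, N_PITCHES = 64, 8, 16, 12

# Random matrices stand in for learned encoder/decoder parameters.
W_enc_mu = rng.normal(size=(LATENT_DIM, IMG_DIM)) * 0.1
W_enc_logvar = rng.normal(size=(LATENT_DIM, IMG_DIM)) * 0.1
W_dec = rng.normal(size=(SEQ_LEN * N_PITCHES, LATENT_DIM)) * 0.1

def encode_image(x):
    """Visual encoder: image vector -> Gaussian latent parameters."""
    return W_enc_mu @ x, W_enc_logvar @ x

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps (the standard VAE reparameterization)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode_to_music(z):
    """Music decoder: latent vector -> greedy pitch sequence."""
    logits = (W_dec @ z).reshape(SEQ_LEN, N_PITCHES)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)  # per-step softmax
    return probs.argmax(axis=1)

image = rng.normal(size=IMG_DIM)        # stand-in for an encoded artwork
mu, logvar = encode_image(image)
z = reparameterize(mu, logvar)
melody = decode_to_music(z)             # SEQ_LEN pitch indices
```

Because both modalities share the same latent variable `z`, nearby images map to nearby melodies, which is the latent-space consistency the abstract's evaluation measures; training such a model without paired data additionally requires an objective that scores the decoded music against the source image, which this sketch omits.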


