Video-to-Music Recommendation using Temporal Alignment of Segments

06/12/2023
by Laure Prétet, et al.

We study cross-modal recommendation of music tracks to be used as soundtracks for videos, a problem known as the music supervision task. We build on a self-supervised system that learns a content association between music and video. Beyond adequacy of content, adequacy of structure is crucial in music supervision to obtain relevant recommendations. We propose a novel structure-aware recommendation approach that significantly improves the system's performance. The core idea is to consider not only the full audio-video clips but also shorter segments for training and inference. We find that using semantic segments and ranking the tracks according to sequence alignment costs significantly improves the results. We also investigate the impact of different ranking metrics and segmentation methods.
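
To make the idea of ranking tracks by sequence alignment cost concrete, here is a minimal sketch. It is not the authors' implementation. It assumes each video and each music track has already been split into segments, and that each segment is represented by an embedding vector in a shared space. The cosine distance, the classic dynamic time warping (DTW) recursion, the length normalization, and the names `cosine_distance`, `dtw_cost` and `rank_tracks` are all illustrative assumptions; the abstract does not specify which alignment cost or ranking metric is used.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Cosine distance between two embeddings (assumed segment-level cost)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally uninformative
    return 1.0 - dot / (norm_a * norm_b)


def dtw_cost(video_segments: Sequence[Vector],
             music_segments: Sequence[Vector]) -> float:
    """Length-normalized DTW cost between two sequences of segment embeddings."""
    n, m = len(video_segments), len(music_segments)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one segment")
    # acc[i][j] is the cheapest cost of aligning the first i video segments
    # with the first j music segments.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = cosine_distance(video_segments[i - 1], music_segments[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    # Dividing by n + m is one common normalization, assumed here.
    return acc[n][m] / (n + m)


def rank_tracks(video_segments: Sequence[Vector],
                catalog: Dict[str, List[Vector]]) -> List[str]:
    """Return track names sorted from lowest to highest alignment cost."""
    return sorted(catalog, key=lambda name: dtw_cost(video_segments, catalog[name]))


# Toy example with hypothetical 2-D embeddings.
video = [[1.0, 0.0], [0.0, 1.0]]
catalog = {
    "reversed": [[0.0, 1.0], [1.0, 0.0]],
    "aligned": [[1.0, 0.0], [0.0, 1.0]],
}
ranking = rank_tracks(video, catalog)
```

In this toy example the track whose segments follow the same order as the video's segments gets the lower alignment cost and is ranked first. This illustrates why the order of segments matters in addition to their overall content.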
