Video Background Music Generation: Dataset, Method and Evaluation

11/21/2022
by Le Zhuo, et al.

Music is essential when editing videos, but selecting music manually is difficult and time-consuming. We therefore seek to automatically generate background music tracks from video input. This is a challenging task because learning the correspondence between video and music requires a large amount of paired data, and no such dataset exists. To close this gap, we introduce a dataset, a benchmark model, and an evaluation metric for video background music generation. We introduce SymMV, a video and symbolic music dataset with chord, rhythm, melody, and accompaniment annotations. To the best of our knowledge, it is the first video-music dataset with high-quality symbolic music and detailed annotations. We also propose a benchmark video background music generation framework named V-MusProd, which uses music priors of chords, melody, and accompaniment along with video-music relations of semantic, color, and motion features. To address the lack of objective metrics for video-music correspondence, we propose a retrieval-based metric, VMCP, built upon a powerful video-music representation learning model. Experiments show that, with our dataset, V-MusProd outperforms the state-of-the-art method in both music quality and correspondence with videos. We believe our dataset, benchmark model, and evaluation metric will boost the development of video background music generation.
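The abstract does not define how VMCP is computed, so the following is only a generic sketch of what a retrieval-style correspondence score can look like, not the authors' metric. It assumes that some pretrained model has already mapped each video and each music clip to an embedding vector, and that the video at index i is paired with the music at index i. The function names `cosine` and `retrieval_precision_at_k` are hypothetical and chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieval_precision_at_k(video_embs, music_embs, k):
    """Fraction of videos whose paired music (same index) ranks in the top k.

    Assumption: embeddings come from some video-music representation model;
    this is a generic retrieval score, not the paper's VMCP definition.
    """
    hits = 0
    for i, video in enumerate(video_embs):
        sims = [cosine(video, music) for music in music_embs]
        # Rank all music clips by similarity to this video, best first.
        ranked = sorted(range(len(sims)), key=lambda j: sims[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(video_embs)


# Toy embeddings: the first two pairs match well, the third pair is opposed.
videos = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
musics = [[0.9, 0.1], [0.1, 0.9], [-1.0, -1.0]]
score_at_1 = retrieval_precision_at_k(videos, musics, k=1)
score_at_3 = retrieval_precision_at_k(videos, musics, k=3)
```

In this toy setup a higher score means that generated or paired music is more often retrieved for its own video, which is the general intuition behind retrieval-based correspondence metrics.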
