Debiased Cross-modal Matching for Content-based Micro-video Background Music Recommendation

by Jing Yi, et al.

Micro-video background music recommendation is a complicated task in which the degree of matching between videos and uploader-selected background music is a major issue. However, the selection of user-generated content (UGC) is biased by each uploader's limited knowledge and historical music preferences. In this paper, we propose a Debiased Cross-Modal (DebCM) matching model to alleviate the influence of such selection bias. Specifically, we design a teacher-student network that utilizes the matching of segments of music videos, which are professionally generated content (PGC) produced with specialized music-matching techniques, to alleviate the bias caused by users' insufficient knowledge. The PGC data is captured by a teacher network, which guides the student network's matching of uploader-selected UGC data through KL-based knowledge transfer. In addition, uploaders' personal preferences for music genres are identified as confounders that spuriously correlate music embeddings with background music selections, causing the learned recommender system to over-recommend music from the majority groups. To resolve such confounders in the UGC data of the student network, backdoor adjustment is utilized to deconfound the spurious correlation between music embeddings and prediction scores. We further use a Monte Carlo (MC) estimator with batch-level averaging as an approximation, which avoids integrating over the entire confounder space required by the adjustment. Extensive experiments on the TT-150k-genre dataset demonstrate the effectiveness of the proposed method against the selection bias. The code is publicly available on: <>.
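
The two mechanisms named in the abstract, KL-based knowledge transfer and a batch-averaged Monte Carlo approximation of backdoor adjustment, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the temperature parameter, the dot-product scorer, and the additive fusion of music and confounder embeddings are all assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def kl_transfer_loss(teacher_scores, student_scores, temperature=1.0):
    """KL(teacher || student) over each video's candidate-music distribution,
    averaged over the batch. Both inputs have shape (batch, n_candidates)."""
    p = softmax(teacher_scores / temperature)
    q = softmax(student_scores / temperature)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

def mc_backdoor_scores(video_emb, music_emb, confounder_batch):
    """Monte Carlo estimate of E_c[f(v, m, c)], using the confounder
    embeddings found in the current batch as samples of c.

    Assumed (hypothetical) scorer: f(v, m, c) = v . (m + c).
    Shapes: video_emb (n_v, d), music_emb (n_m, d), confounder_batch (n_c, d).
    """
    per_sample = np.stack(
        [video_emb @ (music_emb + c_k).T for c_k in confounder_batch]
    )
    # Average the scores over the sampled confounders.
    return per_sample.mean(axis=0)

rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 5))
student = rng.normal(size=(4, 5))
loss = kl_transfer_loss(teacher, student)

v = rng.normal(size=(4, 8))
m = rng.normal(size=(5, 8))
c = rng.normal(size=(6, 8))
scores = mc_backdoor_scores(v, m, c)  # shape (4, 5)
```

Because the assumed scorer is linear in the confounder, averaging the scores over the batch gives the same result as scoring once with the batch-mean confounder, which is one way to read "batch-level average". A nonlinear scorer would not collapse this way, and the batch mean would then be only an approximation.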


Cross-modal Variational Auto-encoder for Content-based Micro-video Background Music Recommendation

In this paper, we propose a cross-modal variational auto-encoder (CMVAE)...

VMCML: Video and Music Matching via Cross-Modality Lifting

We propose a content-based system for matching video and background musi...

Unified Pretraining Target Based Video-music Retrieval With Music Rhythm And Video Optical Flow Information

Background music (BGM) can enhance the video's emotion. However, selecti...

Video-Music Retrieval: A Dual-Path Cross-Modal Network

We propose a method to recommend background music for videos. Current wo...

InverseMV: Composing Piano Scores with a Convolutional Video-Music Transformer

Many social media users prefer consuming content in the form of videos r...

DVR: Micro-Video Recommendation Optimizing Watch-Time-Gain under Duration Bias

Recommender systems are prone to be misled by biases in the data. Models...

Deep Deconfounded Content-based Tag Recommendation for UGC with Causal Intervention

Traditional content-based tag recommender systems directly learn the ass...
