Multi-Modal Self-Supervised Learning for Recommendation

02/21/2023
by Wei Wei, et al.

The emergence of multi-modal sharing platforms (e.g., TikTok, YouTube) is enabling personalized recommender systems to incorporate various modalities (e.g., visual, textual, and acoustic) into latent user representations. While existing multi-modal recommendation methods exploit multimedia content features to enhance item embeddings, their representation capability is limited by heavy reliance on labels and weak robustness to sparse user behavior data. Inspired by recent progress in self-supervised learning for alleviating label scarcity, we explore deriving self-supervision signals for effectively learning modality-aware user preferences and cross-modal dependencies. To this end, we propose a new Multi-Modal Self-Supervised Learning (MMSSL) method that tackles two key challenges. First, to characterize the inter-dependency between the user-item collaborative view and the item multi-modal semantic view, we design a modality-aware interactive structure learning paradigm that uses adversarial perturbations for data augmentation. Second, to capture the ways in which users' modality-aware interaction patterns interweave with each other, we introduce a cross-modal contrastive learning approach that jointly preserves inter-modal semantic commonality and user preference diversity. Experiments on real-world datasets verify the superiority of our method over various state-of-the-art baselines for multimedia recommendation. The implementation is released at: https://github.com/HKUDS/MMSSL.
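To make the cross-modal contrastive idea concrete, here is a minimal, hypothetical sketch of an InfoNCE-style objective that aligns a user's embeddings across two modalities while contrasting against other users. This is not MMSSL's exact formulation (the paper's loss, negatives, and temperature may differ); the function name, matrix shapes, and `tau` value are illustrative assumptions.

```python
import numpy as np

def cross_modal_infonce(view_a: np.ndarray, view_b: np.ndarray, tau: float = 0.2) -> float:
    """InfoNCE loss: row i of view_a and row i of view_b form a positive pair
    (the same user seen through two modalities); all other rows act as negatives.
    Both inputs have shape (n_users, dim)."""
    # L2-normalize so dot products become cosine similarities
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / tau  # (n_users, n_users) similarity matrix
    # Row-wise log-softmax; the diagonal entries are the positive pairs
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Toy check: two identical modality views should yield a lower loss
# than two unrelated random views
rng = np.random.default_rng(0)
visual = rng.normal(size=(8, 16))
aligned_loss = cross_modal_infonce(visual, visual)
random_loss = cross_modal_infonce(visual, rng.normal(size=(8, 16)))
```

Minimizing such a loss pulls each user's modality-specific representations together (preserving inter-modal commonality) while pushing different users apart (preserving preference diversity), which is the intuition the abstract describes.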


Related research

- 07/13/2022, "Bootstrap Latent Representations for Multi-modal Recommendation": This paper studies the multi-modal recommendation problem, where the ite...
- 08/14/2023, "MM-GEF: Multi-modal representation meet collaborative filtering": In modern e-commerce, item content features in various modalities offer ...
- 08/24/2023, "Preserving Modality Structure Improves Multi-Modal Learning": Self-supervised learning on large-scale multi-modal datasets allows lear...
- 07/06/2023, "Cross-Modal Content Inference and Feature Enrichment for Cold-Start Recommendation": Multimedia recommendation aims to fuse the multi-modal information of it...
- 05/22/2023, "Denoised Self-Augmented Learning for Social Recommendation": Social recommendation is gaining increasing attention in various online ...
- 09/19/2018, "Adversarial Training Towards Robust Multimedia Recommender System": With the prevalence of multimedia content on the Web, developing recomme...
- 11/13/2017, "A Supervised Learning Concept for Reducing User Interaction in Passenger Cars": In this article an automation system for human-machine-interfaces (HMI) ...
