A Tale of Two Graphs: Freezing and Denoising Graph Structures for Multimodal Recommendation

11/13/2022
by Xin Zhou, et al.

Multimodal recommender systems that use multimodal features (e.g., images and textual descriptions) typically achieve better recommendation accuracy than general recommendation models based solely on user-item interactions. Generally, prior work fuses multimodal features into item ID embeddings to enrich item representations, and thus fails to capture the latent semantic item-item structures. In this context, LATTICE [1] proposes to learn the latent structure between items explicitly and achieves state-of-the-art performance for multimodal recommendation. However, we argue that the latent graph structure learning of LATTICE is both inefficient and unnecessary. Experimentally, we demonstrate that freezing its item-item structure before training can also achieve competitive performance. Based on this finding, we propose a simple yet effective model, dubbed FREEDOM, that FREEzes the item-item graph and DenOises the user-item interaction graph simultaneously for Multimodal recommendation. To denoise the user-item interaction graph, we devise a degree-sensitive edge pruning method, which rejects possibly noisy edges with a high probability when sampling the graph. We evaluate the proposed model on three real-world datasets and show that FREEDOM can significantly outperform the strongest baselines. Compared with LATTICE, FREEDOM achieves an average improvement of 19.07% in recommendation accuracy while reducing its memory cost up to 6× on large graphs. The source code is available at: https://github.com/enoche/FREEDOM.
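
To make the idea of degree-sensitive edge pruning more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the sampling rule, so the weighting used here, 1 / sqrt(deg(user) * deg(item)), is an assumption borrowed from the symmetric degree normalization common in graph convolution. The function name `degree_sensitive_prune` and the `keep_ratio` parameter are hypothetical. Edges are assumed to be unique (user, item) pairs.

```python
import math
import random
from collections import Counter


def degree_sensitive_prune(edges, keep_ratio, seed=0):
    """Sample a subset of user-item edges, favoring edges between low-degree nodes.

    edges:      list of unique (user, item) pairs
    keep_ratio: fraction of edges to keep, in (0, 1]
    """
    user_deg = Counter(u for u, _ in edges)
    item_deg = Counter(i for _, i in edges)

    # Assumed weighting: edges touching high-degree nodes get a small weight,
    # so they are more likely to be rejected during sampling.
    weights = [1.0 / math.sqrt(user_deg[u] * item_deg[i]) for u, i in edges]

    n_keep = max(1, int(len(edges) * keep_ratio))
    rng = random.Random(seed)

    # Weighted sampling without replacement (Efraimidis-Spirakis):
    # draw key = U ** (1 / w) per edge and keep the n_keep largest keys.
    keys = [rng.random() ** (1.0 / w) for w in weights]
    order = sorted(range(len(edges)), key=lambda idx: keys[idx], reverse=True)
    kept = sorted(order[:n_keep])  # restore original edge order
    return [edges[idx] for idx in kept]


if __name__ == "__main__":
    # Toy graph: user "u0" and item "i0" are hubs; ("u9", "i9") is an isolated edge.
    toy_edges = (
        [("u0", f"i{k}") for k in range(8)]
        + [(f"u{k}", "i0") for k in range(1, 8)]
        + [("u9", "i9")]
    )
    print(degree_sensitive_prune(toy_edges, keep_ratio=0.5, seed=42))
```

In a training loop, one would presumably resample the pruned graph periodically (for example, once per epoch) with a fresh seed, so that no edge is dropped permanently; that schedule is also an assumption rather than something the abstract specifies.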
