Adaptive Contrastive Learning on Multimodal Transformer for Review Helpfulness Predictions

11/07/2022
by Thong Nguyen, et al.

Modern Review Helpfulness Prediction systems depend on multiple modalities, typically text and images. Unfortunately, contemporary approaches pay scant attention to refining representations of cross-modal relations and tend to suffer from inferior optimization, which can harm the model's predictions in numerous cases. To overcome these issues, we propose Multimodal Contrastive Learning for the Multimodal Review Helpfulness Prediction (MRHP) problem, concentrating on mutual information between input modalities to explicitly elaborate cross-modal relations. In addition, we introduce an Adaptive Weighting scheme for our contrastive learning approach to increase flexibility in optimization. Lastly, we propose a Multimodal Interaction module to address the unaligned nature of multimodal data, thereby helping the model produce more reasonable multimodal representations. Experimental results show that our method outperforms prior baselines and achieves state-of-the-art results on two publicly available benchmark datasets for the MRHP problem.


