CLMLF: A Contrastive Learning and Multi-Layer Fusion Method for Multimodal Sentiment Detection

04/12/2022
by   Zhen Li, et al.

Compared with unimodal data, multimodal data provides more features for a model to draw on when analyzing sentiment. Previous work rarely considers token-level feature fusion, and few studies explore learning the sentiment-related features that are common across modalities to help the model fuse multimodal features. In this paper, we propose a Contrastive Learning and Multi-Layer Fusion (CLMLF) method for multimodal sentiment detection. Specifically, we first encode the text and the image to obtain hidden representations, and then use a multi-layer fusion module to align and fuse the token-level features of text and image. In addition to the sentiment analysis task, we design two contrastive learning tasks, label-based contrastive learning and data-based contrastive learning, which help the model learn common sentiment-related features in multimodal data. Extensive experiments on three publicly available multimodal datasets demonstrate the effectiveness of our approach for multimodal sentiment detection compared with existing methods. The code is available at https://github.com/Link-Li/CLMLF
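To make the idea of label-based contrastive learning concrete, the sketch below shows one common way such an objective can be written: a supervised contrastive loss that pulls together fused features sharing a sentiment label and pushes apart those with different labels. The abstract does not state the exact loss used in CLMLF, so the function name `label_contrastive_loss`, the cosine similarity, and the `temperature` value are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def label_contrastive_loss(features, labels, temperature=0.1):
    """Hypothetical supervised contrastive loss over a batch.

    features: list of fused text-image vectors (assumed non-zero)
    labels:   list of sentiment labels, one per vector
    Samples with the same label are treated as positives for each other.
    Anchors without any positive in the batch are skipped.
    """
    n = len(features)
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Scaled similarities between the anchor and every other sample.
        logits = {
            j: cosine(features[i], features[j]) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        # Average negative log-probability assigned to the positives.
        anchor_loss = -sum(logits[j] - log_denom for j in positives) / len(positives)
        losses.append(anchor_loss)
    return sum(losses) / len(losses) if losses else 0.0
```

Under these assumptions, the loss is small when samples with the same label already lie close together in feature space and large when they do not, which is the behavior a sentiment-aware auxiliary task would want to encourage during training.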


