MMGA: Multimodal Learning with Graph Alignment

10/18/2022
by Xuan Yang, et al.

Multimodal pre-training breaks down modality barriers and allows individual modalities to augment one another with information, leading to significant advances in representation learning. However, the graph modality, a very general and important form of data, cannot easily interact with other modalities because of its non-regular nature. In this paper, we propose MMGA (Multimodal learning with Graph Alignment), a novel multimodal pre-training framework that incorporates information from the graph (social network), image, and text modalities on social media to enhance user representation learning. In MMGA, a multi-step graph alignment mechanism adds self-supervision from the graph modality to optimize the image and text encoders, while using information from the image and text modalities to guide the learning of the graph encoder. We conduct experiments on a dataset crawled from Instagram. The experimental results show that MMGA works well on the dataset and improves performance on the fans prediction task. To facilitate future research, we release our dataset, the first social media multimodal dataset with a graph modality, which contains 60,000 users labeled with specific topics based on 2 million posts.
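
The abstract does not spell out the alignment objective, so the following is only a minimal sketch of the general idea of aligning graph embeddings with image or text embeddings for the same users. It assumes a symmetric contrastive (InfoNCE-style) loss, which is a common choice for cross-modal alignment and not necessarily what MMGA uses. The function names, the `temperature` value, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_matrix(rows_a, rows_b):
    """Pairwise cosine similarity between two lists of vectors."""
    a = [l2_normalize(v) for v in rows_a]
    b = [l2_normalize(v) for v in rows_b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def alignment_loss(graph_emb, content_emb, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumed objective, not the paper's).

    graph_emb[i] and content_emb[i] are embeddings of the same user from the
    graph encoder and from an image or text encoder, respectively.
    """
    sim = cosine_matrix(graph_emb, content_emb)
    logits = [[s / temperature for s in row] for row in sim]
    logits_t = [list(col) for col in zip(*logits)]  # content-to-graph direction
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


# Toy example: three users with 2-dimensional embeddings (made-up values).
graph = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
content = [[0.9, 0.1], [0.1, 0.9], [-1.0, 0.1]]
shuffled = [content[1], content[2], content[0]]

print("aligned loss:   ", alignment_loss(graph, content))
print("misaligned loss:", alignment_loss(graph, shuffled))
```

In this sketch, the loss is lower when each user's graph embedding is closest to that same user's content embedding, so minimizing it would pull the two encoders toward a shared space. Gradients could flow to both sides, which is one way the graph modality could supervise the image and text encoders and vice versa.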

research
07/01/2023

Image Matters: A New Dataset and Empirical Study for Multimodal Hyperbole Detection

Hyperbole, or exaggeration, is a common linguistic phenomenon. The detec...
research
04/14/2020

Analysis of Social Media Data using Multimodal Deep Learning for Disaster Response

Multimedia content in social media platforms provides significant inform...
research
02/11/2022

On the Complementarity of Images and Text for the Expression of Emotions in Social Media

Authors of posts in social media communicate their emotions and what cau...
research
10/05/2022

Vision+X: A Survey on Multimodal Learning in the Light of Data

We are perceiving and communicating with the world in a multisensory man...
research
11/26/2021

Geometric Multimodal Deep Learning with Multi-Scaled Graph Wavelet Convolutional Network

Multimodal data provide complementary information of a natural phenomeno...
research
09/14/2022

Graph Perceiver IO: A General Architecture for Graph Structured Data

Multimodal machine learning has been widely studied for the development ...
research
04/19/2019

Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts

Computing author intent from multimodal data like Instagram posts requir...
