Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings

11/03/2020
by Yue Wang, et al.

Social media produces large amounts of content every day. To help users quickly capture what they need, keyphrase prediction is receiving growing attention. Nevertheless, most prior efforts focus on text modeling, largely ignoring the rich features embedded in the matching images. In this work, we explore the joint effects of texts and images in predicting the keyphrases for a multimedia post. To better align social-media-style texts and images, we propose: (1) a novel Multi-Modality Multi-Head Attention (M3H-Att) to capture the intricate cross-media interactions; (2) image wordings, in the form of optical characters and image attributes, to bridge the two modalities. Moreover, we design a unified framework to leverage the outputs of keyphrase classification and generation and couple their advantages. Extensive experiments on a large-scale dataset newly collected from Twitter show that our model significantly outperforms the previous state of the art based on traditional attention networks. Further analyses show that our multi-head attention is able to attend to information from various aspects and boost classification or generation in diverse scenarios.
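As a rough illustration of the general idea behind multi-head attention across modalities, the sketch below lets text token vectors act as queries over image region vectors, with each head attending in its own projected subspace. It is a generic scaled dot-product cross-attention in pure Python, not the paper's M3H-Att: the function names, the random projection matrices (standing in for learned parameters), and the toy feature sizes are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Random matrix; a stand-in for learned projection weights (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def cross_modal_attention(text, image, num_heads, rng):
    """Text tokens (queries) attend over image regions (keys/values).

    Hypothetical helper: returns the concatenated head outputs and the
    per-head attention weights. The usual output projection is omitted.
    """
    d_model = len(text[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    head_outputs, head_weights = [], []
    for _ in range(num_heads):
        # Each head gets its own query/key/value projections.
        w_q = rand_matrix(d_model, d_head, rng)
        w_k = rand_matrix(d_model, d_head, rng)
        w_v = rand_matrix(d_model, d_head, rng)
        q = matmul(text, w_q)    # (n_text, d_head)
        k = matmul(image, w_k)   # (n_image, d_head)
        v = matmul(image, w_v)   # (n_image, d_head)
        scores = matmul(q, list(zip(*k)))  # q @ k^T, shape (n_text, n_image)
        scale = math.sqrt(d_head)
        weights = [softmax([s / scale for s in row]) for row in scores]
        head_weights.append(weights)
        head_outputs.append(matmul(weights, v))  # (n_text, d_head)
    # Concatenate the heads along the feature dimension for each text token.
    fused = [sum((head[i] for head in head_outputs), []) for i in range(len(text))]
    return fused, head_weights


rng = random.Random(0)
text = rand_matrix(4, 8, rng)    # 4 text tokens, 8-dim features (toy sizes)
image = rand_matrix(3, 8, rng)   # 3 image regions, 8-dim features (toy sizes)
fused, weights = cross_modal_attention(text, image, num_heads=2, rng=rng)
print(len(fused), len(fused[0]))
```

Each row of `weights[h]` is a distribution over image regions for one text token under head `h`, which is the sense in which different heads can attend to different aspects of the image.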


research
02/26/2023

Understanding Social Media Cross-Modality Discourse in Linguistic Space

The multimedia communications with texts and images are popular on socia...
research
08/07/2017

Multimodal Classification for Analysing Social Media

Classification of social media data is an important approach in understa...
research
06/12/2023

UniPoll: A Unified Social Media Poll Generation Framework via Multi-Objective Optimization

Social media platforms are essential outlets for expressing opinions, pr...
research
09/01/2021

Point-of-Interest Type Prediction using Text and Images

Point-of-interest (POI) type prediction is the task of inferring the typ...
research
01/30/2018

The New Modality: Emoji Challenges in Prediction, Anticipation, and Retrieval

Over the past decade, emoji have emerged as a new and widespread form of...
research
10/15/2018

Super Characters: A Conversion from Sentiment Classification to Image Classification

We propose a method named Super Characters for sentiment classification....
research
01/14/2021

Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media

Sarcasm is a linguistic expression often used to communicate the opposit...
