
The New Modality: Emoji Challenges in Prediction, Anticipation, and Retrieval

by Spencer Cappallo, et al.

Over the past decade, emoji have emerged as a new and widespread form of digital communication, spanning diverse social networks and spoken languages. We propose to treat these ideograms as a new modality in their own right, distinct in their semantic structure from both the text in which they are often embedded and the images they resemble. As a new modality, emoji present rich novel possibilities for representation and interaction. In this paper, we explore the challenges that arise naturally from considering the emoji modality through the lens of multimedia research, specifically the ways in which emoji can be related to other common modalities such as text and images. To do so, we first present a large-scale dataset of real-world emoji usage collected from Twitter. This dataset contains examples of both text-emoji and image-emoji relationships. We present baseline results on the challenge of predicting emoji from both text and images, using state-of-the-art neural networks. Further, we offer a first consideration of the problem of how to account for new, unseen emoji, a relevant issue as the emoji vocabulary continues to expand on a yearly basis. Finally, we present results for multimedia retrieval using emoji as queries.


