Look, Read and Feel: Benchmarking Ads Understanding with Multimodal Multitask Learning

12/21/2019
by   Huaizheng Zhang, et al.
0

Given the massive market of advertising and the sharply increasing online multimedia content (such as videos), it is now fashionable to promote advertisements (ads) together with the multimedia content. It is exhausted to find relevant ads to match the provided content manually, and hence, some automatic advertising techniques are developed. Since ads are usually hard to understand only according to its visual appearance due to the contained visual metaphor, some other modalities, such as the contained texts, should be exploited for understanding. To further improve user experience, it is necessary to understand both the topic and sentiment of the ads. This motivates us to develop a novel deep multimodal multitask framework to integrate multiple modalities to achieve effective topic and sentiment prediction simultaneously for ads understanding. In particular, our model first extracts multimodal information from ads and learn high-level and comparable representations. The visual metaphor of the ad is decoded in an unsupervised manner. The obtained representations are then fed into the proposed hierarchical multimodal attention modules to learn task-specific representations for final prediction. A multitask loss function is also designed to train both the topic and sentiment prediction models jointly in an end-to-end manner. We conduct extensive experiments on the latest and large advertisement dataset and achieve state-of-the-art performance for both prediction tasks. The obtained results could be utilized as a benchmark for ads understanding.

READ FULL TEXT
research
07/11/2018

Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis

Multimodal machine learning is a core research area spanning the languag...
research
02/03/2018

Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning

With the increasing popularity of video sharing websites such as YouTube...
research
12/04/2021

Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

Multimodal fusion and multitask learning are two vital topics in machine...
research
12/15/2020

A Deep Multi-Level Attentive network for Multimodal Sentiment Analysis

Multimodal sentiment analysis has attracted increasing attention with br...
research
12/06/2020

Pedestrian Behavior Prediction via Multitask Learning and Categorical Interaction Modeling

Pedestrian behavior prediction is one of the major challenges for intell...
research
09/01/2023

Long-Term Memorability On Advertisements

Marketers spend billions of dollars on advertisements but to what end? A...
research
08/14/2020

MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities

In this paper, we introduce the MLM (Multiple Languages and Modalities) ...

Please sign up or login with your details

Forgot password? Click here to reset