Equal But Not The Same: Understanding the Implicit Relationship Between Persuasive Images and Text

07/21/2018
by Mingda Zhang, et al.

Images and text in advertisements interact in complex, non-literal ways. The two channels are usually complementary, with each telling a different part of the story. Current approaches, such as image captioning methods, examine only literal, redundant relationships, in which image and text show exactly the same content. To understand more complex relationships, we first collect, with the help of workers recruited on Amazon Mechanical Turk, a dataset of advertisement interpretations that record whether the image and slogan in the same visual advertisement form a parallel relationship (conveying the same message without literally saying the same thing) or a non-parallel one. We develop a variety of features that capture the creativity of images and the specificity or ambiguity of text, as well as methods that analyze the semantics within and across channels. We show that our method outperforms standard image-text alignment approaches at predicting the parallel/non-parallel relationship between image and text.
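
For readers unfamiliar with the "standard image-text alignment" baseline mentioned above, the following is a minimal sketch of one common form of it: score an image embedding against a text embedding with cosine similarity and threshold the score. It is not the authors' method. The function names, the embedding vectors, and the 0.5 threshold are all hypothetical, and the embeddings are assumed to come from some pretrained encoders that are not shown here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as unaligned.
        return 0.0
    return dot / (norm_u * norm_v)


def predict_parallel(image_vec, text_vec, threshold=0.5):
    """Hypothetical baseline: call a pair 'parallel' when the embeddings align.

    image_vec and text_vec are assumed to live in a shared embedding space;
    the threshold is an illustrative value, not one reported in the paper.
    """
    return cosine_similarity(image_vec, text_vec) >= threshold


if __name__ == "__main__":
    # Toy vectors standing in for real image and slogan embeddings.
    print(predict_parallel([1.0, 0.0], [1.0, 0.0]))
    print(predict_parallel([1.0, 0.0], [0.0, 1.0]))
```

A baseline of this kind only measures how literally the two channels overlap, which is the limitation the abstract points to: an image and slogan can convey the same message while sharing little surface content.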

