KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models

05/28/2023
by   Zhiwei Jia, et al.
0

Image ad understanding is a crucial task with wide real-world applications. Although highly challenging with the involvement of diverse atypical scenes, real-world entities, and reasoning over scene-texts, how to interpret image ads is relatively under-explored, especially in the era of foundational vision-language models (VLMs) featuring impressive generalizability and adaptability. In this paper, we perform the first empirical study of image ad understanding through the lens of pre-trained VLMs. We benchmark and reveal practical challenges in adapting these VLMs to image ad understanding. We propose a simple feature adaptation strategy to effectively fuse multimodal information for image ads and further empower it with knowledge of real-world entities. We hope our study draws more attention to image ad understanding which is broadly relevant to the advertising industry.

READ FULL TEXT

page 1

page 2

page 5

page 7

page 13

page 14

research
09/19/2022

How to Adapt Pre-trained Vision-and-Language Models to a Text-only Input?

Current language models have been criticised for learning language from ...
research
08/21/2019

A Contextual Bandit Algorithm for Ad Creative under Ad Fatigue

Selecting ad creative is one of the most important task for DSPs (Demand...
research
12/02/2021

DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding

Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained ...
research
12/06/2022

Counterfactual reasoning: Do language models need world knowledge for causal understanding?

Current pre-trained language models have enabled remarkable improvements...
research
05/26/2023

Counterfactual reasoning: Testing language models' understanding of hypothetical scenarios

Current pre-trained language models have enabled remarkable improvements...
research
08/17/2023

Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes

3D scene understanding has gained significant attention due to its wide ...
research
11/17/2017

ADVISE: Symbolism and External Knowledge for Decoding Advertisements

In order to convey the most content in their limited space, advertisemen...

Please sign up or login with your details

Forgot password? Click here to reset