Task Bias in Vision-Language Models

12/08/2022
by   Sachit Menon, et al.

Incidental supervision from language has become a popular approach for learning generic visual representations that can be prompted to perform many recognition tasks in computer vision. We conduct an in-depth exploration of the CLIP model and show that its visual representation is often strongly biased towards solving some tasks more than others. Moreover, which task the representation will be biased towards is unpredictable, with little consistency across images. To resolve this task bias, we show how to learn a visual prompt that guides the representation towards features relevant to the task of interest. Our results show that these visual prompts can be independent of the input image and still effectively provide a conditioning mechanism to steer visual representations towards the desired task.
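The idea of an input-independent visual prompt can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a small frozen tanh encoder in place of CLIP's image encoder, random vectors in place of text embeddings, dot-product similarity, and an additive prompt vector shared by all inputs. Every name here (`encode`, `task_probs`, `loss_and_grad`, `learn_prompt`) is hypothetical. Only the prompt is updated; the encoder and label embeddings stay fixed.

```python
import math
import random

def encode(W, x):
    """Frozen toy encoder h = tanh(W x); an assumed stand-in for a pretrained image encoder."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in W]

def task_probs(W, T, x, prompt):
    """Softmax over dot products between the prompted embedding and each label embedding."""
    h = encode(W, [xi + pi for xi, pi in zip(x, prompt)])
    logits = [sum(t * v for t, v in zip(row, h)) for row in T]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return h, [e / z for e in exps]

def loss_and_grad(W, T, data, prompt):
    """Mean cross-entropy on the task labels and its gradient w.r.t. the prompt only."""
    grad = [0.0] * len(prompt)
    loss = 0.0
    for x, y in data:
        h, p = task_probs(W, T, x, prompt)
        loss -= math.log(p[y])
        # Gradient w.r.t. logits: p - onehot(y)
        dlogits = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
        # Back through the label embeddings: T^T dlogits
        dh = [sum(T[k][j] * dlogits[k] for k in range(len(T))) for j in range(len(h))]
        # Back through tanh
        dz = [g * (1.0 - hj * hj) for g, hj in zip(dh, h)]
        # Back through W to the input, which is where the prompt is added
        for i in range(len(prompt)):
            grad[i] += sum(W[j][i] * dz[j] for j in range(len(W)))
    n = len(data)
    return loss / n, [g / n for g in grad]

def learn_prompt(W, T, data, steps=200, lr=0.5):
    """Gradient descent on the prompt; a step is kept only if it lowers the loss."""
    prompt = [0.0] * len(data[0][0])
    loss, grad = loss_and_grad(W, T, data, prompt)
    for _ in range(steps):
        candidate = [p - lr * g for p, g in zip(prompt, grad)]
        new_loss, new_grad = loss_and_grad(W, T, data, candidate)
        if new_loss < loss:
            prompt, loss, grad = candidate, new_loss, new_grad
        else:
            lr *= 0.5  # step too large: shrink and retry
    return prompt, loss

# Synthetic data: the "task" is to name the largest of the first three input coordinates.
rng = random.Random(0)
W = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]  # frozen encoder weights
T = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(3)]  # frozen label embeddings
data = []
for _ in range(40):
    x = [rng.uniform(-1, 1) for _ in range(4)]
    data.append((x, max(range(3), key=lambda k: x[k])))

initial_loss, _ = loss_and_grad(W, T, data, [0.0] * 4)
prompt, final_loss = learn_prompt(W, T, data)
print(f"loss without prompt: {initial_loss:.3f}, with learned prompt: {final_loss:.3f}")
```

The same prompt vector is added to every input, which mirrors the abstract's claim that the prompt can be independent of the image. How much such a prompt helps in this toy setting says nothing about the results reported for CLIP.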

research
08/04/2020

Learning Visual Representations with Caption Annotations

Pretraining general-purpose visual features has become a crucial part of...
research
01/26/2022

Evaluating language-biased image classification based on semantic representations

Humans show language-biased image recognition for a word-embedded image,...
research
03/28/2018

Who Let The Dogs Out? Modeling Dog Behavior From Visual Data

We introduce the task of directly modeling a visually intelligent agent....
research
08/18/2021

Show or Tell? Visual and Verbal Representations Bias Position Recall

When we view visualizations, we not only have a visual representation of...
research
06/22/2014

Factors of Transferability for a Generic ConvNet Representation

Evidence is mounting that Convolutional Networks (ConvNets) are the most...
research
09/06/2021

Visual Recognition with Deep Learning from Biased Image Datasets

In practice, and more especially when training deep neural networks, vis...
research
05/06/2022

Prompt Distribution Learning

We present prompt distribution learning for effectively adapting a pre-t...
