Separating Self-Expression and Visual Content in Hashtag Supervision

11/27/2017
by   Andreas Veit, et al.
0

The variety, abundance, and structured nature of hashtags make them an interesting data source for training vision models. For instance, hashtags have the potential to significantly reduce the problem of manual supervision and annotation when learning vision models for a large number of concepts. However, a key challenge when learning from hashtags is that they are inherently subjective because they are provided by users as a form of self-expression. As a consequence, hashtags may have synonyms (different hashtags referring to the same visual content) and may be ambiguous (the same hashtag referring to different visual content). These challenges limit the effectiveness of approaches that simply treat hashtags as image-label pairs. This paper presents an approach that extends upon modeling simple image-label pairs by modeling the joint distribution of images, hashtags, and users. We demonstrate the efficacy of such approaches in image tagging and retrieval experiments, and show how the joint model can be used to perform user-conditional retrieval and tagging.

READ FULL TEXT

page 1

page 5

page 6

page 7

page 10

page 11

research
09/02/2019

VISIR: Visual and Semantic Image Label Refinement

The social media explosion has populated the Internet with a wealth of i...
research
12/10/2020

Tensor Composition Net for Visual Relationship Prediction

We present a novel Tensor Composition Network (TCN) to predict visual re...
research
06/09/2023

Read, look and detect: Bounding box annotation from image-caption pairs

Various methods have been proposed to detect objects while reducing the ...
research
11/21/2018

Facilitating the Manual Annotation of Sounds When Using Large Taxonomies

Properly annotated multimedia content is crucial for supporting advances...
research
11/21/2016

Sampled Image Tagging and Retrieval Methods on User Generated Content

Traditional image tagging and retrieval algorithms have limited value as...
research
03/17/2018

Learning Unsupervised Visual Grounding Through Semantic Self-Supervision

Localizing natural language phrases in images is a challenging problem t...

Please sign up or login with your details

Forgot password? Click here to reset