Learning Visual N-Grams from Web Data

12/29/2016
by Ang Li, et al.

Real-world image recognition systems need to recognize tens of thousands of classes that constitute a plethora of visual concepts. The traditional approach of annotating thousands of images per class for training is infeasible in such a scenario, prompting the use of webly supervised data. This paper explores the training of image-recognition systems on large numbers of images and associated user comments. In particular, we develop visual n-gram models that can predict arbitrary phrases that are relevant to the content of an image. Our visual n-gram models are feed-forward convolutional networks trained using new loss functions that are inspired by n-gram models commonly used in language modeling. We demonstrate the merits of our models in phrase prediction, phrase-based image retrieval, relating images and captions, and zero-shot transfer.
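
To make the idea of "feed-forward convolutional networks trained using new loss functions inspired by n-gram models" more concrete, here is a minimal PyTorch sketch. It is not the authors' implementation: it assumes a small stand-in convolutional backbone, a fixed n-gram dictionary, and a plain softmax negative log-likelihood over the n-grams observed in an image's comment, rather than the paper's exact smoothed n-gram loss. All names, shapes, and hyperparameters are illustrative.

```python
# Hedged sketch of a visual n-gram scorer and loss (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualNGramModel(nn.Module):
    def __init__(self, ngram_vocab_size: int, embed_dim: int = 256):
        super().__init__()
        # Stand-in for the feed-forward convolutional backbone.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim), nn.ReLU(),
        )
        # One score per n-gram in the (assumed fixed) dictionary.
        self.ngram_scores = nn.Linear(embed_dim, ngram_vocab_size)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # Returns a (batch, vocab_size) matrix of n-gram scores.
        return self.ngram_scores(self.backbone(images))

def visual_ngram_loss(logits: torch.Tensor, observed_ngrams: list) -> torch.Tensor:
    """Negative log-likelihood of the n-grams observed in each image's comment.

    observed_ngrams[i] is a LongTensor of dictionary indices for image i.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    per_image = [-log_probs[i, ids].mean() for i, ids in enumerate(observed_ngrams)]
    return torch.stack(per_image).mean()

if __name__ == "__main__":
    model = VisualNGramModel(ngram_vocab_size=1000)
    images = torch.randn(2, 3, 64, 64)                           # toy image batch
    observed = [torch.tensor([3, 17, 250]), torch.tensor([42])]  # n-gram ids per comment
    loss = visual_ngram_loss(model(images), observed)
    loss.backward()
    print(float(loss))
```

Because the model assigns a score to every n-gram in the dictionary, the same outputs can be used at test time to rank arbitrary phrases against an image, which is what supports the phrase prediction, phrase-based retrieval, and zero-shot transfer applications described in the abstract.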

Related research

01/12/2015  Combining Language and Vision with a Multimodal Skip-gram Model
We extend the SKIP-GRAM model of Mikolov et al. (2013a) by taking visual...

05/18/2018  Self-Training Ensemble Networks for Zero-Shot Image Recognition
Despite the advancement of supervised image recognition algorithms, thei...

12/27/2021  A Fistful of Words: Learning Transferable Visual Models from Bag-of-Words Supervision
Using natural language as a supervision for training visual recognition ...

11/28/2021  Gram Barcodes for Histopathology Tissue Texture Retrieval
Recent advances in digital pathology have led to the need for Histopatho...

10/18/2022  Perceptual Grouping in Vision-Language Models
Recent advances in zero-shot image recognition suggest that vision-langu...

05/22/2017  Learning to Associate Words and Images Using a Large-scale Graph
We develop an approach for unsupervised learning of associations between...
