ZEST: Zero-shot Learning from Text Descriptions using Textual Similarity and Visual Summarization

10/07/2020
by   Tzuf Paz-Argaman, et al.

We study the problem of recognizing visual entities from the textual descriptions of their classes. Specifically, given images of birds with free-text descriptions of their species, we learn to classify images of previously unseen species based on the species descriptions. This setup has been studied in the vision community under the name zero-shot learning from text, with a focus on learning to transfer knowledge about visual aspects of birds from seen classes to previously unseen ones. Here, we suggest focusing on the textual description and distilling from it the most relevant information, in order to effectively match visual features to the parts of the text that discuss them. Specifically, (1) we propose to leverage the similarity between species, as reflected in the similarity between the species' text descriptions, and (2) we derive visual summaries of the texts, i.e., extractive summaries that focus on the visual features that tend to be reflected in images. We propose a simple attention-based model augmented with these similarity and visual-summary components. Our empirical results consistently and significantly outperform the state of the art on the largest benchmarks for text-based zero-shot learning, illustrating the critical importance of texts for zero-shot image recognition.
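The two ingredients named in the abstract, textual similarity between class descriptions and extractive visual summaries, can be illustrated with a minimal sketch. The species names, descriptions, stopword list, and visual-term vocabulary below are all hypothetical placeholders, and the bag-of-words cosine and keyword-based sentence filter are simple stand-ins, not the paper's actual attention-based model:

```python
import math
import re
from collections import Counter

STOPWORDS = {"and", "with", "the", "that", "this"}

def bow(text):
    """Bag-of-words vector over content words (lowercased, stopwords dropped)."""
    tokens = [w for w in re.findall(r"[a-z]+", text.lower())
              if len(w) > 2 and w not in STOPWORDS]
    return Counter(tokens)

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical descriptions of seen classes and one unseen class.
seen = {
    "Cardinal": "A red bird with a pointed crest and a short thick beak.",
    "Blue Jay": "A blue bird with a white chest and a loud call.",
}
unseen_desc = "A crimson bird with a crested head and a stout beak."

# Ingredient (1): rank seen classes by textual similarity to the unseen description.
ranked = sorted(seen, key=lambda s: cosine(bow(seen[s]), bow(unseen_desc)),
                reverse=True)

# Ingredient (2): a crude extractive "visual summary" that keeps only
# sentences mentioning visual attributes (toy vocabulary).
VISUAL_TERMS = {"red", "blue", "crimson", "crest", "crested", "beak",
                "chest", "wing", "head", "stout"}

def visual_summary(text):
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return " ".join(s for s in sentences
                    if VISUAL_TERMS & set(re.findall(r"[a-z]+", s.lower())))
```

With these toy descriptions, the unseen "crimson, crested, stout beak" class ranks closest to the Cardinal, and the summary retains the sentence carrying the visual attributes; the real model replaces both heuristics with learned attention over the text.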

Related research

12/04/2017
Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts
Most existing zero-shot learning methods consider the problem as a visua...

10/27/2022
Text2Model: Model Induction for Zero-shot Generalization Using Task Descriptions
We study the problem of generating a training-free task-dependent visual...

09/04/2017
Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision
In this paper, we study learning visual classifiers from unstructured te...

06/03/2022
Zero-Shot Bird Species Recognition by Learning from Field Guides
We exploit field guides to learn bird species recognition, in particular...

04/05/2016
Less is more: zero-shot learning from online textual documents with noise suppression
Classifying a visual concept merely from its associated online textual s...

06/29/2015
Tell and Predict: Kernel Classifier Prediction for Unseen Visual Classes from Unstructured Text Descriptions
In this paper we propose a framework for predicting kernelized classifie...

12/31/2015
Write a Classifier: Predicting Visual Classifiers from Unstructured Text
People typically learn through exposure to visual concepts associated wi...
