Improving Few-Shot Image Classification Using Machine- and User-Generated Natural Language Descriptions

07/07/2022
by   Kosuke Nishida, et al.
Humans can acquire knowledge of novel visual concepts from language descriptions, so we use the few-shot image classification task to investigate whether a machine learning model can have this capability. Our proposed model, LIDE (Learning from Image and DEscription), has a text decoder that generates descriptions and a text encoder that produces text representations of machine- or user-generated descriptions. We confirmed that LIDE with machine-generated descriptions outperformed baseline models, and that performance improved further with high-quality user-generated descriptions. The generated descriptions can be viewed as explanations of the model's predictions, and we observed that these explanations were consistent with the prediction results. We also investigated why language descriptions improve few-shot image classification performance by comparing the image representations and the text representations in the feature spaces.
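
To make the idea of classifying from both an image and a description concrete, here is a minimal sketch in Python. It is not LIDE's architecture, which the abstract does not specify in detail. It assumes that image and text representations are already available as plain vectors, that they are fused by concatenation, and that classification is done by nearest class prototype. The function names (`fuse`, `build_prototypes`, `classify`) and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(image_vec, text_vec):
    """Assumed fusion rule: concatenate normalized image and text vectors."""
    return l2_normalize(image_vec) + l2_normalize(text_vec)


def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_prototypes(support):
    """Average the fused support examples of each class into one prototype.

    `support` maps a class label to a list of (image_vec, text_vec) pairs.
    """
    return {
        label: mean([fuse(img, txt) for img, txt in pairs])
        for label, pairs in support.items()
    }


def classify(image_vec, text_vec, prototypes):
    """Return the label whose prototype is closest to the fused query."""
    query = fuse(image_vec, text_vec)

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Toy 2-way 2-shot episode with made-up 2-dimensional representations.
support = {
    "sparrow": [([1.0, 0.0], [0.9, 0.1]), ([0.8, 0.2], [1.0, 0.0])],
    "crow": [([0.0, 1.0], [0.1, 0.9]), ([0.2, 0.8], [0.0, 1.0])],
}
prototypes = build_prototypes(support)
prediction = classify([0.9, 0.1], [0.8, 0.2], prototypes)
```

In a real system the image vectors would come from an image encoder and the text vectors from a text encoder applied to a machine- or user-generated description. Concatenation and nearest-prototype matching are only one simple way to combine the two.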

research
12/05/2022

I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification

Recent works have shown that unstructured text (documents) from online s...
research
11/06/2019

Shaping Visual Representations with Language for Few-shot Classification

Language is designed to convey useful information about the world, thus ...
research
01/28/2022

Summarizing Differences between Text Distributions with Natural Language

How do two distributions of texts differ? Humans are slow at answering t...
research
09/27/2021

Introducing the viewpoint in the resource description using machine learning

Search engines allow providing the user with data information according ...
research
03/07/2023

Describe me an Aucklet: Generating Grounded Perceptual Category Descriptions

Human language users can generate descriptions of perceptual concepts be...
research
11/01/2017

Learning with Latent Language

The named concepts and compositional operators present in natural langua...
research
04/13/2017

Room for improvement in automatic image description: an error analysis

In recent years we have seen rapid and significant progress in automatic...
