Towards Automatic Construction of Diverse, High-quality Image Dataset

08/22/2017
by Yazhou Yao, et al.

The availability of labeled image datasets has been shown to be critical for high-level image understanding, and it continuously drives progress in feature design and model development. However, constructing labeled image datasets is laborious and monotonous. To eliminate manual annotation, we propose a novel image dataset construction framework that employs multiple textual metadata. We aim to collect diverse and accurate images for given queries from the Web. Specifically, we formulate the removal of noisy textual metadata and the filtering of noisy images as a multi-view learning problem and a multi-instance learning problem, respectively. Our proposed approach not only improves the accuracy but also enhances the diversity of the selected images. To verify its effectiveness, we construct an image dataset with 100 categories. Experiments show significant performance gains from using the data generated by our approach on several tasks, such as image classification, cross-dataset generalization, and object detection. The proposed method also consistently outperforms existing weakly supervised and web-supervised approaches.
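
The abstract does not spell out its multi-instance formulation, so the following is only a minimal sketch of the general multi-instance idea, not the authors' method. It assumes each "bag" is the set of images retrieved for one textual query, that some upstream classifier (not shown) has already assigned each image a relevance score in [0, 1], and that a bag counts as positive when its best-scoring image passes a threshold. The function name `filter_bags`, the example queries, and the threshold of 0.5 are all hypothetical.

```python
from typing import Dict, List


def filter_bags(bags: Dict[str, List[float]], threshold: float = 0.5) -> Dict[str, List[int]]:
    """Keep positive bags and, within them, the indices of instances that pass.

    `bags` maps a (hypothetical) textual query to the relevance scores of the
    images retrieved for it. The scores are assumed to come from a separate
    classifier that is not part of this sketch.
    """
    kept: Dict[str, List[int]] = {}
    for query, scores in bags.items():
        # Standard multi-instance assumption: a bag is positive if at least
        # one of its instances is positive, i.e. its maximum score passes.
        if scores and max(scores) >= threshold:
            # Inside a positive bag, drop the individual low-scoring images.
            kept[query] = [i for i, s in enumerate(scores) if s >= threshold]
    return kept


# Illustrative scores only; they are not taken from the paper.
example_bags = {
    "query a": [0.9, 0.2, 0.7],
    "query b": [0.1, 0.3],
}
selected = filter_bags(example_bags)
```

In this toy setup the whole bag for "query b" is discarded because none of its images passes the threshold, while "query a" keeps only its two high-scoring images. A real system would learn the instance scorer jointly with the bag labels rather than take the scores as given.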


