WebVision Database: Visual Learning and Understanding from Web Data

by Wen Li, et al.

In this paper, we present a study on learning visual recognition models from large-scale noisy web data. We build a new database called WebVision, which contains more than 2.4 million web images crawled from the Internet using queries generated from the 1,000 semantic concepts of the benchmark ILSVRC 2012 dataset. Meta information accompanying those web images (e.g., title, description, tags) is also crawled. A validation set and a test set containing human-annotated images are also provided to facilitate algorithmic development. Based on our new database, we make a few interesting observations: 1) the noisy web images are sufficient for training a good deep CNN model for visual recognition; 2) the model learnt from our WebVision database exhibits comparable or even better generalization ability than the one trained on the ILSVRC 2012 dataset when transferred to new datasets and tasks; 3) a domain adaptation issue (a.k.a. dataset bias) is observed, which means the dataset can be used as the largest benchmark dataset for visual domain adaptation. Our new WebVision database and the relevant studies in this work should help advance the learning of state-of-the-art visual models with minimum supervision based on web data.




WebVision Challenge: Visual Learning and Understanding With Web Data

We present the 2017 WebVision Challenge, a public image recognition chal...

Syn2Real: A New Benchmark for Synthetic-to-Real Visual Domain Adaptation

Unsupervised transfer of object recognition models from synthetic to rea...

Training Convolutional Networks with Web Images

In this thesis we investigate the effect of using web images to build a ...

Learning Deep Visual Object Models From Noisy Web Data: How to Make it Work

Deep networks thrive when trained on large scale data collections. This ...

Exploiting Web Images for Dataset Construction: A Domain Robust Approach

Labelled image datasets have played a critical role in high-level image ...

Ambiguous Images With Human Judgments for Robust Visual Event Classification

Contemporary vision benchmarks predominantly consider tasks on which hum...

Development of Semantic Web-based Imaging Database for Biological Morphome

We introduce the RIKEN Microstructural Imaging Metadatabase, a semantic ...
