LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

by Fisher Yu, et al.

While there has been remarkable progress in the performance of visual recognition algorithms, the state-of-the-art models tend to be exceptionally data-hungry. Large labeled training datasets, expensive and tedious to produce, are required to optimize millions of parameters in deep network models. Lagging behind the growth in model capacity, the available datasets are quickly becoming outdated in terms of size and density. To circumvent this bottleneck, we propose to amplify human effort through a partially automated labeling scheme, leveraging deep learning with humans in the loop. Starting from a large set of candidate images for each category, we iteratively sample a subset, ask people to label them, classify the others with a trained model, split the set into positives, negatives, and unlabeled based on the classification confidence, and then iterate with the unlabeled set. To assess the effectiveness of this cascading procedure and enable further progress in visual recognition research, we construct a new image dataset, LSUN. It contains around one million labeled images for each of 10 scene categories and 20 object categories. We experiment with training popular convolutional networks and find that they achieve substantial performance gains when trained on this dataset.
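The cascading procedure described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function names `cascade_label` and `fit_centroid_scorer`, the fixed confidence thresholds `hi` and `lo`, the 1-D toy features, and the nearest-centroid scorer that stands in for a trained deep network. The `ask_human` callback is a hypothetical placeholder for the human labeling step.

```python
import random
from typing import Callable, List, Sequence, Tuple


def fit_centroid_scorer(xs: Sequence[float], ys: Sequence[bool]) -> Callable[[float], float]:
    """Toy stand-in for a trained classifier (assumption, not the paper's model).

    Returns a function mapping x to a score in [0, 1]; values near 1 mean
    "confidently positive", values near 0 mean "confidently negative".
    """
    pos = [x for x, y in zip(xs, ys) if y]
    neg = [x for x, y in zip(xs, ys) if not y]
    if not pos or not neg:
        # Only one class was seen: stay uncertain so nothing is auto-labeled.
        return lambda x: 0.5
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)

    def score(x: float) -> float:
        d_pos, d_neg = abs(x - mu_pos), abs(x - mu_neg)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total

    return score


def cascade_label(
    candidates: Sequence[float],
    ask_human: Callable[[float], bool],
    sample_size: int = 20,
    hi: float = 0.9,   # hypothetical threshold for auto-positive
    lo: float = 0.1,   # hypothetical threshold for auto-negative
    max_rounds: int = 10,
    seed: int = 0,
) -> Tuple[List[float], List[float], List[float]]:
    """Iteratively split candidates into positives, negatives, and unlabeled."""
    rng = random.Random(seed)
    positives: List[float] = []
    negatives: List[float] = []
    unlabeled = list(candidates)

    for _ in range(max_rounds):
        if not unlabeled:
            break
        # 1. Sample a subset and ask people to label it.
        rng.shuffle(unlabeled)
        sample, rest = unlabeled[:sample_size], unlabeled[sample_size:]
        labels = [ask_human(x) for x in sample]
        for x, y in zip(sample, labels):
            (positives if y else negatives).append(x)

        # 2. Train a model on the human labels and score the remaining items.
        score = fit_centroid_scorer(sample, labels)

        # 3. Split by confidence; ambiguous items go to the next round.
        unlabeled = []
        for x in rest:
            s = score(x)
            if s >= hi:
                positives.append(x)
            elif s <= lo:
                negatives.append(x)
            else:
                unlabeled.append(x)

    return positives, negatives, unlabeled


# Usage: the lambda simulates a human annotator for a toy 1-D "category".
candidates = [i / 199 for i in range(200)]
pos, neg, rest = cascade_label(candidates, lambda x: x > 0.5)
```

In this sketch each round costs only `sample_size` human labels, while the scorer labels the items it is confident about. Fixed thresholds do not guarantee correct automatic labels, so a real pipeline would need to choose them against a quality target.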




VISER: Visual Self-Regularization

In this work, we propose the use of a large set of unlabeled images as a s...

Comparison Knowledge Translation for Generalizable Image Classification

Deep learning has recently achieved remarkable performance in image clas...

Fine-grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop

Existing fine-grained visual categorization methods often suffer from th...

Training with the Invisibles: Obfuscating Images to Share Safely for Learning Visual Recognition Models

High-performance visual recognition systems generally require a large co...

VASR: Visual Analogies of Situation Recognition

A core process in human cognition is analogical mapping: the ability to ...

Background Splitting: Finding Rare Classes in a Sea of Background

We focus on the real-world problem of training accurate deep models for ...

Deep Cervix Model Development from Heterogeneous and Partially Labeled Image Datasets

Cervical cancer is the fourth most common cancer in women worldwide. The...
