DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

08/11/2023
by Weijia Wu, et al.

Current deep networks are very data-hungry and benefit from training on large-scale datasets, which are often time-consuming to collect and annotate. By contrast, synthetic data can be generated infinitely using generative models such as DALL-E and diffusion models, with minimal effort and cost. In this paper, we present DatasetDM, a generic dataset generation model that can produce diverse synthetic images and the corresponding high-quality perception annotations (e.g., segmentation masks and depth). Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation. We show that the rich latent code of the diffusion model can be effectively decoded into accurate perception annotations by a decoder module. Training the decoder requires less than 1% of manually labeled images, enabling the generation of an infinitely large annotated dataset. These synthetic data can then be used to train various perception models for downstream tasks. To showcase the power of the proposed approach, we generate datasets with rich dense pixel-wise labels for a wide range of downstream tasks, including semantic segmentation, instance segmentation, and depth estimation. Notably, it achieves 1) state-of-the-art results on semantic segmentation and instance segmentation; 2) significantly greater robustness in domain generalization than using real data alone, together with state-of-the-art results in the zero-shot segmentation setting; and 3) flexibility for efficient application and novel task composition (e.g., image editing). The project website and code can be found at https://weijiawu.github.io/DatasetDM_page/ and https://github.com/showlab/DatasetDM, respectively.
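The pipeline described above can be illustrated with a short, hedged sketch. The names below (e.g., `synthesize_with_features`, `PerceptionDecoder`, and the feature channel sizes) are illustrative assumptions, not the authors' code: a frozen text-to-image diffusion model synthesizes an image from a prompt, its intermediate multi-scale U-Net features are collected, and a small trainable decoder fuses those features into dense per-pixel labels such as segmentation masks.

```python
# Minimal sketch of the DatasetDM idea (names and shapes are assumptions):
# a frozen diffusion backbone provides multi-scale features for a generated
# image, and a lightweight decoder turns them into perception annotations.

import torch
import torch.nn as nn
import torch.nn.functional as F


class PerceptionDecoder(nn.Module):
    """Fuses multi-scale diffusion features into per-pixel class logits.
    Channel widths and depth are illustrative, not the paper's exact design."""

    def __init__(self, feat_channels=(1280, 640, 320), num_classes=21):
        super().__init__()
        # Project each feature scale to a shared width.
        self.proj = nn.ModuleList(
            nn.Conv2d(c, 256, kernel_size=1) for c in feat_channels
        )
        self.head = nn.Sequential(
            nn.Conv2d(256 * len(feat_channels), 256, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, num_classes, 1),
        )

    def forward(self, feats, out_size):
        # Upsample every projected feature map to a common grid, then predict logits.
        fused = [
            F.interpolate(p(f), size=out_size, mode="bilinear", align_corners=False)
            for p, f in zip(self.proj, feats)
        ]
        return self.head(torch.cat(fused, dim=1))


def synthesize_with_features(prompt, size=512):
    """Hypothetical stand-in for a frozen text-to-image diffusion model that
    returns the synthesized image plus intermediate U-Net features (coarse to fine).
    In practice these features would be hooked out of the pre-trained model."""
    image = torch.rand(1, 3, size, size)                  # stand-in generated image
    feats = [
        torch.rand(1, 1280, size // 16, size // 16),      # stand-in U-Net features
        torch.rand(1, 640, size // 8, size // 8),
        torch.rand(1, 320, size // 4, size // 4),
    ]
    return image, feats


decoder = PerceptionDecoder()
image, feats = synthesize_with_features("a photo of a dog on a sofa")
logits = decoder(feats, out_size=image.shape[-2:])        # (1, num_classes, H, W)
mask = logits.argmax(dim=1)                               # synthetic segmentation label
```

Because only the decoder is trained while the diffusion backbone stays frozen, a small set of labeled images suffices to supervise it; afterwards, text prompts alone drive the generation of arbitrarily many image-annotation pairs.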


