DALL-E for Detection: Language-driven Context Image Synthesis for Object Detection

06/20/2022
by Yunhao Ge, et al.

Object cut-and-paste has become a promising approach to efficiently generate large sets of labeled training data. It involves compositing foreground object masks onto background images. The background images, when congruent with the objects, provide helpful context information for training object recognition models. While the approach can easily generate large amounts of labeled data, finding congruent context images for downstream tasks has remained an elusive problem. In this work, we propose a new paradigm for automatic context image generation at scale. At the core of our approach lies the interplay between language descriptions of context and language-driven image generation. A language description of a context is obtained by applying an image captioning method to a small set of images representing that context. These language descriptions are then used to generate diverse sets of context images with the language-based DALL-E image generation framework. The generated context images are then composited with objects to provide an augmented training set for a classifier. We demonstrate the advantages of our approach over prior context image generation approaches on four object detection datasets. Furthermore, we highlight the compositional nature of our data generation approach in out-of-distribution and zero-shot data generation scenarios.
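
To make the cut-and-paste step concrete, here is a minimal sketch of compositing one masked foreground object onto a context image and deriving its bounding-box label. It is an illustration only, not the authors' implementation: the function name `paste_object`, its parameters, and the box convention (x_min, y_min, x_max, y_max, with exclusive maxima) are assumptions. The captioning and DALL-E generation steps are not shown, and the background is taken as an already-generated NumPy array.

```python
import numpy as np


def paste_object(background, obj_rgb, obj_mask, top, left):
    """Composite a masked object onto a copy of a context image.

    Hypothetical helper for illustration. Returns the composited image and
    the bounding box (x_min, y_min, x_max, y_max) of the pasted object's
    mask pixels in background coordinates (maxima are exclusive).
    """
    out = background.copy()  # leave the original context image untouched
    h, w = obj_mask.shape
    if top < 0 or left < 0 or top + h > out.shape[0] or left + w > out.shape[1]:
        raise ValueError("object does not fit inside the background here")

    mask = obj_mask.astype(bool)
    if not mask.any():
        raise ValueError("object mask is empty")

    # Overwrite only the pixels covered by the object mask.
    region = out[top:top + h, left:left + w]  # view into `out`
    region[mask] = obj_rgb[mask]

    # The label comes for free from the mask extent and paste position.
    ys, xs = np.nonzero(mask)
    box = (
        left + int(xs.min()),
        top + int(ys.min()),
        left + int(xs.max()) + 1,
        top + int(ys.max()) + 1,
    )
    return out, box


# Toy data standing in for a generated context image and a segmented object.
context = np.zeros((8, 8, 3), dtype=np.uint8)
obj = np.full((3, 3, 3), 255, dtype=np.uint8)
obj_mask = np.array([[0, 1, 0],
                     [1, 1, 1],
                     [0, 1, 0]], dtype=np.uint8)

image, box = paste_object(context, obj, obj_mask, top=2, left=4)
```

Because the paste position and mask are known, each composite yields its detection label without manual annotation, which is what makes the approach scale. A practical pipeline would likely also randomize scale and position and blend object edges, none of which is attempted here.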

research
09/12/2023

Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

We propose a new paradigm to automatically generate training data with a...
research
06/24/2023

DesCo: Learning Object Recognition with Rich Language Descriptions

Recent development in vision-language approaches has instigated a paradi...
research
07/31/2019

Image Captioning with Unseen Objects

Image caption generation is a long standing and challenging problem at t...
research
07/09/2018

Generating objects going well with the surroundings

Since the generative adversarial network has made a breakthrough in the ...
research
03/29/2019

Training Object Detectors on Synthetic Images Containing Reflecting Materials

One of the grand challenges of deep learning is the requirement to obtai...
research
03/28/2023

Variational Distribution Learning for Unsupervised Text-to-Image Generation

We propose a text-to-image generation algorithm based on deep neural net...
research
05/14/2017

GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data

Object Transfiguration replaces an object in an image with another objec...
