A data augmentation perspective on diffusion models and retrieval

04/20/2023
by Max F. Burg, et al.

Diffusion models excel at generating photorealistic images from text queries. Naturally, many approaches have been proposed to use these generative abilities to augment training datasets for downstream tasks, such as classification. However, diffusion models are themselves trained on large, noisily supervised, but nonetheless annotated datasets. It is an open question whether the generalization capabilities of diffusion models improve downstream performance beyond what is gained by using the additional pre-training data directly for augmentation. We perform a systematic evaluation of existing methods to generate images from diffusion models and study new extensions to assess their benefit for data augmentation. While we find that personalizing diffusion models towards the target data outperforms simpler prompting strategies, we also show that using the training data of the diffusion model alone, via a simple nearest neighbor retrieval procedure, leads to even stronger downstream performance. Overall, our study probes the limitations of diffusion models for data augmentation but also highlights their potential in generating new training data to improve performance on simple downstream vision tasks.
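To make the retrieval baseline mentioned above more concrete, here is a minimal sketch of nearest neighbor retrieval for augmentation. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not specify the embedding model, the similarity measure, or the number of neighbors. The sketch assumes that target images and images from the diffusion model's pre-training pool have already been embedded as vectors by some encoder, and it uses cosine similarity. The function names `cosine_similarity`, `retrieve_nearest`, and `build_augmentation_set` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_nearest(
    query: Sequence[float],
    pool: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (pool index, similarity) for the k pool vectors most similar to query."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(pool)]
    # Sort by similarity, highest first; Python's sort is stable, so ties keep pool order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_augmentation_set(
    target_embeddings: Sequence[Sequence[float]],
    pool_embeddings: Sequence[Sequence[float]],
    k: int,
) -> List[int]:
    """Collect the union of pool indices retrieved for every target embedding.

    Assumption: the returned indices point at pool images that would be added
    to the downstream training set alongside the original target images.
    """
    selected = set()
    for query in target_embeddings:
        for index, _ in retrieve_nearest(query, pool_embeddings, k):
            selected.add(index)
    return sorted(selected)
```

This brute-force scan is linear in the pool size for each query, which is fine for illustration. A pool as large as the pre-training data of a diffusion model would in practice call for an approximate nearest neighbor index.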


