Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation

05/25/2023
by Lisa Dunlap, et al.

Many fine-grained classification tasks, like rare animal identification, have limited training data, and consequently classifiers trained on these datasets often fail to generalize to variations in the domain, such as changes in weather or location. We therefore explore how natural language descriptions of the domains seen in training data can be used with large vision models trained on diverse pretraining datasets to generate useful variations of the training data. We introduce ALIA (Automated Language-guided Image Augmentation), a method that uses large vision and language models to automatically generate natural language descriptions of a dataset's domains and to augment the training data via language-guided image editing. To maintain data integrity, a model trained on the original dataset filters out minimal image edits and edits that corrupt class-relevant information. The resulting dataset is visually consistent with the original training data and offers significantly enhanced diversity. On fine-grained and cluttered datasets for classification and detection, ALIA surpasses traditional data augmentation and text-to-image generated data by up to 15%, often even outperforming equivalent additions of real data. Code is available at https://github.com/lisadunlap/ALIA.
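The filtering step described in the abstract can be pictured with a small sketch. The abstract does not say how the filter decides, so the rule below is an assumption made for illustration: an edit the original-data classifier still labels correctly with high confidence is treated as a minimal edit, and an edit it confidently assigns to a different class is treated as corrupted. The names `EditedImage`, `judge_edit`, and `filter_edits`, and the 0.9 threshold, are hypothetical and are not taken from the ALIA code.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class EditedImage:
    name: str
    label: int              # class index inherited from the source image
    probs: Sequence[float]  # softmax output of the original-data classifier on the edit


def judge_edit(edit: EditedImage, threshold: float = 0.9) -> str:
    """Return 'minimal', 'corrupted', or 'keep' (assumed rule, not the paper's)."""
    predicted = max(range(len(edit.probs)), key=lambda i: edit.probs[i])
    confidence = edit.probs[predicted]
    if confidence >= threshold and predicted == edit.label:
        # Still confidently recognised: assume the edit changed too little.
        return "minimal"
    if confidence >= threshold and predicted != edit.label:
        # Confidently a different class: assume class-relevant content was lost.
        return "corrupted"
    return "keep"


def filter_edits(edits: List[EditedImage], threshold: float = 0.9) -> List[EditedImage]:
    """Keep only the edits that are judged neither minimal nor corrupted."""
    return [e for e in edits if judge_edit(e, threshold) == "keep"]


# Toy example with made-up probabilities for a 3-class problem.
candidates = [
    EditedImage("bird_snow.png", label=0, probs=[0.97, 0.02, 0.01]),
    EditedImage("bird_blur.png", label=0, probs=[0.03, 0.95, 0.02]),
    EditedImage("bird_lake.png", label=0, probs=[0.60, 0.30, 0.10]),
]
kept = filter_edits(candidates)
```

In this toy run only `bird_lake.png` survives: the first edit is dropped as minimal and the second as corrupted. A single fixed threshold is the simplest choice; a per-class threshold would be a natural variation.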
