Semantic-Guided Image Augmentation with Pre-trained Models

02/04/2023
by Bohan Li, et al.

Image augmentation is a common mechanism for alleviating data scarcity in computer vision. Existing image augmentation methods often apply pre-defined transformations or mixup to the original image, but these vary the image only locally. As a result, they struggle to balance preserving semantic information against improving the diversity of the augmented images. In this paper, we propose a Semantic-guided Image augmentation method with Pre-trained models (SIP). Specifically, SIP constructs prompts from image labels and captions to better guide the image-to-image generation process of the pre-trained Stable Diffusion model. The semantic information contained in the original images is well preserved, while the augmented images remain diverse. Experimental results show that SIP can improve two commonly used backbones, i.e., ResNet-50 and ViT, by 12.60 [...] datasets, respectively. Moreover, SIP not only outperforms the best image augmentation baseline, RandAugment, by 4.46 [...] but also further improves performance when integrated with that baseline. A detailed analysis of SIP is presented, including the diversity of augmented images, an ablation study on textual prompts, and a case study on the generated images.
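To make the prompt-construction step concrete, here is a minimal sketch of how a label and a caption might be merged into a single text prompt. The abstract does not give the actual template, so the `Sample` type, the `build_prompt` and `build_prompts` helpers, and the "a photo of a ..." wording are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Sample:
    """Hypothetical record: one labelled image plus a caption describing it."""
    image_path: str
    label: str
    caption: str


def build_prompt(label: str, caption: str) -> str:
    """Merge a class label and a caption into one text prompt.

    The template below is an assumed example; the source does not
    specify the exact wording that SIP uses.
    """
    # Normalise dataset-style labels such as "golden_retriever".
    label = label.strip().replace("_", " ")
    # Drop surrounding whitespace and a trailing period from the caption.
    caption = caption.strip().rstrip(".")
    if not caption:
        # Fall back to a label-only prompt when no caption is available.
        return f"a photo of a {label}"
    return f"a photo of a {label}, {caption}"


def build_prompts(samples: Iterable[Sample]) -> List[str]:
    """Build one prompt per sample, preserving input order."""
    return [build_prompt(s.label, s.caption) for s in samples]
```

In a pipeline of the kind the abstract describes, each prompt would presumably be passed together with the original image to an image-to-image Stable Diffusion model, so that the label anchors the class semantics while the caption carries image-specific detail.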


