Bridging the Gap: Enhancing the Utility of Synthetic Data via Post-Processing Techniques

05/17/2023
by Andrea Lampis, et al.

Acquiring and annotating suitable datasets for training deep learning models is challenging. This often results in tedious and time-consuming efforts that can hinder research progress. However, generative models have emerged as a promising solution for generating synthetic datasets that can replace or augment real-world data. Despite this, the effectiveness of synthetic data is limited by its inability to fully capture the complexity and diversity of real-world data. To address this issue, we explore the use of Generative Adversarial Networks to generate synthetic datasets for training classifiers that are subsequently evaluated on real-world images. To improve the quality and diversity of the synthetic dataset, we propose three novel post-processing techniques: Dynamic Sample Filtering, Dynamic Dataset Recycle, and Expansion Trick. In addition, we introduce a pipeline called Gap Filler (GaFi), which applies these techniques in an optimal and coordinated manner to maximise classification accuracy on real-world data. Our experiments show that GaFi effectively reduces the gap with real-accuracy scores to an error of 2.03 1.78 respectively. These results represent a new state of the art in Classification Accuracy Score and highlight the effectiveness of post-processing techniques in improving the quality of synthetic datasets.
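The evaluation protocol described above (train a classifier on synthetic data, then score it on real data) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not the authors' method: 2-D Gaussian points stand in for images, a sampler with slightly shifted class centers stands in for an imperfect GAN, and a nearest-centroid rule stands in for the classifier. The names `sample_class`, `fit_centroids`, `predict`, and `accuracy` are hypothetical. The sketch does not implement Dynamic Sample Filtering, Dynamic Dataset Recycle, or Expansion Trick.

```python
import random
from statistics import fmean


def sample_class(center, n, spread, rng):
    """Draw n 2-D points around a class center (toy stand-in for images)."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]


def fit_centroids(samples_by_class):
    """Train a nearest-centroid classifier: one mean point per class."""
    return {
        label: (fmean(p[0] for p in pts), fmean(p[1] for p in pts))
        for label, pts in samples_by_class.items()
    }


def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    return min(
        centroids,
        key=lambda label: (point[0] - centroids[label][0]) ** 2
        + (point[1] - centroids[label][1]) ** 2,
    )


def accuracy(centroids, samples_by_class):
    """Fraction of labelled points that the classifier gets right."""
    correct = total = 0
    for label, pts in samples_by_class.items():
        for p in pts:
            correct += predict(centroids, p) == label
            total += 1
    return correct / total


rng = random.Random(0)
real_centers = {0: (0.0, 0.0), 1: (3.0, 3.0)}
# Assumption: shifted centers imitate a generator that misses the real distribution.
synthetic_centers = {0: (0.5, -0.5), 1: (2.5, 3.5)}

real_train = {c: sample_class(m, 200, 1.0, rng) for c, m in real_centers.items()}
real_test = {c: sample_class(m, 200, 1.0, rng) for c, m in real_centers.items()}
synthetic_train = {c: sample_class(m, 200, 1.0, rng)
                   for c, m in synthetic_centers.items()}

# Baseline: train on real data, test on real data.
real_accuracy = accuracy(fit_centroids(real_train), real_test)
# Classification Accuracy Score: train on synthetic data, test on real data.
cas = accuracy(fit_centroids(synthetic_train), real_test)
gap = real_accuracy - cas

print(f"real accuracy: {real_accuracy:.3f}  CAS: {cas:.3f}  gap: {gap:.3f}")
```

In this toy setting, the gap is the quantity a post-processing pipeline would try to shrink: the closer the synthetic training set matches the real distribution, the closer the synthetic-trained score gets to the real-trained baseline.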

research
01/31/2022

Reducing the Amount of Real World Data for Object Detector Training with Synthetic Data

A number of studies have investigated the training of neural networks wi...
research
05/15/2023

DATED: Guidelines for Creating Synthetic Datasets for Engineering Design Applications

Exploiting the recent advancements in artificial intelligence, showcased...
research
04/25/2022

PhysioGAN: Training High Fidelity Generative Model for Physiological Sensor Readings

Generative models such as the variational autoencoder (VAE) and the gene...
research
11/16/2022

GLFF: Global and Local Feature Fusion for Face Forgery Detection

With the rapid development of deep generative models (such as Generative...
research
06/23/2023

Exploring the Potential of AI-Generated Synthetic Datasets: A Case Study on Telematics Data with ChatGPT

This research delves into the construction and utilization of synthetic ...
research
01/24/2019

Learning Neurosymbolic Generative Models via Program Synthesis

Significant strides have been made toward designing better generative mo...
research
05/30/2023

GraphCleaner: Detecting Mislabelled Samples in Popular Graph Learning Benchmarks

Label errors have been found to be prevalent in popular text, vision, an...
