A supervised generative optimization approach for tabular data

09/10/2023
by Fadi Hamad, et al.

Synthetic data generation has emerged as a crucial topic for financial institutions, driven by multiple factors such as privacy protection and data augmentation. Many algorithms have been proposed for synthetic data generation, but reaching consensus on which method to use for a specific data set and use case remains challenging. Moreover, the majority of existing approaches are “unsupervised” in the sense that they do not take the downstream task into account. To address these issues, this work presents a novel synthetic data generation framework. The framework integrates a supervised component tailored to the specific downstream task and employs a meta-learning approach to learn the optimal mixture distribution of existing synthetic distributions.
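As a rough illustration of what "learning a mixture of existing synthetic distributions under downstream supervision" could look like, the following minimal Python sketch searches for mixture weights over several already-generated synthetic data sets so that a downstream model trained on the mixture performs well on real validation data. It is not the authors' implementation: the abstract does not specify the optimizer or the downstream model, so plain random search over Dirichlet-sampled weights and a least-squares regressor are stand-ins, and all function names (`sample_mixture`, `fit_linear`, `mse`, `search_mixture_weights`) are hypothetical.

```python
import numpy as np


def sample_mixture(synthetic_sets, weights, n_samples, rng):
    """Draw n_samples rows from a weighted mixture of synthetic data sets.

    synthetic_sets is a list of (X, y) pairs, one per (hypothetical) generator.
    """
    counts = rng.multinomial(n_samples, weights)
    X_parts, y_parts = [], []
    for (X, y), count in zip(synthetic_sets, counts):
        idx = rng.integers(0, len(X), size=count)  # sample rows with replacement
        X_parts.append(X[idx])
        y_parts.append(y[idx])
    return np.vstack(X_parts), np.concatenate(y_parts)


def fit_linear(X, y):
    """Stand-in downstream model: ordinary least squares with an intercept."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef


def mse(coef, X, y):
    """Mean squared error of the linear model on (X, y)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(np.mean((Xb @ coef - y) ** 2))


def search_mixture_weights(synthetic_sets, X_val, y_val,
                           n_trials=200, n_samples=500, seed=0):
    """Random search over the simplex for weights minimizing validation loss.

    Assumption: random search replaces whatever optimizer the paper uses.
    """
    rng = np.random.default_rng(seed)
    k = len(synthetic_sets)
    best_w, best_loss = None, np.inf
    for _ in range(n_trials):
        w = rng.dirichlet(np.ones(k))  # candidate mixture weights, sum to 1
        X_mix, y_mix = sample_mixture(synthetic_sets, w, n_samples, rng)
        loss = mse(fit_linear(X_mix, y_mix), X_val, y_val)  # supervised signal
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss
```

Calling `search_mixture_weights(synthetic_sets, X_val, y_val)` with one `(X, y)` pair per generator returns the best weight vector found and its validation loss. The key design point the sketch tries to capture is that the weights are scored by downstream performance on real held-out data rather than by a distributional similarity measure.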


research
06/27/2023

On the Usefulness of Synthetic Tabular Data Generation

Despite recent advances in synthetic data generation, the scientific com...
research
05/24/2023

Post-processing Private Synthetic Data for Improving Utility on Selected Measures

Existing private synthetic data generation algorithms are agnostic to do...
research
01/03/2021

Copula Flows for Synthetic Data Generation

The ability to generate high-fidelity synthetic data is crucial when ava...
research
05/26/2023

On Consistent Bayesian Inference from Synthetic Data

Generating synthetic data, with or without differential privacy, has att...
research
05/16/2023

Synthetic data, real errors: how (not) to publish and use synthetic data

Generating synthetic data through generative models is gaining interest ...
research
06/10/2023

HIPODE: Enhancing Offline Reinforcement Learning with High-Quality Synthetic Data from a Policy-Decoupled Approach

Offline reinforcement learning (ORL) has gained attention as a means of ...
research
07/16/2023

MargCTGAN: A "Marginally" Better CTGAN for the Low Sample Regime

The potential of realistic and useful synthetic data is significant. How...
