Improving the quality of generative models through Smirnov transformation

10/29/2021
by Ángel González-Prieto, et al.

Solving the convergence issues of Generative Adversarial Networks (GANs) is one of the most prominent open problems in generative modeling. In this work, we propose a novel activation function to be used as the output layer of the generator. This activation function is based on the Smirnov probabilistic transformation and is specifically designed to improve the quality of the generated data. In contrast with previous works, our activation function provides a more general approach that handles not only the replication of categorical variables but any type of data distribution (continuous or discrete). Moreover, the activation function is differentiable, so it can be seamlessly integrated into the backpropagation computations during GAN training. To validate this approach, we evaluate our proposal on two data sets: a) an artificially generated data set containing a mixture of discrete and continuous variables, and b) a real data set of flow-based network traffic data containing both normal connections and cryptomining attacks. To evaluate the fidelity of the generated data, we analyze the results both in terms of statistical quality measures and in terms of the performance of a nested machine learning-based classifier trained on the synthetic data. The experimental results show that the GAN tuned with this new activation function clearly outperforms both a naïve mean-based generator and a standard GAN. The quality of the data is high enough that the generated data can fully replace real data for training the nested classifier without a drop in accuracy. This result encourages the use of GANs to produce high-quality synthetic data for scenarios in which data privacy must be guaranteed.
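For readers unfamiliar with the Smirnov transformation (also known as inverse transform sampling), the following is a minimal NumPy sketch of the underlying idea, not the authors' activation function: a value u in [0, 1] is pushed through an estimate of the inverse cumulative distribution function of the target data, so that uniform inputs come out distributed like the data. The function names, the number of knots, and the exponential toy data are assumptions made purely for illustration. The piecewise-linear interpolation used here is differentiable almost everywhere, which hints at why such a mapping can sit at the output of a generator.

```python
import numpy as np


def fit_quantile_function(samples, n_knots=101):
    """Estimate the inverse CDF of `samples` on an evenly spaced probability grid.

    Hypothetical helper: returns the grid of probabilities and the matching
    empirical quantiles, which together define a piecewise-linear inverse CDF.
    """
    probs = np.linspace(0.0, 1.0, n_knots)
    quantiles = np.quantile(samples, probs)
    return probs, quantiles


def smirnov_transform(u, probs, quantiles):
    """Map values u in [0, 1] through the piecewise-linear inverse CDF."""
    u = np.clip(u, 0.0, 1.0)
    return np.interp(u, probs, quantiles)


# Toy "real" data: a skewed continuous variable (an assumption for this sketch).
rng = np.random.default_rng(0)
real = rng.exponential(scale=2.0, size=10_000)

# Fit the inverse CDF once, then transform uniform values into synthetic data.
probs, quantiles = fit_quantile_function(real)
u = rng.uniform(size=10_000)
synthetic = smirnov_transform(u, probs, quantiles)
```

In a GAN setting, u would presumably come from the generator's last layer squashed into [0, 1] rather than from a random number generator; that wiring is an assumption here and is not taken from the abstract.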


