Data generator based on RBF network

03/28/2014
by Marko Robnik-Šikonja, et al.

In many problems the available data are scarce and expensive to obtain. We propose a generator of semi-artificial data with properties similar to those of the original data. It enables the development and testing of different data mining algorithms and the optimization of their parameters. The generated data allow large-scale experimentation and simulations without danger of overfitting. The proposed generator is based on RBF networks, which learn sets of Gaussian kernels. The learned Gaussian kernels can be used in a generative mode to generate data from the same distributions. To assess the quality of the generated data, we developed several workflows and used them to evaluate the statistical properties of the generated data, their structural similarity, and their predictive similarity, using supervised and unsupervised learning techniques. To determine the usability of the proposed generator, we conducted a large-scale evaluation on 51 UCI data sets. The results show considerable similarity between the original and generated data and indicate that the method can be useful in several development and simulation scenarios.
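The idea of using learned Gaussian kernels in a generative mode can be illustrated with a minimal sketch. This is not the authors' implementation. Instead of training an RBF network, it assumes that a few k-means iterations per class are an acceptable stand-in for locating kernel centers, and it assumes diagonal covariances. The names `fit_kernels`, `generate`, and `kernels_per_class` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def fit_kernels(X, y, kernels_per_class=2, n_iter=10, seed=0):
    """Fit per-class Gaussian kernels (mean, per-feature std, member count).

    Assumption: k-means is used here as a simple substitute for RBF
    network training, which is what the abstract actually describes.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    kernels = []
    for label in np.unique(y):
        Xc = X[y == label]
        k = min(kernels_per_class, len(Xc))
        # Start from k distinct instances of this class.
        centers = Xc[rng.choice(len(Xc), size=k, replace=False)]
        for _ in range(n_iter):
            dists = ((Xc[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            assign = dists.argmin(axis=1)
            for j in range(k):
                if np.any(assign == j):
                    centers[j] = Xc[assign == j].mean(axis=0)
        # Final assignment, then summarize each cluster as one kernel.
        dists = ((Xc[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = Xc[assign == j]
            if len(members) == 0:
                continue
            mean = members.mean(axis=0)
            std = members.std(axis=0) + 1e-6  # avoid zero-width kernels
            kernels.append((label, mean, std, len(members)))
    return kernels


def generate(kernels, n_samples, seed=0):
    """Sample new labeled instances from the fitted kernels."""
    rng = np.random.default_rng(seed)
    # Pick kernels in proportion to how many instances they summarized.
    weights = np.array([k[3] for k in kernels], dtype=float)
    weights /= weights.sum()
    picks = rng.choice(len(kernels), size=n_samples, p=weights)
    X_new = np.array([rng.normal(kernels[i][1], kernels[i][2]) for i in picks])
    y_new = np.array([kernels[i][0] for i in picks])
    return X_new, y_new


# Toy usage: two well-separated classes in two dimensions.
demo_rng = np.random.default_rng(42)
X = np.vstack([demo_rng.normal(0, 1, (100, 2)), demo_rng.normal(5, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

kernels = fit_kernels(X, y)
X_new, y_new = generate(kernels, 500)
```

Comparing per-class means and standard deviations of `X_new` against `X` gives a first, rough check of statistical similarity. The evaluation described in the abstract goes further and also covers structural and predictive similarity.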
