Guaranteed Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution

07/31/2023
by Elen Vardanyan, et al.

Generative modeling is a widely used machine learning method with various applications in scientific and industrial fields. Its primary objective is to simulate new examples drawn from an unknown distribution given training data while ensuring diversity and avoiding replication of examples from the training data. This paper presents theoretical insights into training a generative model with two properties: (i) the error of replacing the true data-generating distribution with the trained data-generating distribution should optimally converge to zero as the sample size approaches infinity, and (ii) the trained data-generating distribution should be far enough from any distribution replicating examples in the training data. We provide non-asymptotic results in the form of finite sample risk bounds that quantify these properties and depend on relevant parameters such as sample size, the dimension of the ambient space, and the dimension of the latent space. Our results are applicable to general integral probability metrics used to quantify errors in probability distribution spaces, with the Wasserstein-1 distance being the central example. We also include numerical examples to illustrate our theoretical findings.
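To make the tension between properties (i) and (ii) concrete, the following is a minimal sketch, not the paper's method or experiments. It assumes one-dimensional data and equal sample sizes, in which case the Wasserstein-1 distance between two empirical distributions reduces to the mean absolute difference of the sorted samples. The function name `wasserstein1_1d` and the Gaussian toy data are illustrative assumptions. A generator that merely replays the training set sits at distance zero from the empirical distribution, whereas fresh draws from the same underlying distribution sit at a small but strictly positive distance.

```python
import random


def wasserstein1_1d(xs, ys):
    """Wasserstein-1 distance between two 1-D empirical distributions.

    Assumes both samples have the same number of points, so the optimal
    coupling simply matches the i-th smallest x with the i-th smallest y.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs_sorted, ys_sorted)) / len(xs)


rng = random.Random(0)

# Hypothetical training data: 500 draws from a standard Gaussian.
train = [rng.gauss(0.0, 1.0) for _ in range(500)]

# A "generator" that replicates the training examples (order is irrelevant).
replica = list(train)
rng.shuffle(replica)

# Fresh draws from the same underlying distribution.
fresh = [rng.gauss(0.0, 1.0) for _ in range(500)]

w1_replica = wasserstein1_1d(train, replica)
w1_fresh = wasserstein1_1d(train, fresh)

print("W1(train, replica):", w1_replica)
print("W1(train, fresh):  ", w1_fresh)
```

In this toy setting, a small distance to the empirical distribution alone cannot distinguish a good generator from one that memorizes, which is why property (ii) asks for separation from replicating distributions in addition to convergence to the true distribution.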

research
01/08/2022

Optimal 1-Wasserstein Distance for WGANs

The mathematical forces at work behind Generative Adversarial Networks r...
research
08/27/2020

Analytical and statistical properties of local depth functions motivated by clustering applications

Local depth functions (LDFs) are used for describing the local geometric...
research
10/10/2019

Rate-Distortion Optimization Guided Autoencoder for Generative Approach with quantitatively measurable latent space

In the generative model approach of machine learning, it is essential to...
research
05/24/2019

Likelihood ratio tests for many groups in high dimensions

In this paper we investigate the asymptotic distribution of likelihood r...
research
08/13/2018

A Matching Based Theoretical Framework for Estimating Probability of Causation

The concept of Probability of Causation (PC) is critically important in ...
research
10/19/2020

Statistical guarantees for generative models without domination

In this paper, we introduce a convenient framework for studying (adversa...
research
02/28/2023

Generating Accurate Virtual Examples For Lifelong Machine Learning

Lifelong machine learning (LML) is an area of machine learning research ...
