Minimax Rate of Distribution Estimation on Unknown Submanifold under Adversarial Losses
Statistical inference from high-dimensional data with low-dimensional structure has recently attracted considerable attention. In machine learning, deep generative modeling approaches implicitly estimate distributions of complex objects by generating new samples from the underlying distribution, and they have achieved great success in synthesizing realistic-looking images and text. A key step in these approaches is the extraction of latent features or representations (encoding) from which the original data can be accurately reconstructed (decoding). In other words, a low-dimensional manifold structure is implicitly assumed and exploited in distribution modeling and estimation. To understand the benefit of low-dimensional manifold structure in generative modeling, we build a general minimax framework for distribution estimation on an unknown submanifold under adversarial losses, with suitable smoothness assumptions on the target distribution and the manifold. The established minimax rate shows how the problem characteristics, including the intrinsic dimensionality of the data and the smoothness levels of the target distribution and the manifold, determine the fundamental limit of high-dimensional distribution estimation. To prove the minimax upper bound, we construct an estimator based on a mixture of locally fitted generative models. This construction is motivated by the partition of unity technique from differential geometry and is necessary to cover cases where the underlying data manifold does not admit a global parametrization. We also propose a data-driven adaptive estimator that simultaneously attains the optimal rate, up to a logarithmic factor, over a large collection of distribution classes.
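For concreteness, an adversarial loss is commonly formalized as an integral probability metric, and the minimax risk is the worst-case expected loss of the best possible estimator. The sketch below uses notation assumed here for illustration and not taken from the abstract: $\mathcal{F}$ is a discriminator class, $\mathcal{P}$ is a class of target distributions supported on submanifolds, and $\hat{\mu}_n$ is an estimator built from $n$ samples.

```latex
% Adversarial loss (integral probability metric) over an assumed discriminator class F
d_{\mathcal{F}}(\mu, \nu)
  = \sup_{f \in \mathcal{F}}
    \left| \mathbb{E}_{X \sim \mu} f(X) - \mathbb{E}_{Y \sim \nu} f(Y) \right|

% Minimax risk over an assumed distribution class P, given n i.i.d. samples
R_n(\mathcal{P}, \mathcal{F})
  = \inf_{\hat{\mu}_n} \sup_{\mu \in \mathcal{P}}
    \mathbb{E}\left[ d_{\mathcal{F}}(\hat{\mu}_n, \mu) \right]
```

A minimax rate in this sense describes how fast $R_n$ decays as $n$ grows; the upper bound comes from exhibiting one estimator, and the lower bound holds for every estimator.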