Structure preservation via the Wasserstein distance

09/15/2022
by Daniel Bartl, et al.

We show that under minimal assumptions on a random vector $X\in\mathbb{R}^d$, and with high probability, given $m$ independent copies of $X$, the coordinate distribution of each vector $(\langle X_i,\theta\rangle)_{i=1}^m$ is dictated by the distribution of the true marginal $\langle X,\theta\rangle$. Formally, we show that with high probability, $\sup_{\theta\in S^{d-1}} \left( \frac{1}{m}\sum_{i=1}^m \left| \langle X_i,\theta\rangle^\sharp - \lambda^\theta_i \right|^2 \right)^{1/2} \leq c \left( \frac{d}{m} \right)^{1/4}$, where $\lambda^\theta_i = m\int_{(\frac{i-1}{m}, \frac{i}{m}]} F_{\langle X,\theta\rangle}^{-1}(u)^2\,du$ and $a^\sharp$ denotes the monotone non-decreasing rearrangement of $a$. The proof follows from the optimal estimate on the worst Wasserstein distance between a marginal of $X$ and its empirical counterpart, $\frac{1}{m}\sum_{i=1}^m \delta_{\langle X_i,\theta\rangle}$. We then use the accurate information on the structures of the vectors $(\langle X_i,\theta\rangle)_{i=1}^m$ to construct the first non-gaussian ensemble that yields the optimal estimate in the Dvoretzky-Milman Theorem: the ensemble exhibits almost Euclidean sections in arbitrary normed spaces of the same dimension as the gaussian embedding, despite being very far from gaussian (in fact, it happens to be heavy-tailed).
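As a rough numerical illustration of the quantity the proof relies on, the sketch below approximates the 2-Wasserstein distance between an empirical marginal and the true marginal, using the standard one-dimensional identity $W_2^2(\mu,\nu)=\int_0^1 (F_\mu^{-1}(u)-F_\nu^{-1}(u))^2\,du$. Everything specific here is an assumption made for illustration and is not taken from the paper: the sample is standard gaussian (so every marginal is $N(0,1)$, whereas the paper works under much weaker assumptions), the supremum over $S^{d-1}$ is replaced by a maximum over a few random directions (which only bounds it from below), and the helper names `w2_to_standard_normal`, `random_direction` and `worst_w2_over_directions` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def w2_to_standard_normal(values, points_per_cell=4):
    """Approximate W_2 between the empirical measure of `values` and N(0, 1).

    The empirical quantile function is constant on each cell ((i-1)/m, i/m]
    and equals the i-th order statistic, so the quantile integral is
    approximated with a midpoint rule inside every cell.
    """
    ordered = sorted(values)
    m = len(ordered)
    inv_cdf = NormalDist().inv_cdf  # quantile function of N(0, 1)
    n_nodes = m * points_per_cell
    total = 0.0
    for i, x in enumerate(ordered):
        for j in range(points_per_cell):
            u = (i * points_per_cell + j + 0.5) / n_nodes  # node strictly inside (0, 1)
            total += (x - inv_cdf(u)) ** 2
    return math.sqrt(total / n_nodes)


def random_direction(d, rng):
    """Draw a uniformly distributed direction on the unit sphere S^{d-1}."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def worst_w2_over_directions(sample, directions):
    """Largest approximate W_2 distance over the given finite set of directions."""
    worst = 0.0
    for theta in directions:
        marginal = [sum(a * b for a, b in zip(x, theta)) for x in sample]
        worst = max(worst, w2_to_standard_normal(marginal))
    return worst


# Illustrative parameters (assumptions, not values from the paper).
rng = random.Random(0)
d, m = 5, 2000
sample = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
directions = [random_direction(d, rng) for _ in range(20)]

worst = worst_w2_over_directions(sample, directions)
rate = (d / m) ** 0.25  # rate from the displayed bound, shown without the constant c
print(f"max W_2 over sampled directions: {worst:.4f}")
print(f"(d/m)^(1/4): {rate:.4f}")
```

The printed rate is shown only for orientation: the abstract does not specify the constant $c$, and a maximum over finitely many random directions can miss the worst direction on the sphere.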

