The Sketched Wasserstein Distance for mixture distributions

06/26/2022
by Xin Bing, et al.

The Sketched Wasserstein Distance (W^S) is a new probability distance specifically tailored to finite mixture distributions. Given any metric d defined on a set 𝒜 of probability distributions, W^S is defined to be the most discriminative convex extension of this metric to the space 𝒮 = conv(𝒜) of mixtures of elements of 𝒜. Our representation theorem shows that the space (𝒮, W^S) constructed in this way is isomorphic to a Wasserstein space over 𝒳 = (𝒜, d). This result establishes a universality property for the Wasserstein distances, revealing them to be uniquely characterized by their discriminative power for finite mixtures. We exploit this representation theorem to propose an estimation methodology based on Kantorovich-Rubinstein duality, and prove a general theorem showing that its estimation error can be bounded by the sum of the errors of estimating the mixture weights and the mixture components, for any estimators of these quantities. We derive sharp statistical properties for the estimated W^S in the case of p-dimensional discrete K-mixtures, which we show can be estimated at a rate proportional to √(K/N), up to logarithmic factors. We complement these bounds with a minimax lower bound on the risk of estimating the Wasserstein distance between distributions on a K-point metric space, which matches our upper bound up to logarithmic factors. This result is the first nearly tight minimax lower bound for estimating the Wasserstein distance between discrete distributions. Furthermore, we construct √N-asymptotically normal estimators of the mixture weights, and derive a √N distributional limit of our estimator of W^S as a consequence. Simulation studies and a data analysis provide strong support for the applicability of the new Sketched Wasserstein Distance.
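The representation described in the abstract can be written out for the case of K mixture components. The display below is a sketch under assumed notation that does not appear in the abstract: r and s are mixtures of the same components A_1, ..., A_K in 𝒜 with weight vectors α and β, the weights are assumed to be uniquely determined by the mixtures, and the Wasserstein distance is taken to be of order 1. Under these assumptions, W^S between the mixtures reduces to an optimal transport problem between the weight vectors on the K-point metric space ({A_1, ..., A_K}, d), and its dual form is the Kantorovich-Rubinstein formulation.

```latex
% Assumed notation (not from the abstract):
%   r = \sum_k \alpha_k A_k, \quad s = \sum_k \beta_k A_k,
%   \alpha, \beta in the probability simplex of R^K,
%   \Pi(\alpha, \beta) = couplings with marginals \alpha and \beta.
\begin{align*}
W^S(r, s)
  &= W(\alpha, \beta; d) \\
  &= \min_{\pi \in \Pi(\alpha, \beta)} \sum_{k=1}^{K} \sum_{l=1}^{K} \pi_{kl} \, d(A_k, A_l) \\
  &= \max_{\substack{f \in \mathbb{R}^K \\ f_k - f_l \le d(A_k, A_l) \ \forall k, l}}
     \sum_{k=1}^{K} f_k \, (\alpha_k - \beta_k).
\end{align*}
```

In this form, a plug-in estimate would replace α, β, and the pairwise distances d(A_k, A_l) with estimated values, which is consistent with the abstract's statement that the estimation error is controlled by the errors in the weights and in the components.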


research
∙ 02/24/2018

Minimax Distribution Estimation in Wasserstein Distance

The Wasserstein metric is an important measure of distance between proba...
research
∙ 09/16/2019

Estimation of Wasserstein distances in the Spiked Transport Model

We propose a new statistical model, the spiked transport model, which fo...
research
∙ 02/02/2023

Stone's theorem for distributional regression in Wasserstein distance

We extend the celebrated Stone's theorem to the framework of distributio...
research
∙ 07/03/2021

Minimum Wasserstein Distance Estimator under Finite Location-scale Mixtures

When a population exhibits heterogeneity, we often model it via a finite...
research
∙ 11/18/2021

Bounds in L^1 Wasserstein distance on the normal approximation of general M-estimators

We derive quantitative bounds on the rate of convergence in L^1 Wasserst...
research
∙ 05/27/2021

Stein's Method for Probability Distributions on 𝕊^1

In this paper, we propose a modification to the density approach to Stei...
research
∙ 07/12/2021

Likelihood estimation of sparse topic distributions in topic models and its applications to Wasserstein document distance calculations

This paper studies the estimation of high-dimensional, discrete, possibl...
