
The Sketched Wasserstein Distance for mixture distributions

by   Xin Bing, et al.

The Sketched Wasserstein Distance (W^S) is a new probability distance specifically tailored to finite mixture distributions. Given any metric d defined on a set 𝒜 of probability distributions, W^S is defined to be the most discriminative convex extension of this metric to the space 𝒮 = conv(𝒜) of mixtures of elements of 𝒜. Our representation theorem shows that the space (𝒮, W^S) constructed in this way is isomorphic to a Wasserstein space over 𝒳 = (𝒜, d). This result establishes a universality property for the Wasserstein distances, revealing them to be uniquely characterized by their discriminative power for finite mixtures. We exploit this representation theorem to propose an estimation methodology based on Kantorovich–Rubinstein duality, and prove a general theorem showing that its estimation error can be bounded by the sum of the errors of estimating the mixture weights and the mixture components, for any estimators of these quantities. We derive sharp statistical properties for the estimated W^S in the case of p-dimensional discrete K-mixtures, which we show can be estimated at a rate proportional to √(K/N), up to logarithmic factors. We complement these bounds with a minimax lower bound on the risk of estimating the Wasserstein distance between distributions on a K-point metric space, which matches our upper bound up to logarithmic factors. This result is the first nearly tight minimax lower bound for estimating the Wasserstein distance between discrete distributions. Furthermore, we construct √(N) asymptotically normal estimators of the mixture weights, and derive a √(N) distributional limit of our estimator of W^S as a consequence. Simulation studies and a data analysis provide strong support for the applicability of the new Sketched Wasserstein Distance.
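To make the representation theorem concrete, the following is a minimal sketch, not the authors' implementation. It assumes that two mixtures share the same K components, that their mixture weights `alpha` and `beta` are known (or already estimated), and that a K x K matrix `D` of pairwise distances d between components is given. Under those assumptions, the distance reduces to a Wasserstein distance between the weight vectors on a K-point metric space, which can be computed as a small linear program. The function name `wasserstein_on_weights` and all variable names are hypothetical, and the solver is SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def wasserstein_on_weights(alpha, beta, D):
    """Wasserstein-1 distance between weight vectors alpha and beta
    on a K-point metric space with pairwise distance matrix D.

    Illustrative sketch only: alpha, beta and D are assumed inputs.
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    D = np.asarray(D, dtype=float)
    K = alpha.size

    # Decision variable: a coupling gamma (K x K), flattened row-major.
    cost = D.reshape(-1)

    # Marginal constraints: row sums equal alpha, column sums equal beta.
    row_sum = np.kron(np.eye(K), np.ones((1, K)))
    col_sum = np.kron(np.ones((1, K)), np.eye(K))
    A_eq = np.vstack([row_sum, col_sum])
    b_eq = np.concatenate([alpha, beta])

    res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun


# Toy example: three components whose pairwise distances behave like
# points 0, 1, 2 on a line (a made-up D, purely for illustration).
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
w_far = wasserstein_on_weights([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], D)
w_shift = wasserstein_on_weights([0.5, 0.5, 0.0], [0.0, 0.5, 0.5], D)
w_same = wasserstein_on_weights([0.2, 0.3, 0.5], [0.2, 0.3, 0.5], D)
print(w_far, w_shift, w_same)
```

In a plug-in use of this sketch, `alpha`, `beta` and `D` would be replaced by estimates, which is consistent with the abstract's statement that the estimation error is controlled by the errors in the weights and in the components.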


∙ 02/24/2018

Minimax Distribution Estimation in Wasserstein Distance

The Wasserstein metric is an important measure of distance between proba...
∙ 09/16/2019

Estimation of Wasserstein distances in the Spiked Transport Model

We propose a new statistical model, the spiked transport model, which fo...
∙ 07/03/2021

Minimum Wasserstein Distance Estimator under Finite Location-scale Mixtures

When a population exhibits heterogeneity, we often model it via a finite...
∙ 11/18/2021

Bounds in L^1 Wasserstein distance on the normal approximation of general M-estimators

We derive quantitative bounds on the rate of convergence in L^1 Wasserst...
∙ 09/30/2022

Finding NEEMo: Geometric Fitting using Neural Estimation of the Energy Mover's Distance

A novel neural architecture was recently developed that enforces an exac...
∙ 05/27/2021

Stein's Method for Probability Distributions on 𝕊^1

In this paper, we propose a modification to the density approach to Stei...