k-Variance: A Clustered Notion of Variance

12/13/2020

We introduce k-variance, a generalization of variance built on the machinery of random bipartite matchings. K-variance measures the expected cost of matching two sets of k samples from a distribution to each other, capturing local rather than global information about a measure as k increases; it is easily approximated stochastically using sampling and linear programming. In addition to defining k-variance and proving its basic properties, we provide in-depth analysis of this quantity in several key cases, including one-dimensional measures, clustered measures, and measures concentrated on low-dimensional subsets of ℝ^n. We conclude with experiments and open problems motivated by this new way to summarize distributional shape.
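The sampling-based approximation mentioned above can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the abstract: the squared Euclidean cost, the averaging over the k matched pairs, the absence of any normalization constant the full paper may use, and the helper names `matching_cost`, `estimate_matching_cost`, and `gaussian_2d`. To stay dependency-free, an exhaustive search over permutations stands in for the linear program, so this version is practical only for small k.

```python
import itertools
import random


def matching_cost(xs, ys):
    """Minimum average squared Euclidean cost of a perfect matching of xs to ys."""
    k = len(xs)
    # Pairwise squared distances between every point of xs and every point of ys.
    cost = [[sum((a - b) ** 2 for a, b in zip(x, y)) for y in ys] for x in xs]
    # Exhaustive search over all k! matchings (assumption: small k only;
    # a linear-programming or assignment solver would replace this step).
    best = min(
        sum(cost[i][perm[i]] for i in range(k))
        for perm in itertools.permutations(range(k))
    )
    return best / k


def estimate_matching_cost(sampler, k, trials=200, seed=0):
    """Monte Carlo estimate of the expected cost of matching two k-samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [sampler(rng) for _ in range(k)]
        ys = [sampler(rng) for _ in range(k)]
        total += matching_cost(xs, ys)
    return total / trials


def gaussian_2d(rng):
    """Hypothetical example distribution: standard Gaussian in the plane."""
    return (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, estimate_matching_cost(gaussian_2d, k, trials=500))
```

With k = 1 the quantity reduces to the expected squared distance between two independent draws, which is twice the usual variance (summed over coordinates), so the estimate for the example Gaussian should sit near 4 up to sampling noise.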
