Distribution Compression in Near-linear Time

11/15/2021
by Abhishek Shetty, et al.

In distribution compression, one aims to accurately summarize a probability distribution ℙ using a small number of representative points. Near-optimal thinning procedures achieve this goal by sampling n points from a Markov chain and identifying √n points with 𝒪(1/√n) discrepancy to ℙ. Unfortunately, these algorithms suffer from quadratic or super-quadratic runtime in the sample size n. To address this deficiency, we introduce Compress++, a simple meta-procedure for speeding up any thinning algorithm while suffering at most a factor of 4 in error. When combined with the quadratic-time kernel halving and kernel thinning algorithms of Dwivedi and Mackey (2021), Compress++ delivers √n points with 𝒪(√(log n / n)) integration error and better-than-Monte-Carlo maximum mean discrepancy in 𝒪(n log^3 n) time and 𝒪(√n log^2 n) space. Moreover, Compress++ enjoys the same near-linear runtime given any quadratic-time input and reduces the runtime of super-quadratic algorithms by a square-root factor. In our benchmarks with high-dimensional Monte Carlo samples and Markov chains targeting challenging differential equation posteriors, Compress++ matches or nearly matches the accuracy of its input algorithm in orders of magnitude less time.
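To make the idea of wrapping a halving routine in a recursive meta-procedure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not spell out the recursion, so the four-way split used here is an assumed structure. The names `compress` and `standin_halve` are hypothetical, and the halving step is a uniformly random placeholder rather than kernel halving. The sketch also omits the final thinning stage and any oversampling that the full Compress++ procedure may use.

```python
import random


def standin_halve(points):
    """Placeholder halving step: keep a uniformly random half of the points.

    This is NOT kernel halving. It only stands in for any routine that
    maps a list of m points to m // 2 representative points.
    """
    keep = sorted(random.sample(range(len(points)), len(points) // 2))
    return [points[i] for i in keep]


def compress(points, halve=standin_halve):
    """Recursively reduce n points (n a power of 4) to sqrt(n) points.

    Assumed structure: split the input into four equal parts, compress
    each part to sqrt(n) / 2 points, concatenate the four results into
    2 * sqrt(n) points, then halve once to reach sqrt(n) points.
    """
    n = len(points)
    # n is a power of 4 iff it is a power of 2 with an even exponent.
    if n < 1 or n & (n - 1) != 0 or (n.bit_length() - 1) % 2 != 0:
        raise ValueError("input size must be a power of 4")
    if n == 1:
        return list(points)
    quarter = n // 4
    merged = []
    for k in range(4):
        part = points[k * quarter:(k + 1) * quarter]
        merged.extend(compress(part, halve))
    # The halving routine only ever sees 2 * sqrt(n) points at this level.
    return halve(merged)
```

In this sketch the halving routine is only ever called on inputs of size 2√n or smaller, which gives some intuition for how a quadratic-time halving step can be used without paying quadratic cost in n. Passing a different `halve` callable swaps in another halving algorithm without changing the recursion.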


research
05/12/2021

Kernel Thinning

We introduce kernel thinning, a new procedure for compressing a distribu...
research
04/30/2018

Practical Low-Dimensional Halfspace Range Space Sampling

We develop, analyze, implement, and compare new algorithms for creating ...
research
10/04/2021

Generalized Kernel Thinning

The kernel thinning (KT) algorithm of Dwivedi and Mackey (2021) compress...
research
06/11/2017

On the Sampling Problem for Kernel Quadrature

The standard Kernel Quadrature method for numerical integration with ran...
research
11/20/2017

A local graph rewiring algorithm for sampling spanning trees

We introduce a Markov Chain Monte Carlo algorithm which samples from the...
research
05/22/2017

A Linear-Time Kernel Goodness-of-Fit Test

We propose a novel adaptive test of goodness-of-fit, with computational ...
research
05/12/2022

Faster quantum mixing of Markov chains in non-regular graph with fewer qubits

Sampling from the stationary distribution is one of the fundamental task...
