Density Sketches for Sampling and Estimation

02/24/2021
by Aditya Desai, et al.

We introduce density sketches (DS): a succinct online summary of the data distribution. DS can accurately estimate the pointwise probability density. DS can also sample novel, unseen data from the underlying data distribution. Thus, like popular generative models, DS lets us replace the real data in almost all machine learning pipelines with synthetic examples drawn from the same distribution as the original data. However, unlike generative models, which do not have any statistical guarantees, DS yields theoretically sound, asymptotically consistent estimators of the underlying density function. Density sketches also have properties that make them well suited to large-scale distributed applications. DS construction is an online algorithm, and the sketches are additive, i.e., the sum of two sketches is the sketch of the combined data. These properties allow data to be collected from distributed sources, compressed into a density sketch, transmitted efficiently in sketch form to a central server, merged, and re-sampled into a synthetic database for modeling applications. Thus, density sketches can potentially revolutionize how we store, communicate, and distribute data.
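
The properties listed in the abstract (online construction, additivity, density estimation, and sampling) can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the construction from the paper: it uses a plain, uncompressed grid histogram as a stand-in for the sketch, so it does not bound memory the way a real density sketch would. The class name `ToyDensitySketch`, the `bin_width` parameter, and all method names are hypothetical.

```python
import math
import random
from collections import Counter


class ToyDensitySketch:
    """Hypothetical grid-histogram stand-in; NOT the paper's data structure."""

    def __init__(self, bin_width):
        self.bin_width = bin_width
        self.counts = Counter()  # grid cell -> number of points seen
        self.n = 0               # total number of points seen

    def _cell(self, x):
        # Map a point (tuple of floats) to the integer index of its grid cell.
        return tuple(math.floor(v / self.bin_width) for v in x)

    def add(self, x):
        # Online update: each point is processed once and then discarded.
        self.counts[self._cell(x)] += 1
        self.n += 1

    def merge(self, other):
        # Additivity: summing the counts gives the summary of the combined data.
        assert self.bin_width == other.bin_width
        merged = ToyDensitySketch(self.bin_width)
        merged.counts = self.counts + other.counts
        merged.n = self.n + other.n
        return merged

    def density(self, x):
        # Histogram density estimate: count / (n * cell volume).
        if self.n == 0:
            return 0.0
        volume = self.bin_width ** len(x)
        return self.counts[self._cell(x)] / (self.n * volume)

    def sample(self, rng=random):
        # Pick a cell with probability proportional to its count,
        # then draw a point uniformly inside that cell.
        cells = list(self.counts)
        weights = [self.counts[c] for c in cells]
        cell = rng.choices(cells, weights=weights, k=1)[0]
        return tuple((i + rng.random()) * self.bin_width for i in cell)


# Two "distributed sources" each summarize half of the data, then merge.
rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
source_a, source_b, full = (ToyDensitySketch(0.5) for _ in range(3))
for p in data[:1000]:
    source_a.add(p)
for p in data[1000:]:
    source_b.add(p)
for p in data:
    full.add(p)

merged = source_a.merge(source_b)
synthetic_point = merged.sample(rng)
```

In this toy version, merging the two partial summaries gives exactly the same counts as summarizing all of the data in one pass, which is the additivity property the abstract describes. Sampled points are new values drawn from occupied cells rather than copies of stored data points.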
