Efficient Density Estimation via Piecewise Polynomial Approximation

05/14/2013
by   Siu-On Chan, et al.

We give a highly efficient "semi-agnostic" algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let p be an arbitrary distribution over an interval I which is τ-close (in total variation distance) to an unknown probability distribution q that is defined by an unknown partition of I into t intervals and t unknown degree-d polynomials specifying q over each of the intervals. We give an algorithm that draws Õ(t(d+1)/ε^2) samples from p, runs in time poly(t, d, 1/ε), and with high probability outputs a piecewise polynomial hypothesis distribution h that is (O(τ)+ε)-close (in total variation distance) to p. This sample complexity is essentially optimal; we show that even for τ=0, any algorithm that learns an unknown t-piecewise degree-d probability distribution over I to accuracy ε must use Ω(t(d+1)/poly(1 + log(d+1)) · 1/ε^2) samples from the distribution, regardless of its running time. Our algorithm combines tools from approximation theory, uniform convergence, linear programming, and dynamic programming. We apply this general algorithm to obtain a wide range of results for many natural problems in density estimation over both continuous and discrete domains. These include state-of-the-art results for learning mixtures of log-concave distributions; mixtures of t-modal distributions; mixtures of Monotone Hazard Rate distributions; mixtures of Poisson Binomial Distributions; mixtures of Gaussians; and mixtures of k-monotone densities. Our general technique yields computationally efficient algorithms for all these problems, in many cases with provably optimal sample complexities (up to logarithmic factors) in all parameters.
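To make the objects in the abstract concrete, here is a minimal Python sketch of a t-piecewise degree-d density and a numerical approximation of total variation distance. It does not implement the paper's learning algorithm. The function names `piecewise_poly` and `tv_distance`, the midpoint-rule integration, and the example densities are all illustrative assumptions, not taken from the paper.

```python
from bisect import bisect_right


def piecewise_poly(breakpoints, coeffs):
    """Build an evaluator for a piecewise polynomial on [breakpoints[0], breakpoints[-1]].

    coeffs[i] holds the coefficients (lowest degree first) of the polynomial
    used on the i-th interval [breakpoints[i], breakpoints[i+1]).
    Hypothetical helper for illustration only.
    """
    def f(x):
        if x < breakpoints[0] or x > breakpoints[-1]:
            return 0.0
        # Locate the interval containing x; clamp so the right endpoint
        # falls in the last interval.
        i = min(bisect_right(breakpoints, x) - 1, len(coeffs) - 1)
        # Horner evaluation of the polynomial on interval i.
        result = 0.0
        for c in reversed(coeffs[i]):
            result = result * x + c
        return result
    return f


def tv_distance(f, g, a, b, n=100_000):
    """Approximate (1/2) * integral over [a, b] of |f - g| with the midpoint rule."""
    step = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * step
        total += abs(f(x) - g(x))
    return 0.5 * total * step


# q: t = 2 pieces of degree d = 1 on I = [0, 1] (a triangular density).
q = piecewise_poly([0.0, 0.5, 1.0], [[0.0, 4.0], [4.0, -4.0]])
# h: t = 1 piece of degree d = 0 (the uniform density on [0, 1]).
h = piecewise_poly([0.0, 1.0], [[1.0]])

tv = tv_distance(q, h, 0.0, 1.0)
print(round(tv, 6))
```

In this toy case the distance can be checked by hand: |q - h| integrates to 1/2 over [0, 1], so the total variation distance is 1/4.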


