The Sparse Hausdorff Moment Problem, with Application to Topic Models

07/16/2020
by Spencer Gordon, et al.

We consider the problem of identifying, from its first m noisy moments, a probability distribution on [0,1] of support size k<∞. This is equivalent to the problem of learning a distribution on m observable binary random variables X_1,X_2,…,X_m that are iid conditional on a hidden random variable U taking values in {1,2,…,k}. Our focus is on accomplishing this with m=2k, which is the minimum m for which verifying that the source is a k-mixture is possible (even with exact statistics). This problem, so simply stated, is quite useful: e.g., by a known reduction, any algorithm for it lifts to an algorithm for learning pure topic models. Past work on this problem, and on the more general mixture-of-products problem (X_i independent conditional on U, but not necessarily iid), reached a barrier of m^{O(k^2)} on the sample complexity and/or runtime of the algorithm. We improve this substantially. We show it suffices to use a sample of size exp(k log k) (with m=2k). It is known that the sample complexity of any solution to the identification problem must be exp(Ω(k)). Stated in terms of the moment problem, it suffices to know the moments to additive accuracy exp(-k log k). Our run-time for the moment problem is only O(k^{2+o(1)}) arithmetic operations.
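To make the moment problem concrete, here is a minimal sketch of recovering a k-spike distribution from exact moments using the classical Prony-style approach (a Hankel linear system followed by polynomial root finding). This is not the algorithm of the paper, and it makes no claim to the stated noise tolerance or runtime. The function names `moments` and `prony_recover` are hypothetical, and the sketch assumes exact moments mu_0,…,mu_{2k-1} with mu_0=1. In the binary-variable view, if X_i=1 with probability x_u when U=u, then mu_j equals the probability that X_1=…=X_j=1.

```python
import numpy as np

def moments(points, weights, count):
    """Return mu_0..mu_{count-1}, where mu_j = sum_i w_i * x_i**j."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    powers = np.arange(count)
    return (weights[None, :] * points[None, :] ** powers[:, None]).sum(axis=1)

def prony_recover(mu, k):
    """Classical Prony-style recovery (illustrative, not the paper's method).

    Given exact moments mu_0..mu_{2k-1} of a distribution with k distinct
    support points, return (support points, weights), points sorted ascending.
    """
    mu = np.asarray(mu, dtype=float)
    # The monic polynomial x^k + c_{k-1} x^{k-1} + ... + c_0 vanishing on the
    # support satisfies sum_j c_j * mu_{r+j} = -mu_{r+k} for r = 0..k-1.
    hankel = np.array([[mu[r + j] for j in range(k)] for r in range(k)])
    c = np.linalg.solve(hankel, -mu[k:2 * k])
    # np.roots expects coefficients from highest degree to lowest.
    coeffs = np.concatenate(([1.0], c[::-1]))
    points = np.sort(np.roots(coeffs).real)
    # Weights solve the Vandermonde system sum_i w_i * x_i**j = mu_j, j < k.
    vandermonde = np.vander(points, N=k, increasing=True).T
    weights = np.linalg.solve(vandermonde, mu[:k])
    return points, weights

if __name__ == "__main__":
    true_points = [0.2, 0.5, 0.9]
    true_weights = [0.3, 0.5, 0.2]
    mu = moments(true_points, true_weights, 6)
    est_points, est_weights = prony_recover(mu, 3)
    print(est_points, est_weights)
```

The Hankel matrix in this sketch becomes badly conditioned as k grows or as support points move closer together, which gives some intuition for why the accuracy required of the moments shrinks so quickly with k.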


research
07/26/2022

Efficient Algorithms for Sparse Moment Problems without Separation

We consider the sparse moment problem of learning a k-spike mixture in h...
research
11/09/2012

Efficient learning of simplices

We show an efficient algorithm for the following problem: Given uniforml...
research
06/13/2018

Optimal moment inequalities for order statistics from nonnegative random variables

We obtain the best possible upper bounds for the moments of a single ord...
research
01/22/2023

Moment Varieties for Mixtures of Products

The setting of this article is nonparametric algebraic statistics. We st...
research
12/11/2018

Predictive Learning on Sign-Valued Hidden Markov Trees

We provide high-probability sample complexity guarantees for exact struc...
research
12/11/2018

Predictive Learning on Hidden Tree-Structured Ising Models

We provide high-probability sample complexity guarantees for exact struc...
research
12/29/2020

Source Identification for Mixtures of Product Distributions

We give an algorithm for source identification of a mixture of k product...
