Histogram Meets Topic Model: Density Estimation by Mixture of Histograms

12/25/2015
by Hideaki Kim, et al.

The histogram method is a powerful non-parametric approach for estimating the probability density function of a continuous variable. Compared with parametric approaches, however, constructing a histogram demands a large number of observations to capture the underlying density function. It is therefore ill-suited to analyzing a sparse data set, that is, a collection of units that each contain only a small number of observations. In this paper, we employ a probabilistic topic model to develop a novel Bayesian approach that alleviates the sparsity problem in conventional histogram estimation. Our method estimates each unit's density function as a mixture of basis histograms, in which the number of bins for each basis, as well as the bin heights, is determined automatically. Estimation is performed with collapsed Gibbs sampling, which is fast and easy to implement. We apply the proposed method to synthetic data and show that it performs well.
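To make the phrase "mixture of basis histograms" concrete, the following minimal sketch evaluates such a density for one unit. It is an illustration only, not the authors' estimation procedure: the bin edges, counts, and mixture weights are hypothetical fixed values, whereas the paper infers them by collapsed Gibbs sampling. The function names (`histogram_density`, `normalize_counts`, `mixture_density`) are likewise assumptions introduced here.

```python
from bisect import bisect_right


def histogram_density(x, edges, heights):
    """Piecewise-constant density: heights[i] applies on [edges[i], edges[i+1])."""
    if x < edges[0] or x >= edges[-1]:
        return 0.0
    return heights[bisect_right(edges, x) - 1]


def normalize_counts(counts, edges):
    """Convert bin counts into heights so the histogram integrates to 1."""
    total = sum(counts)
    return [c / (total * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]


def mixture_density(x, weights, bases):
    """Density of one unit: weighted sum of basis histograms evaluated at x."""
    return sum(w * histogram_density(x, e, h) for w, (e, h) in zip(weights, bases))


# Two hypothetical basis histograms on [0, 1) with different numbers of bins.
edges_a = [0.0, 0.5, 1.0]
heights_a = normalize_counts([3, 1], edges_a)
edges_b = [0.0, 0.25, 0.5, 0.75, 1.0]
heights_b = normalize_counts([1, 1, 1, 5], edges_b)
bases = [(edges_a, heights_a), (edges_b, heights_b)]

# Hypothetical mixture weights for a single unit (non-negative, summing to 1).
unit_weights = [0.25, 0.75]

density_at_point = mixture_density(0.8, unit_weights, bases)
```

Because the basis histograms are shared across units, a unit with few observations only needs enough data to determine its mixture weights, which is the intuition behind using a topic-model structure for sparse data.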

research
12/11/2018

From Adaptive Kernel Density Estimation to Sparse Mixture Models

We introduce a balloon estimator in a generalized expectation-maximizati...
research
01/14/2014

Binary Classifier Calibration: Non-parametric approach

Accurate calibration of probabilistic predictive models learned is criti...
research
08/02/2018

Histogram Transform-based Speaker Identification

A novel text-independent speaker identification (SI) method is proposed....
research
12/27/2022

Fast and fully-automated histograms for large-scale data sets

G-Enum histograms are a new fast and fully automated method for irregula...
research
12/08/2020

Bayesian Inference for Polycrystalline Materials

Polycrystalline materials, such as metals, are comprised of heterogeneou...
research
02/10/2022

Multiclass histogram-based thresholding using kernel density estimation and scale-space representations

We present a new method for multiclass thresholding of a histogram which...
research
06/09/2023

Two-level histograms for dealing with outliers and heavy tail distributions

Histograms are among the most popular methods used in exploratory analys...
