Nonparametric Estimation of Repeated Densities with Heterogeneous Sample Sizes

12/18/2020
by JiaMing Qiu, et al.
We consider the estimation of densities in multiple subpopulations, where the available sample size varies greatly across subpopulations. For example, in epidemiology, different diseases may share similar pathogenic mechanisms but differ in their prevalence. Without specifying a parametric form, our proposed approach pools information from the population and estimates the density in each subpopulation in a data-driven fashion. Low-dimensional approximating density families in the form of exponential families are constructed from the principal modes of variation in the log-densities, within which subpopulation densities are then fitted based on likelihood principles and shrinkage. The approximating families increase in flexibility as the number of components increases and can approximate arbitrary infinite-dimensional densities with discrete observations, for which we derive convergence results. The proposed methods are shown to be interpretable and efficient in simulations as well as in applications to electronic medical record and rainfall data.
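To make the construction concrete, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a fixed evaluation grid, a Gaussian kernel pre-estimate of each well-sampled log-density, principal modes obtained by an SVD of the centered log-densities, and a ridge penalty on the coefficients standing in for the shrinkage step. All function names, the bandwidth, the penalty weight, and the simulated normal data are illustrative assumptions.

```python
import numpy as np

def kde_log_density(sample, grid, bandwidth):
    """Gaussian kernel density estimate on `grid`, returned on the log scale."""
    z = (grid[:, None] - sample[None, :]) / bandwidth
    dens = np.exp(-0.5 * z**2).sum(axis=1) + 1e-12   # small floor avoids log(0)
    dens /= dens.sum() * (grid[1] - grid[0])          # Riemann-sum normalisation
    return np.log(dens)

def build_family(log_dens, n_components, dx):
    """Mean log-density plus leading principal modes of variation (via SVD)."""
    mean = log_dens.mean(axis=0)
    _, _, vt = np.linalg.svd(log_dens - mean, full_matrices=False)
    return mean, vt[:n_components] / np.sqrt(dx)      # modes have unit L2 norm

def family_density(theta, mean, modes, dx):
    """Density proportional to exp(mean + theta . modes), normalised on the grid."""
    g = mean + theta @ modes
    dens = np.exp(g - g.max())                        # subtract max for stability
    return dens / (dens.sum() * dx)

def penalised_loglik(theta, idx, mean, modes, dx, ridge):
    """Log-likelihood of the binned sample minus a ridge penalty on theta."""
    dens = family_density(theta, mean, modes, dx)
    return np.log(dens[idx]).sum() - 0.5 * ridge * theta @ theta

def fit_subpopulation(sample, grid, mean, modes, ridge=1.0, n_iter=50):
    """Maximise the penalised log-likelihood by Newton steps with backtracking."""
    dx = grid[1] - grid[0]
    # Assumption: each observation is assigned to its nearest grid point.
    idx = np.clip(np.rint((sample - grid[0]) / dx).astype(int), 0, len(grid) - 1)
    n = len(sample)
    t_bar = modes[:, idx].mean(axis=1)                # mean sufficient statistics
    theta = np.zeros(modes.shape[0])
    for _ in range(n_iter):
        p = family_density(theta, mean, modes, dx) * dx   # grid cell probabilities
        m = modes @ p                                     # expected statistics
        cov = (modes * p) @ modes.T - np.outer(m, m)      # their covariance
        grad = n * (t_bar - m) - ridge * theta
        step = np.linalg.solve(n * cov + ridge * np.eye(len(theta)), grad)
        current = penalised_loglik(theta, idx, mean, modes, dx, ridge)
        scale = 1.0
        while scale > 1e-8 and penalised_loglik(
                theta + scale * step, idx, mean, modes, dx, ridge) < current:
            scale *= 0.5                                  # halve until no decrease
        if scale <= 1e-8:
            break
        theta = theta + scale * step
    return theta

# Simulated example: 12 well-sampled subpopulations define the family,
# and one sparsely sampled subpopulation is fitted inside it.
rng = np.random.default_rng(0)
grid = np.linspace(-5.0, 5.0, 201)
dx = grid[1] - grid[0]
big = [rng.normal(mu, 1.0, size=2000) for mu in np.linspace(-1.5, 1.5, 12)]
log_dens = np.array([kde_log_density(s, grid, bandwidth=0.4) for s in big])
mean, modes = build_family(log_dens, n_components=2, dx=dx)
small = rng.normal(0.8, 1.0, size=25)
theta_hat = fit_subpopulation(small, grid, mean, modes)
fitted = family_density(theta_hat, mean, modes, dx)
```

In this sketch, a larger `ridge` pulls the fitted density toward the pooled mean log-density, which is the intended effect for small samples. The choice of penalty and its weight is an assumption here and would need to be tuned or derived for real data.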
