Posterior Contraction Rate of Sparse Latent Feature Models with Application to Proteomics
The Indian buffet process (IBP) and phylogenetic Indian buffet process (pIBP) can be used as prior models to infer latent features in a data set. The theoretical properties of these models are under-explored, however, especially in high-dimensional settings. In this paper, we show that, under a mild sparsity condition, the posterior distribution of the latent feature matrix under an IBP or pIBP prior converges asymptotically to the true latent feature matrix. We derive the rate of this posterior convergence, referred to as the contraction rate. We show that the convergence holds even when the dimensionality of the latent feature matrix increases with the sample size, so that posterior inference remains valid in high-dimensional settings. We illustrate the theoretical results in computer simulations, in which the parallel-tempering Markov chain Monte Carlo method is applied to overcome computational hurdles. The practical utility of the derived properties is demonstrated by inferring the latent features in a reverse phase protein array (RPPA) dataset under the IBP prior model. The software and dataset reported in the manuscript are provided at http://www.compgenome.org/IBP.
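For readers unfamiliar with the IBP, the prior over binary latent feature matrices can be illustrated with its standard sequential construction: row i takes each existing feature k with probability m_k / i, where m_k is the number of earlier rows possessing that feature, and then adds a Poisson(alpha / i) number of new features. The sketch below is a minimal illustration of that textbook construction only. It is not the software released with the paper, and the function name `sample_ibp`, the mass parameter `alpha`, and the `seed` argument are assumptions made for this example.

```python
import numpy as np


def sample_ibp(n_rows, alpha, seed=None):
    """Draw a binary feature matrix from the one-parameter IBP prior.

    Hypothetical helper for illustration; uses the sequential
    (buffet) construction, not the paper's inference code.
    """
    rng = np.random.default_rng(seed)
    counts = []  # counts[k] = number of rows so far that possess feature k
    rows = []    # rows[i] = list of feature indices possessed by row i

    for i in range(1, n_rows + 1):
        # Take each existing feature k with probability counts[k] / i.
        owned = [k for k, m in enumerate(counts) if rng.random() < m / i]

        # Add a Poisson(alpha / i) number of brand-new features.
        n_new = int(rng.poisson(alpha / i))
        owned.extend(range(len(counts), len(counts) + n_new))
        counts.extend([0] * n_new)

        for k in owned:
            counts[k] += 1
        rows.append(owned)

    # Assemble the n_rows x K binary matrix.
    Z = np.zeros((n_rows, len(counts)), dtype=int)
    for i, owned in enumerate(rows):
        Z[i, np.asarray(owned, dtype=int)] = 1
    return Z


if __name__ == "__main__":
    Z = sample_ibp(n_rows=10, alpha=2.0, seed=0)
    print(Z.shape)
    print(Z)
```

The number of columns is random: it grows roughly like alpha times the logarithm of the number of rows, which is one reason the dimension of the latent feature matrix can increase with the sample size.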