Conditional Variational Inference with Adaptive Truncation for Bayesian Nonparametric Models
Scalable inference for Bayesian nonparametric models on big data remains challenging. Current variational inference methods fail to characterise the correlation structure among latent variables because of the mean-field assumption, and cannot infer the true posterior dimension because of a universal truncation. To overcome these limitations, we build a general framework for inferring Bayesian nonparametric models by maximising a proposed nonparametric evidence lower bound, and then develop a novel approach that combines Monte Carlo sampling with the stochastic variational inference framework. Our method has several advantages over traditional online variational inference. First, it achieves a smaller divergence between the variational distributions and the true posterior by factorising the variational distributions under a conditional setting rather than the mean-field setting, thereby capturing the correlation pattern. Second, it reduces the risk of underfitting or overfitting by truncating the dimension adaptively rather than using a single prespecified truncation level for all latent variables. Third, it reduces computational complexity by approximating the posterior functionally instead of updating the stick-breaking parameters individually. We apply the proposed method to hierarchical Dirichlet process and gamma–Dirichlet process models, two essential Bayesian nonparametric models in topic analysis. Empirical studies on three large datasets, arXiv, New York Times and Wikipedia, show that our method substantially outperforms its competitor, achieving lower perplexity and much clearer topic–word clustering.
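To make the adaptive-truncation idea concrete, the sketch below shows one standard way such a rule can be realised for a stick-breaking posterior: grow the truncation level until the expected leftover stick mass drops below a tolerance. This is a minimal illustration, not the paper's implementation; the function name `adaptive_truncation` and the parameters `tol` and `max_K` are hypothetical, and the Beta variational parameters `a`, `b` are assumed given by the inference procedure.

```python
import numpy as np

def adaptive_truncation(a, b, tol=1e-3, max_K=200):
    """Pick a truncation level K for a stick-breaking posterior.

    Hypothetical helper (not from the paper): a, b hold Beta variational
    parameters for the stick proportions v_k ~ Beta(a_k, b_k). K grows
    until the expected unbroken stick mass falls below `tol`.
    """
    remaining = 1.0                      # expected stick mass not yet broken
    for k in range(min(len(a), max_K)):
        ev = a[k] / (a[k] + b[k])        # E[v_k] under Beta(a_k, b_k)
        remaining *= (1.0 - ev)          # mass left after breaking stick k
        if remaining < tol:
            return k + 1                 # keep components 1..k+1
    return min(len(a), max_K)

# Example: variational fit concentrated near a Beta(1, 5) stick law
a = np.ones(200)
b = np.full(200, 5.0)
K = adaptive_truncation(a, b)            # (5/6)^K < 1e-3 gives K = 38
```

Under this rule each latent measure gets its own data-driven truncation level instead of one prespecified dimension shared by all, which is the behaviour the abstract's second advantage describes.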