The computational asymptotics of Gaussian variational inference

04/13/2021
by   Zuheng Xu, et al.
0

Variational inference is a popular alternative to Markov chain Monte Carlo methods that constructs a Bayesian posterior approximation by minimizing a discrepancy to the true posterior within a pre-specified family. This converts Bayesian inference into an optimization problem, enabling the use of simple and scalable stochastic optimization algorithms. However, a key limitation of variational inference is that the optimal approximation is typically not tractable to compute; even in simple settings the problem is nonconvex. Thus, recently developed statistical guarantees – which all involve the (data) asymptotic properties of the optimal variational distribution – are not reliably obtained in practice. In this work, we provide two major contributions: a theoretical analysis of the asymptotic convexity properties of variational inference in the popular setting with a Gaussian family; and consistent stochastic variational inference (CSVI), an algorithm that exploits these properties to find the optimal approximation in the asymptotic regime. CSVI consists of a tractable initialization procedure that finds the local basin of the optimal solution, and a scaled gradient descent algorithm that stays locally confined to that basin. Experiments on nonconvex synthetic and real-data examples show that compared with standard stochastic gradient descent, CSVI improves the likelihood of obtaining the globally optimal posterior approximation.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset