Skewed Bernstein-von Mises theorem and skew-modal approximations
Deterministic Gaussian approximations of intractable posterior distributions are common in Bayesian inference. From an asymptotic perspective, a theoretical justification in regular parametric settings is provided by the Bernstein-von Mises theorem. However, such limiting behavior may require a large sample size before becoming visible in practice. In fact, in situations with small-to-moderate sample size, even simple parametric models often yield posterior distributions that are far from Gaussian in shape, mainly due to skewness. In this article, we provide rigorous theoretical arguments for such behavior by deriving a novel limiting law that coincides with a closed-form and tractable skewed generalization of Gaussian densities, and yields a total variation distance from the exact posterior whose convergence rate improves on the one obtained under the classical Bernstein-von Mises theorem, based on limiting Gaussians, by a factor of √n. In contrast to higher-order approximations (which require finite truncations for inference, possibly leading to negative densities), our theory characterizes the limiting behavior of Bayesian posteriors with respect to a sequence of valid and tractable densities. This further motivates a practical plug-in version which replaces the unknown model parameters with the corresponding MAP estimate to obtain a novel skew-modal approximation achieving the same improved rate of convergence as its population counterpart. Extensive quantitative studies confirm that our new theory closely matches the empirical behavior observed in practice, even in finite, possibly small, sample regimes. The proposed skew-modal approximation further exhibits improved accuracy not only over the classical Laplace approximation, but also over state-of-the-art approximations from mean-field variational Bayes and expectation propagation.
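To fix ideas, below is a minimal LaTeX sketch of the kind of skew-symmetric perturbation of a Gaussian density alluded to above, centered at the MAP estimate $\hat{\theta}$ with an estimated scale matrix $\hat{\Omega}$ (e.g., the inverse negative Hessian of the log-posterior at $\hat{\theta}$). Here $F$ is a univariate cdf symmetric about zero and $\eta$ is an odd function; these are generic placeholders used only for illustration, and the specific skewing factor derived in the article may differ.

\[
  \hat{p}_{\mathrm{skew}}(\theta)
  \;=\;
  2\,\phi_d\!\left(\theta - \hat{\theta};\, \hat{\Omega}\right)
  F\!\left(\eta(\theta - \hat{\theta})\right)
\]
% \phi_d denotes the d-dimensional Gaussian density with mean zero and covariance \hat{\Omega}.
% Whenever F is a cdf symmetric about zero and \eta is odd, this expression integrates to one
% and is non-negative, so no truncation-induced negativity can arise.

By construction, such a density remains valid (non-negative and integrating to one), which is the sense in which the limiting skewed densities contrast with truncated higher-order expansions.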