On Model Selection with Summary Statistics

03/28/2018 ∙ by Erlis Ruli, et al. ∙ Università di Padova

Recently, many authors have cast doubts on the validity of ABC model choice. It has been shown that the use of a sufficient statistic in ABC model selection leads, apart from a few exceptional cases in which the sufficient statistic is also cross-model sufficient, to unreliable results. In a single-model context and given a sufficient summary statistic, we show that it is possible to fully recover the posterior normalising constant without using the likelihood function. The idea can be applied, in an approximate way, to more realistic scenarios in which a sufficient statistic is unavailable but a "good" summary statistic for estimation is available.


1 Background

Approximate Bayesian Computation (ABC) is a useful tool for Bayesian (see, e.g., marin2012) or frequentist (see, e.g., rubio2013simple) inference when the likelihood function is mathematically or computationally unavailable. The success of the ABC method relies on a careful choice of the summary statistic $s(\cdot)$, the distance metric $\rho(\cdot,\cdot)$ and the tolerance level $\epsilon$, with the summary statistic playing arguably the most crucial role.

To set the notation, let $y = (y_1, \dots, y_n)$ be $n$ realisations of the random variable $Y \sim f(y; \theta)$, with $\theta \in \Theta \subseteq \mathbb{R}^p$, $p \geq 1$. Furthermore, let $\pi(\theta)$ be a prior distribution for $\theta$, for simplicity assumed to be proper, let $f(y; \theta)$ be the likelihood function based on the model and the data $y$, and let $\pi(\theta \mid y) = \pi(\theta) f(y; \theta) / m(y)$ be the posterior distribution, with normalising constant

$$m(y) = \int_{\Theta} \pi(\theta) f(y; \theta)\, d\theta.$$

The Bayes factor (BF), the standard Bayesian solution for model selection, involves the posterior normalising constants of the models under comparison. Thus, if the likelihood for a single model is unavailable, the BF cannot be computed.
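For concreteness, the following is a minimal Python sketch of plain (single-model) ABC rejection sampling, the basic device used throughout. The callables `prior_draw`, `simulate` and `summary`, the scalar summary, and the absolute-difference distance are illustrative assumptions, not choices made in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def abc_rejection(y_obs, prior_draw, simulate, summary, n_keep, eps):
    """Plain ABC rejection sampler for a single model: propose theta from
    the prior, simulate pseudo-data z, and keep theta whenever the distance
    between summary(z) and summary(y_obs) is at most eps."""
    s_obs = summary(y_obs)
    kept = []
    while len(kept) < n_keep:
        theta = prior_draw()                # draw theta from the prior pi(theta)
        z = simulate(theta, len(y_obs))     # draw pseudo-data z from f(z; theta)
        if abs(summary(z) - s_obs) <= eps:  # rho taken as the absolute difference
            kept.append(theta)
    return np.asarray(kept)
```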

The ABC machinery comes equipped with an ABC model choice (ABC-MC) algorithm which works as follows (grelaud2009abc).

Result: A sample of model indices $(k^{(1)}, \dots, k^{(N)})$
for $i = 1, \dots, N$ do
       repeat
            1 draw $k$ from the prior on the model index, $\pi(K = k)$
            2 draw $\theta_k$ from the prior $\pi_k(\theta_k)$
            3 draw $z$ from the model $f_k(z; \theta_k)$
       until $\rho\{s(z), s(y)\} \leq \epsilon$;
       4 set $k^{(i)} = k$
end for
Algorithm 1 ABC model choice (ABC-MC) sampler.

where $\epsilon$ is the threshold value and $k$ is the model index. The $N$-vector of indices $(k^{(1)}, \dots, k^{(N)})$ produced by Algorithm 1 can be used, in principle, to compute posterior model probabilities and BFs.
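As an illustration of Algorithm 1, here is a sketch of ABC-MC for the Poisson versus Geometric comparison discussed next, with $s(y) = \sum_i y_i$ as the summary. The uniform model prior, the Exp(1) and U(0, 1) parameter priors, and exact matching ($\epsilon = 0$) are illustrative choices, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def abc_mc_poisson_vs_geometric(y_obs, n_keep, eps=0):
    """Algorithm 1 (ABC-MC) for Poisson vs Geometric count models,
    using the summary s(y) = sum(y), sufficient under both models."""
    n, s_obs = len(y_obs), int(np.sum(y_obs))
    indices = []
    while len(indices) < n_keep:
        k = rng.integers(2)                     # 1. model index, P(K=0) = P(K=1) = 1/2
        if k == 0:
            theta = rng.exponential(1.0)        # 2. Poisson rate, theta ~ Exp(1)
            z = rng.poisson(theta, size=n)      # 3. simulate from the Poisson model
        else:
            p = rng.uniform()                   # 2. Geometric probability, p ~ U(0, 1)
            z = rng.geometric(p, size=n) - 1    # 3. shift so counts live on {0, 1, ...}
        if abs(int(np.sum(z)) - s_obs) <= eps:  # until rho{s(z), s(y)} <= eps
            indices.append(k)                   # 4. keep the model index
    return np.asarray(indices)

# Posterior model probabilities / BF estimated from the accepted indices:
# idx = abc_mc_poisson_vs_geometric(y_obs, n_keep=1000)
# bf_hat = np.mean(idx == 0) / np.mean(idx == 1)
```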

Recently, many authors have cast doubts on the validity of the ABC model choice procedure (see, e.g., marin2014relevant; robert2011lack). For instance, suppose $y$ is a vector of counts and we wish to choose between the Poisson and the Geometric model. In both cases, with ABC we can obtain (almost) the exact posterior by using $s(y) = \sum_{i=1}^n y_i$ as the summary statistic, since the latter is sufficient under both models. However, the BF obtained with ABC-MC in this case converges asymptotically to a positive constant that is, in general, different from the exact BF (robert2011lack). With the particular exception of Gibbs random fields (grelaud2009abc), the BF obtained with ABC-MC misses the exact BF by some unknown function of the data. marin2014relevant give theoretical conditions under which the summary statistic gives valid posterior model probabilities or BFs under the ABC model choice framework.

Clearly, the issue is with the summary statistic $s$. Even though it can be sufficient for the parameters, it is cross-model sufficiency that plays the crucial role here, i.e., the summary statistic must be sufficient for the models themselves (see also marin2014relevant). Finding cross-model sufficient statistics in practice is essentially impossible, and some effort has been spent on constructing summary statistics for ABC model selection (see, e.g., barnes2012). However, to the best of our knowledge, the choice of summary statistics for ABC model selection is still an open problem. Last but not least, summary statistics suited for ABC model selection are notoriously a bad choice for ABC posterior sampling (C.P. Robert, personal communication).

In Section 2 we show how the marginal likelihood can be approximated by using the sufficient summary statistic and ABC. In Section 3 we conclude by pointing to future developments.

2 Marginal likelihood from sufficient statistics

Let us focus on a single model $f(y; \theta)$, and suppose that $s = s(y)$ is sufficient for $\theta$. By the sufficiency principle we have that

$$\pi(\theta \mid y) = \pi(\theta \mid s) = \frac{\pi(\theta)\, f(s \mid \theta)}{m(s)}.$$

From this we see that

$$m(s) = \frac{\pi(\tilde\theta)\, f(s \mid \tilde\theta)}{\pi(\tilde\theta \mid s)}, \quad \text{for any fixed } \tilde\theta \in \Theta.$$

To approximate $m(s)$ we propose to approximate $\pi(\tilde\theta \mid s)$ via ABC, and $f(s \mid \tilde\theta)$ by simulation as follows (see Algorithm 2).

Result: approximation $\hat m(s)$ of $m(s)$.
1 use ABC to get a posterior sample and compute its mean $\tilde\theta$;
2 compute $\hat\pi(\tilde\theta \mid s)$, the ordinate of the posterior at $\tilde\theta$;
3 draw a large sample $s^{(1)}, \dots, s^{(R)}$ from $f(s \mid \tilde\theta)$;
4 compute an approximation $\hat f(s \mid \tilde\theta)$ of the sampling density of $s$ at the observed value;
5 set $\hat m(s) = \hat f(s \mid \tilde\theta)\, \pi(\tilde\theta) / \hat\pi(\tilde\theta \mid s)$.
Algorithm 2 Marginal likelihood from summary statistic.

Steps 2 and 4 of Algorithm 2 can be performed by any density estimation method; in Step 3 we only need to generate a (possibly) large sample of data from the model under $\tilde\theta$, a fixed value of $\theta$.
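The following Python sketch implements Algorithm 2 under simple assumptions: a scalar parameter, a scalar summary, and Gaussian kernel density estimation for Steps 2 and 4. The callables `prior_pdf` and `simulate_summary` are illustrative user-supplied pieces, not part of the original description.

```python
import numpy as np
from scipy.stats import gaussian_kde

def marginal_from_summary(theta_abc, prior_pdf, simulate_summary, s_obs,
                          n_sim=100_000):
    """Algorithm 2: estimate m(s) from an ABC posterior sample theta_abc."""
    theta_tilde = np.mean(theta_abc)                     # 1. posterior mean from ABC
    post_ord = gaussian_kde(theta_abc)(theta_tilde)[0]   # 2. ordinate of pi(theta_tilde | s)
    s_sims = np.asarray(simulate_summary(theta_tilde, n_sim),
                        dtype=float)                     # 3. draws from f(s | theta_tilde)
    f_hat = gaussian_kde(s_sims)(s_obs)[0]               # 4. estimate of f(s_obs | theta_tilde)
    return f_hat * prior_pdf(theta_tilde) / post_ord     # 5. hat m(s)
```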

A toy example: the Poisson model
Suppose $y_i \mid \theta \sim \text{Poisson}(\theta)$, $i = 1, \dots, n$, and a priori $\theta \sim \text{Ga}(a, b)$. Since $s = \sum_{i=1}^{n} y_i$ is sufficient for $\theta$ and $s \mid \theta \sim \text{Poisson}(n\theta)$, the marginal likelihood of $s$ in this case is

$$m(s) = \frac{n^{s}\, b^{a}\, \Gamma(a + s)}{s!\, \Gamma(a)\, (b + n)^{a + s}}. \qquad (2.1)$$
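Under the conjugate Gamma(a, b) setup as reconstructed above, (2.1) can be evaluated stably on the log scale with a small helper (written to match the formula as stated here):

```python
import numpy as np
from scipy.special import gammaln

def log_marginal_poisson(s, n, a, b):
    """Log of (2.1): exact marginal likelihood of s = sum(y), using
    s | theta ~ Poisson(n * theta) and theta ~ Gamma(a, b) (rate b)."""
    return (s * np.log(n) + a * np.log(b) + gammaln(a + s)
            - gammaln(s + 1) - gammaln(a) - (a + s) * np.log(b + n))
```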

As a numerical example, consider $y = (y_1, \dots, y_n)$, realisations of $n$ independent draws from a $\text{Poisson}(\theta_0)$ distribution. Figure 1 shows on the left side the histogram of the ABC posterior against the exact posterior (solid line). The ABC posterior is approximated by the finally accepted samples with a small tolerance $\epsilon$, where $\rho$ is the Euclidean distance between the total counts of the simulated and the observed data.

Figure 1: Left: ABC posterior (histogram) against the exact posterior (line). Right: scatter plot of the log-marginal likelihoods obtained with ABC versus the exact log-marginal likelihoods in 50 random samples, each of size $n$.

The right side of Figure 1 shows the scatter plot of the logarithm of the marginal likelihood obtained from ABC against the logarithm of the exact marginal likelihood (2.1), in 50 random samples of size $n$ from the $\text{Poisson}(\theta_0)$ model. The approximate marginal likelihoods obtained from ABC with the sufficient statistic and the exact marginal likelihoods are virtually indistinguishable.
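Putting the pieces together, the sketch below shows how a comparison like the right panel of Figure 1 can be assembled from the helpers above. The sample size, data-generating rate, prior hyperparameters and tolerance are illustrative guesses, since the original values are not reported here.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(2)
n, a, b, theta0 = 20, 1.0, 1.0, 2.0   # illustrative settings, not the paper's

for _ in range(50):
    y = rng.poisson(theta0, size=n)
    s = int(np.sum(y))
    # ABC posterior using the sufficient statistic with exact matching (eps = 0)
    theta_abc = abc_rejection(
        y,
        prior_draw=lambda: rng.gamma(a, 1.0 / b),        # theta ~ Gamma(a, b)
        simulate=lambda th, m: rng.poisson(th, size=m),
        summary=np.sum,
        n_keep=500,
        eps=0,
    )
    log_m_abc = np.log(marginal_from_summary(
        theta_abc,
        prior_pdf=lambda th: gamma.pdf(th, a, scale=1.0 / b),
        simulate_summary=lambda th, r: rng.poisson(n * th, size=r),  # s ~ Poisson(n*theta)
        s_obs=s,
    ))
    print(log_m_abc, log_marginal_poisson(s, n, a, b))   # the two should nearly agree
```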

3 Conclusion

Obviously, in realistic scenarios sufficient summary statistics are unavailable. However, if we have a set of judiciously chosen summary statistics which give provably valid inference on the parameters of interest, then the idea can still be usefully applied. The closer the summary statistic is to sufficiency, the closer the proposed approximation is to the exact value.

Acknowledgement

This work was presented at the BayesComp 2018 conference (26–29 March, Barcelona) and is partially supported by University of Padova (Progetti di Ricerca di Ateneo 2015, CPDA153257) and by PRIN 2015 (grant 2015EASZFS_003).

References

Barnes, C. P., Filippi, S., Stumpf, M. P. H. and Thorne, T. (2012). Considerate approaches to constructing summary statistics for ABC model selection. Statistics and Computing, 22(6), 1181–1197.

Grelaud, A., Robert, C. P., Marin, J.-M., Rodolphe, F. and Taly, J.-F. (2009). ABC likelihood-free methods for model choice in Gibbs random fields. Bayesian Analysis, 4(2), 317–335.

Marin, J.-M., Pillai, N. S., Robert, C. P. and Rousseau, J. (2014). Relevant statistics for Bayesian model choice. Journal of the Royal Statistical Society: Series B, 76(5), 833–859.

Marin, J.-M., Pudlo, P., Robert, C. P. and Ryder, R. J. (2012). Approximate Bayesian computational methods. Statistics and Computing, 22(6), 1167–1180.

Robert, C. P., Cornuet, J.-M., Marin, J.-M. and Pillai, N. S. (2011). Lack of confidence in approximate Bayesian computation model choice. Proceedings of the National Academy of Sciences, 108(37), 15112–15117.

Rubio, F. J. and Johansen, A. M. (2013). A simple approach to maximum intractable likelihood estimation. Electronic Journal of Statistics, 7, 1632–1654.