In this paper, we consider the problem of estimating an unknown image from an observation related to it by a statistical model. We focus on linear problems of the form
where the observation operator is known and the noise is a realization of a random process with bounded energy (i.e., we assume that the Euclidean norm of the noise is bounded by a known constant). Assuming the exact noise model is unknown, we simply postulate a uniform likelihood supported on the ball centred at the observation with radius given by the noise bound; the likelihood is constant on this ball and zero outside it. (This uniform likelihood can also be used as an approximation in cases where the noise bound holds with high probability; see Section 4 for more details.) This likelihood is commonly used in computational imaging, for example in astronomical imaging [47, 63] and medical imaging. Section 5.2 explains how to generalise the methodology proposed in this paper to other noise models. This generalisation is straightforward; however, for clarity and conciseness of presentation, here we use the model (1.1).
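As a concrete sketch (in Python, with illustrative names `y`, `Phi_x`, and `epsilon` that are not notation from the paper), the uniform likelihood on the noise ball reduces to a single membership test:

```python
import numpy as np

def uniform_ball_likelihood(y, Phi_x, epsilon):
    """Unnormalised uniform likelihood on the l2-ball of radius epsilon
    centred at the observation y: returns 1 if ||y - Phi_x|| <= epsilon,
    and 0 otherwise."""
    return 1.0 if np.linalg.norm(y - Phi_x) <= epsilon else 0.0
```

In other words, the data only constrain the solution to lie in a ball around the observation; all images mapping into that ball are equally likely under this model.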
In imaging sciences, the problem of estimating the image from the observation is often ill-posed or ill-conditioned, resulting in significant uncertainty about the true value of the image (this arises, for example, in compressive sensing problems where the number of measurements is much smaller than the number of unknowns). Bayesian imaging methods address this difficulty by using prior knowledge about the image to regularise the estimation problem and reduce the uncertainty. Formally, they model the image
as a random vector with prior distribution
promoting expected properties (e.g., sparsity or smoothness), and combine observed and prior information by using Bayes’ theorem to produce the posterior distribution
which models our knowledge about the image after observing the data. (Notice that we use generic Bayesian notation: we use the same symbol for all density functions, and use conditioning to implicitly distinguish between random variables and their realizations.)
Bayesian methods have been successfully applied to a wide range of imaging problems, including, for example, image denoising, inpainting, deblurring, fusion, unmixing, tomographic reconstruction, compressive sensing, sparse regression, and segmentation. Solutions can then be computed by using advanced stochastic simulation and optimisation algorithms, as well as deterministic algorithms related to variational Bayes and message passing approximations [51, 53]. Moreover, log-concave formulations have also received a lot of attention lately because they lead to solutions that can be efficiently computed by using modern convex optimisation methods.
In addition to the chosen log-concave uniform likelihood, and in a manner akin to related works, here we assume that the prior distribution of the image is log-concave, and that the following assumption holds, where we consider the class of lower semi-continuous, proper, convex functions.
The posterior distribution is given by
where the potential involves the indicator function of the noise ball. (For a closed, non-empty, convex set, the indicator function at a point is defined to be 0 if the point belongs to the set, and +∞ otherwise.)
For example, in many imaging problems is of the form
where the regularization parameter weights an analysis term, typically a norm promoting regularity or sparsity in the domain induced by the analysis operator, and where the constraint set is a closed, non-empty, convex set encoding constraints on the solution space. Observe that (1.4) encompasses the sparsity-aware models developed during the last decade in the compressed sensing framework [34, 18]. In particular, the analysis operator may be related to a differential operator (e.g., the horizontal and vertical gradients defining the total variation (TV) image prior [19, 58]) or to a possibly redundant wavelet transform.
Once a model has been defined, imaging methods generally solve the image estimation problem by computing a point estimator of . In particular, most modern methods exploit the convexity properties of and use the MAP estimator
which can be computed efficiently using convex optimization techniques [14, 29, 42]. In particular, the so-called proximal optimization methods have received a lot of attention, for example forward-backward algorithms [3, 7, 20, 27, 31, 59] and primal-dual algorithms [1, 11, 16, 21, 32, 30, 37, 42, 61].
Summarising the posterior distribution with a single point has the key advantage of producing a solution that can be easily displayed and visually analysed. However, a main limitation of this approach is that it does not provide any information regarding the uncertainty in the delivered solution. As explained previously, quantifying this uncertainty is important in many applications related to quantitative imaging, scientific inquiry, and image-driven decision-making, where it is necessary to analyse images as high-dimensional physical measurements and not as pictures. This analysis is particularly important in imaging problems that are ill-posed or ill-conditioned because of their high intrinsic uncertainty. For illustration, Fig. 1(a) shows an estimate of the W28 supernova remnant, obtained from the under-sampled Fourier measurements of Fig. 1(b) by using a Bayesian model tailored to radio-astronomical imaging. Clearly, the estimation problem is challenging given the severe under-sampling. For this specific imaging setup, what is the uncertainty involved in the estimate? In particular, are we confident about the different structures observed in it? We expect the main structures to be reliably recovered, but is this also true for the structures of weak amplitude in the background? Perhaps they are reconstruction artefacts.
The objective of this paper is to contribute statistical imaging methodology to probe the data and investigate questions of this type. The proposed method, namely BUQO (Bayesian Uncertainty Quantification by Optimization), quantifies the uncertainty of the structures under scrutiny by performing a Bayesian hypothesis test. This test consists of two steps: first, it postulates that the structures are not present in the true image; second, it uses the data and prior knowledge to determine whether this null hypothesis is rejected with high probability. Computing such tests for imaging problems is often intractable due to the high dimensionality involved. In this work, we propose to leverage probability concentration phenomena and the underlying convex geometry to formulate the Bayesian hypothesis test as a convex problem. The resulting problem can then be solved efficiently by using scalable optimization algorithms. This allows scaling to high-resolution and high-sensitivity imaging problems that are computationally unaffordable for other Bayesian computation approaches. To illustrate the proposed BUQO methodology, we apply it to a range of challenging Fourier imaging problems arising in astronomy and medicine.
The remainder of the paper is organized as follows. Section 2 introduces the Bayesian uncertainty quantification framework that underpins our work. The proposed methodology is presented in Section 3. Section 4 illustrates the method on two challenging Fourier imaging problems related to radio astronomy and magnetic resonance imaging. Section 5 is a discussion of the proposed methodology. Conclusions and perspectives for future work are reported in Section 6.
2 Imaging and Bayesian uncertainty quantification
The Bayesian paradigm provides a powerful methodological framework to analyse uncertainty in imaging inverse problems. One main approach, adopted in [49, 17], is to compute confidence or credible regions that indicate where the unknown image takes values with high probability. This allows testing whether specific images belong to the set of likely solutions, and supports some preliminary analyses. However, its capacity for formal uncertainty quantification is very limited.
To properly assess the degree of confidence in specific image structures it is necessary to perform a Bayesian hypothesis test. Formally, we postulate two hypotheses:
These hypotheses split the image space into two regions: a set associated with the null hypothesis, containing all the images (i.e., solutions) without the structure of interest, and its complement, associated with the alternative. The goal of the hypothesis test is then to determine whether the observed data supports the null or the alternative hypothesis; that is, whether it supports the claim that the estimated structure is real or corresponds to a reconstruction artefact. This is formalized by using Bayesian decision theory, a statistical framework for decision-making under uncertainty. Precisely, from Bayesian decision theory, we reject the null hypothesis in favour of the alternative with a given significance level if
or equivalently, if the ratio of posterior probabilities
where we recall that rejecting the null hypothesis means that the structure considered is real (i.e., not an artefact).
Unfortunately, computing hypothesis tests for images requires calculating probabilities w.r.t. the posterior distribution, which are generally intractable because of the high dimensionality involved. These probabilities can be approximated with high accuracy by Monte Carlo integration (for example, by using state-of-the-art proximal Markov chain Monte Carlo (MCMC) algorithms [35, 48]). Nevertheless, the computational cost associated with these methods is often several orders of magnitude higher than that of computing the MAP estimator by convex optimization, as discussed later in Section 5.3 in the context of our simulations. Consequently, most of the imaging methods used in practice do not quantify uncertainty.
3 Proposed BUQO method
3.1 Uncertainty quantification approach
A main contribution of this paper is to exploit the log-concavity of the posterior to formulate the hypothesis test (2.1) as a convex program that can be solved straightforwardly by using modern convex optimization algorithms whenever the set associated with the null hypothesis is convex. The proposed method only assumes knowledge of the MAP estimator, and does not require computing probabilities. We first introduce the convex program associated with (2.1), then describe the proposed convex optimization algorithm used to solve it, and subsequently present our approach to specifying the set associated with the null hypothesis. In the remainder of the paper, we make the following assumption on this set.
The set associated with the null hypothesis is a convex subset of the image space.
The proposed method solves the hypothesis test by comparing the set associated with the null hypothesis with the region of the solution space where most of the posterior probability mass lies. Such regions are known as posterior credible sets in the Bayesian literature. Precisely, a set is a posterior credible region with a given confidence level if the posterior probability that the image belongs to it is at least that level. Computing credible regions exactly is difficult because it requires calculating probabilities w.r.t. the posterior, which is too computationally expensive when the dimension of the image is large. Here we take advantage of the conservative credible region recently proposed in [49], which is available for free in problems solved by MAP estimation. Precisely, for any confidence level, we use the region
where the threshold is given explicitly in terms of the objective value attained by the MAP estimator (1.5), the problem dimension, and the chosen confidence level.
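For intuition, here is a hedged numerical sketch of this construction, assuming the threshold form of [49, Theorem 3.1] (namely f(x_MAP) + sqrt(16 log(3/α)) sqrt(N) + N for an N-dimensional log-concave posterior with negative log-density f); the exact expression used in the paper may differ:

```python
import numpy as np

def conservative_threshold(f_map, N, alpha):
    """Threshold of the conservative credible region of Pereyra (2017),
    assuming the form gamma_hat = f(x_MAP) + sqrt(16*log(3/alpha))*sqrt(N) + N,
    where f is the negative log-posterior (up to an additive constant)
    and f_map = f(x_MAP)."""
    tau = np.sqrt(16.0 * np.log(3.0 / alpha))
    return f_map + tau * np.sqrt(N) + N

def in_credible_region(f_x, f_map, N, alpha):
    """Membership test: x belongs to the approximate region iff
    f(x) <= gamma_hat, a single function evaluation."""
    return f_x <= conservative_threshold(f_map, N, alpha)
```

The point of the construction is visible in the code: once the MAP objective value is known, checking whether any candidate image lies in the region costs one evaluation of the negative log-posterior, with no integration.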
This set is a conservative Bayesian confidence region; i.e., it contains at least the prescribed amount of posterior mass. Observe that, in addition to being computationally straightforward, it is also a convex set, because the posterior is log-concave and hence has convex superlevel sets. This property will play a central role in our algorithm to compute the hypothesis test. Also note that the highest-posterior-density region, with its threshold chosen to match the confidence level exactly, is the tightest credible region in the sense of minimum volume. It is also a convex set, as it corresponds to the sublevel set of a convex function. The approximate credible region defined in (3.1) results from an analytical approximation of this threshold that can be obtained by leveraging the concentration of measure phenomenon. It is the tightest approximation that can be obtained from the knowledge of the MAP estimate alone (which is computed by convex optimisation), and it follows from its definition that it is convex.
Proof. If , then we have , which implies that . In addition, according to [49, Theorem 3.1], for any , we have , hence concluding the proof.
The converse of Theorem 3.2 is not true; i.e., a non-empty intersection does not imply that the null hypothesis holds. It is possible for the intersection to be non-empty while the associated posterior probability is arbitrarily small. Hence, in that case we simply fail to reject the null hypothesis.
From Theorem 3.2, we can verify whether the intersection is empty by solving the following problem:
There are two possible outcomes: either the intersection is empty or it is not. If it is empty for a small significance level, we conclude that there is strong evidence for the structure considered. Moreover, in that case we also compute the distance between the credible region and the set associated with the null hypothesis,
We will later discuss using this distance to quantify the uncertainty in the intensity of the structure considered (precisely, to lower bound the structure’s intensity).
If we determine that the intersection is non-empty, this suggests that the evidence for the structure under scrutiny is weak; in particular, the structure is not present in every image that the posterior considers a likely solution to our inverse problem. Following on from this, to produce an example of such a solution we solve the feasibility problem
We view the resulting point as a counter-example solution in which the structure of interest does not exist.
Furthermore, we propose to rely on the von Neumann algorithm [60, 41, 23, 15] to solve problem (3.2)-(3.4). This POCS algorithm alternates Euclidean projections onto the two sets. Formally, the Euclidean projection of a point onto a closed convex set is
The main iterations of the von Neumann method are described in Algorithm 1.
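The alternating-projection scheme can be sketched as follows (a minimal illustration, assuming the two projection operators are available as functions; `proj_C` and `proj_S` are placeholder names, not notation from the paper):

```python
import numpy as np

def pocs(x0, proj_C, proj_S, max_iter=1000, tol=1e-8):
    """von Neumann alternating projections (POCS) between two closed
    convex sets, given their Euclidean projection operators proj_C and
    proj_S.  Returns the final pair (x, s) and the distance ||x - s||:
    if the sets intersect, the distance tends to 0; otherwise it
    stabilises at the distance between the two sets."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = proj_S(x)          # project onto the first set
        x_new = proj_C(s)      # project back onto the second set
        if np.linalg.norm(x_new - x) <= tol:
            x = x_new
            break
        x = x_new
    s = proj_S(x)
    return x, s, np.linalg.norm(x - s)
```

The returned distance is precisely the quantity used later to decide between the two outcomes: it vanishes when the intersection is non-empty and is strictly positive otherwise.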
The following convergence result from [4, Thm. 4.8] allows us to determine whether the intersection of the two sets is empty.
Theorem 3.4 ([4, Thm. 4.8])
3.2 Choice of the set
We are now in a position to present our approach to constructing the set associated with the null hypothesis. This construction should be intuitive, easy to interpret, and sufficiently flexible to accommodate a broad range of scenarios. It should also guarantee that the set is convex (Assumption 3.1), and hence that the non-feasibility condition is easy to evaluate.
We define as the intersection of convex sets related to different properties that we wish to encode in the test, i.e.,
It is important to emphasize that the projection onto the set defined above may not have a closed-form expression. In this case it is necessary to adopt a sub-iterative approach, for example by using a best-approximation method (e.g., Dykstra's algorithm; see [4, 6] for details). Similarly, when the constituent sets are more sophisticated, primal-dual methods can be used. We illustrate the definition of the set with the following two examples, which will also be relevant for the experiments reported in Section 4. The first example is related to spatially localized image structures, such as a tumour in a medical image. The second example is related to background removal; this is useful, for instance, to assess low-intensity sources appearing in the background of astronomical images. Before giving the particular definitions associated with localized structures and the background, we need to introduce an additional image corresponding to the structure of interest as it appears in the MAP estimate (formal definitions are given for the two particular types of structures defined below). In addition, we introduce the operator selecting the structure of interest: for a given image, it extracts either the region of the spatially localized structure (in Definition 3.5) or the background (in Definition 3.6). In both cases, we also consider the complementary operator.
Definition 3.5 (Spatially localized structures)
To assess the confidence in a structure localized in a region of the image, we construct the test set by using an inpainting technique that fills the structure's pixels with the information in the other image pixels. To ensure convexity, we define the inpainting as a positive linear operator, and allow deviations from this linear inpainting up to a per-pixel tolerance. This inpainting could potentially amplify the energy in the region and lead to artificial structures. To prevent this, we enforce an upper bound defined by a reference background level and a parameter controlling the energy in the region. Formally, we use (3.6) with the constituent sets given by
In this case, we define such that and . In addition, and are chosen such that belongs to .
Definition 3.6 (Background removal)
To assess the confidence in low-intensity structures appearing in the background (e.g., determine if they exist or if they are artefacts due to the reconstruction process), we use (3.6) with given by
where the tolerance parameters bound the deviation from the reference background level. In this case, the structure image and the selection operator are defined accordingly.
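One plausible realisation of such a set imposes elementwise bounds on the background pixels; its Euclidean projection is then simple clipping. A sketch, assuming the background is selected by a boolean mask (names are illustrative, not from the paper):

```python
import numpy as np

def proj_background_bounds(x, mask, lower, upper):
    """Euclidean projection onto the set of images whose background
    pixels (selected by the boolean mask) lie in [lower, upper];
    the remaining pixels are unconstrained.  Reduces to elementwise
    clipping of the masked entries."""
    z = x.copy()
    z[mask] = np.clip(z[mask], lower, upper)
    return z
```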
To conclude, we now discuss our approach for using the distance between the credible region and the test set to bound the intensity of the structure of interest. Recall that this distance is a by-product of Algorithm 1. To relate this quantity to the structure's intensity, we define the normalised intensity of the structure as the ratio between this distance and the intensity of the structure present in the MAP estimate:
Notice that a zero normalised intensity is equivalent to a non-empty intersection. Consequently, when the normalised intensity vanishes, we conclude that the intersection is non-empty and the null hypothesis is not rejected. When it is positive, we conclude that the null hypothesis is rejected, and its value corresponds to the percentage of the structure's energy that is confirmed relative to the MAP estimate.
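A hedged sketch of this normalised-intensity computation, assuming the distance is taken between the two output images of the alternating-projection algorithm and that `structure_map` denotes the structure as selected from the MAP estimate (names are illustrative):

```python
import numpy as np

def normalised_intensity(x_C, x_S, structure_map):
    """Normalised intensity: the ratio between the distance
    ||x_C - x_S|| produced by the alternating-projection algorithm
    and the l2 intensity of the structure as it appears in the MAP
    estimate.  Zero indicates a non-empty intersection (no rejection);
    a positive value quantifies the confirmed fraction of the
    structure's energy."""
    d = np.linalg.norm(x_C - x_S)
    return d / np.linalg.norm(structure_map)
```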
3.3 Illustration example
In this section, we provide a simulation example to illustrate the application of the proposed approach to uncertainty quantification. We consider the hypothesis test described in Section 2, with a fixed significance level. We focus on the radio-astronomical imaging example described in Fig. 1, and propose to quantify the uncertainty of the spatially localized structure appearing on the left of the image. More precisely, the compact source of interest is highlighted in red on the MAP estimate, in the top-left image of Fig. 3. Mathematical details for the definition of the test set are given in Section 4.2.1.
The two resulting images generated by Algorithm 1 are provided in the top-centre and top-right panels of Fig. 3, respectively. On the one hand, it can be visually observed that the two images are very similar, and the structure is visible in neither of them. This similarity is highlighted in the bottom-centre and bottom-right panels of Fig. 3, which show the two images zoomed in on the area of interest. On the other hand, this visual observation is confirmed by the value of the normalised intensity. Although for this example the structure's confirmed intensity percentage is not exactly zero, we attribute this to the numerical approximations involved (see Section 4 for details). Consequently, we conclude that the intersection is non-empty, the null hypothesis is not rejected, and the evidence for the structure is weak.
The uncertainty quantification conclusions drawn above depend on the simulation parameters. It is reasonable to expect the uncertainty to decrease as the amount or quality of the data improves. For the sake of illustration, we now investigate the Bayesian uncertainty of the same compact source in a second setting; other cases will be provided in Section 4. Results for this second case are provided in Fig. 4. The images generated by Algorithm 1 are shown in the top-centre and top-right panels of Fig. 4, respectively. It can be visually observed that the two images differ: the structure is visible in one but not in the other. This difference is highlighted in the bottom-centre and bottom-right panels of Fig. 4, which show the two images zoomed in on the area of interest. The visual observation is confirmed by the value of the normalised intensity. Consequently, we can conclude that the intersection is empty and, according to Theorem 3.2, the null hypothesis is rejected at the chosen significance level (recall that rejecting it is equivalent to stating that the structure considered is real, not an artefact).
3.4 Implementation details
We now discuss implementation strategies for the proposed methodology. In particular, Algorithm 1 requires computing the projections onto and , which may need to be sub-iterative depending on the structure of these sets. While there are several methods to compute the projection onto convex sets, here we choose to use primal-dual approaches (see e.g. [7, 13, 28, 30, 32, 42, 61]).
3.4.1 Projection onto
We focus on the case where the regularization is the hybrid form given in (1.4), with its components chosen such that the projections onto the associated level sets and onto the constraint set have closed-form expressions. Accordingly, the MAP estimator is given by
In this case, the approximated confidence region is given by
with, for every ,
The minimization problem described in (3.12)-(3.13) involves sophisticated constraints with linear operators. These constraints can be handled efficiently using primal-dual methods such as those developed in [1, 28, 30, 32, 61]; for an overview of primal-dual approaches, we refer the reader to the literature. In particular, we propose to solve problem (3.12)-(3.13) using the primal-dual forward-backward algorithm developed in [32, 61], given in Algorithm 2 below.
Since the constraint set is convex, the objective of the projection problem is strictly convex. Therefore, according to [32, 61], the sequence generated by Algorithm 2 is ensured to converge to the unique minimizer (i.e., the point of the credible region closest to the point being projected).
As particular cases for the regularization function, we can mention several standard norms and the negative logarithm function. The projections onto the lower level sets of these functions can be found in the online proximity-operator repository.
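As an illustration, two such projections admit simple implementations: the Euclidean projection onto an l2-ball, and the sorting-based projection onto an l1-ball (both standard constructions, not specific to this paper):

```python
import numpy as np

def proj_l2_ball(x, centre, radius):
    """Euclidean projection onto {z : ||z - centre||_2 <= radius}:
    rescale the offset when the point lies outside the ball."""
    d = x - centre
    n = np.linalg.norm(d)
    return x if n <= radius else centre + radius * d / n

def proj_l1_ball(x, radius):
    """Euclidean projection onto {z : ||z||_1 <= radius}, via the
    standard sorting-based simplex-projection method (soft-thresholding
    at a data-dependent level theta)."""
    if np.abs(x).sum() <= radius:
        return x.copy()
    u = np.sort(np.abs(x))[::-1]
    cssv = np.cumsum(u)
    k = np.nonzero(u * np.arange(1, len(u) + 1) > cssv - radius)[0][-1]
    theta = (cssv[k] - radius) / (k + 1.0)
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)
```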
3.4.2 Projection onto : Definition 3.5
where , we have
This problem does not have a closed-form solution, and hence needs to be solved via sub-iterations. Again, here we use a primal-dual forward-backward algorithm [32, 61]. The resulting method is described in Algorithm 3.
3.4.3 Projection onto : Definition 3.6
3.5 Scalable and approximated alternating projection methods
The use of the POCS method given in Algorithm 1 to solve problem (3.4) is important to illustrate the proposed uncertainty quantification approach. However, it is worth mentioning that the convergence of this algorithm can be slow in practice and the convergence results (see Theorem 3.4) hold only if the projections are computed exactly.
There are multiple (possibly accelerated) methods in the literature to solve convex feasibility problems such as (3.4) (see [5, 33, 36] for details). However, our method not only requires solving (3.4), but also determining whether this problem is feasible, i.e., whether the intersection of the two sets is empty. Due to this particular subtlety, accelerated POCS methods cannot be used in our approach, since they all assume that the problem of interest is feasible.
Because holds if and only if , we could also formulate (3.2) as follows:
This problem is strictly convex on and can be solved using recent convex optimization techniques, e.g. the forward-backward (FB) algorithm [59, 31, 3] or its accelerated versions (e.g. [7, 26, 46]). Applied to problem (3.19), the classical FB method can be seen as an alternating projection approach, and takes the form of Algorithm 4.
The sequence generated by Algorithm 4 converges to the unique solution to problem (3.19). The convergence of this algorithm is also guaranteed when the projections are computed approximately (with additive or relative errors). Notice that the POCS method given in Algorithm 1 is recovered as a limit case of Algorithm 4, which provides some notion of robustness of Algorithm 1 to approximation errors.
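A minimal sketch of this relaxed alternating-projection interpretation: the gradient of half the squared distance to a convex set is the residual after projection (and is 1-Lipschitz), so the FB step with a generic step size `gamma` in (0, 2) mixes the iterate with its projection before projecting onto the other set; `gamma = 1` recovers plain POCS (names are illustrative):

```python
import numpy as np

def fb_alternating_projection(x0, proj_C, proj_S, gamma=1.5,
                              max_iter=1000, tol=1e-8):
    """Forward-backward iteration for min_{x in C} 0.5*dist_S(x)^2.
    Since the gradient of 0.5*dist_S(.)^2 is x - proj_S(x), the FB
    step with gamma in (0, 2) reads
        x_{k+1} = proj_C((1 - gamma)*x_k + gamma*proj_S(x_k)),
    which reduces to plain POCS when gamma = 1."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = proj_C((1.0 - gamma) * x + gamma * proj_S(x))
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x
```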
As for the POCS method given in Algorithm 1, the projections onto the sets and , appearing in steps 3 and 4 respectively, may require sub-iterations (see Section 3.4 for implementation details). To avoid these sub-iterations, it is possible to use more advanced techniques such as the primal-dual algorithm used in Section 3.4 (see for example [30, 32, 42, 52, 61]).
3.6 BUQO in a nutshell
In this section, we summarize the principle of the proposed method. BUQO for computational imaging consists of four main steps, described below:
Identify the structure of interest in the MAP estimate.
Define the hypothesis test, by postulating the null hypothesis , i.e. the structure of interest is absent in the true image (see Section 2 for the details).
Determine if using Algorithm 1. In this algorithm,
Step 3 corresponds to the projection onto . The computation of this projection is detailed in Section 3.4.1.
Deduce if is rejected using Theorem 3.2.
If , then is rejected with significance , and the structure of interest is present in the true image with probability .
If , then cannot be rejected, and the presence of the structure of interest in the true image is uncertain.
4 Simulation results
In this section we apply the proposed uncertainty quantification approach to Fourier imaging applications in radio astronomy (Section 4.2) and magnetic resonance imaging in medicine (Section 4.3). We refer the reader to Section 3.3 for an illustrative example, where a step-by-step explanation of the practical application of the proposed approach is given.
Before giving the uncertainty quantification results obtained using the proposed approach, we describe in Section 4.1 the common simulation settings.
4.1 Simulation settings
corresponds to the Daubechies wavelet Db8. As for the additive noise, it is generated as i.i.d. Gaussian noise. We recall that our approach assumes no explicit knowledge of the noise distribution, other than the fact that it has bounded energy with bound