I Introduction
Digital images are subject to a wide variety of degradations, which in most cases can be modeled as

(1) v = A u + n,

where v is the observation, A is the degradation operator, u is the underlying ground-truth image and n is additive noise. Different settings of the degradation matrix A model different problems such as zooming, deblurring or missing pixels. Different versions of the noise term n include the classical additive Gaussian noise with constant variance, or more complicated and realistic models such as signal-dependent noise.
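To fix notation, Model (1) with a diagonal masking operator can be sketched in a few lines of numpy. The image size, the 50% masking rate and the noise variance below are arbitrary choices for illustration, not values used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance of Model (1): v = A u + n, on a flattened 8x8 image.
n_pix = 64
u = rng.uniform(0.0, 255.0, n_pix)            # ground-truth image u (flattened)

# Diagonal degradation operator A: a random mask keeping ~50% of the pixels,
# as in the missing-pixels (interpolation) problem.
mask = rng.random(n_pix) < 0.5
A = np.diag(mask.astype(float))

# Additive Gaussian noise n with constant variance 10.
noise = rng.normal(0.0, np.sqrt(10.0), n_pix)

v = A @ u + noise                              # observation v
```

At masked positions the observation carries no signal, only noise, which is what makes the problem ill-posed there.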
Due to the inherent ill-posedness of such inverse problems, standard approaches impose some prior on the image, in either variational or Bayesian frameworks. Popular image models have been proposed through the total variation [1], wavelet decompositions [2] or the sparsity of image patches [3]. Buades et al. [4] introduced the use of patches and the self-similarity hypothesis to the denoising problem, leading to a new era of patch-based image restoration techniques. A major step forward in fully exploiting the potential of patches was achieved by several state-of-the-art restoration methods with the introduction of patch prior models, in a Bayesian framework. Some methods are devoted to the denoising problem [5, 6, 7, 8], while others propose a more general framework for the solution of image inverse problems [9, 10], including inpainting, deblurring and zooming. The work by Lebrun et al. [11, 7] presents a thorough analysis of several recent restoration methods, revealing their common roots and their relationship with the Bayesian approach.
Among the state-of-the-art restoration methods, two noticeable approaches are the patch-based Bayesian approach by Yu et al. [10], namely the piecewise linear estimators (PLE), and the non-local Bayes (NLB) algorithm by Lebrun et al. [7]. PLE is a general framework for the solution of image inverse problems under Model (1), while NLB is a denoising method (A is the identity in Model (1)). Both methods use a Gaussian patch prior learnt from image patches through iterative procedures. In the case of PLE, patches are modeled according to a Gaussian Mixture Model (GMM) with a relatively small number of classes (19 in all their experiments), whose parameters are learnt from all image patches (actually, the authors report the use of image subregions in their experiments, so we may consider PLE as a semi-local approach). In the case of NLB, each patch is associated with a single Gaussian model, whose parameters are computed from similar patches chosen from a local neighbourhood, i.e., a search window centered at the patch. We refer hereafter to this kind of per-patch modelling as local.
Zoran and Weiss [9] (EPLL) follow a similar approach, but instead of iteratively updating the GMM from image patches, they use a larger number of classes that are fixed and learnt from a large database of natural image patches. Wang and Morel [8] claim that, in the case of denoising, it is better to have fewer models that are updated with the image patches (as in PLE) than having a large number of fixed models (as in EPLL).
All of the previous restoration approaches share a common Bayesian framework based on Gaussian patch priors. Relying on local priors [7, 8] has proven more accurate for the task of image denoising than relying on a mixture of a limited number of Gaussian models [9, 10]. In particular, NLB outperforms PLE for this task [12], mostly due to its local model estimation. On the other hand, PLE yields state-of-the-art results in other applications such as interpolation of missing pixels (especially with high masking rates), deblurring and zooming.
As a consequence, we are interested in taking advantage of local patch modelling for inverse problems that are more general than denoising. The main difficulty lies in the estimation of the models, especially when the image degradations involve a high rate of missing pixels, in which case the estimation is seriously ill-posed.
In this work we propose to model image patches according to a Gaussian prior, whose parameters, the mean μ and the covariance matrix C, are estimated locally from similar patches. In order to tackle this problem, we include prior knowledge on the model parameters by making use of a hyperprior, i.e. a probability distribution on the parameters of the prior. In Bayesian statistics, μ and C are known as hyperparameters, while the prior on them is called a hyperprior. Such a framework is often called hierarchical Bayesian modelling [13]. Its application to inverse problems in imaging is not new. In particular, in the field of image restoration, this methodology was proposed by Molina et al. [14, 15], and was more recently applied to image unmixing problems [16] and to image deconvolution and the estimation of the point spread function of a camera [17]. However, to our knowledge, this is the first time that such a hierarchical Bayesian methodology is used to reduce ill-posedness in patch-based image restoration. In this context, the use of a hyperprior compensates for the patches' missing information. There are two main contributions in this work. First, as described above, we propose a robust framework enabling the use of Gaussian local priors on image patches for solving a useful family of restoration problems, by drawing on a hierarchical Bayesian approach. Second, the proposed framework is able to deal with signal-dependent noise, therefore making it adapted to realistic digital photography applications.
Experiments on both synthetic and real data show that the approach is well suited to various problems involving a diagonal degradation operator. First, we show state-of-the-art results in image restoration problems such as denoising, zooming and interpolation of missing pixels. Then we consider the generation of high dynamic range (HDR) images from a single snapshot using spatially varying pixel exposures [18], and demonstrate that our approach significantly outperforms existing methods for this inverse problem. It is worth mentioning that modified sensors enabling such approaches have recently been made available by Sony, but are not yet fully exploited by available smartphones and digital cameras.
The article is organized as follows. Section II introduces the proposed approach while Section III presents the main implementation aspects. Supportive experiments are presented in Section IV. Section V is devoted to the application of the proposed framework to the HDR imaging problem. Last, conclusions are summarized in Section VI.
II Hyperprior Bayesian Estimator
The proposed restoration method, called Hyperprior Bayesian Estimator (HBE), assumes a Gaussian prior for image patches, with parameters μ and C. A joint maximum a posteriori formulation is used to estimate both the image patches and the parameters μ and C, thanks to a Bayesian hyperprior model on these parameters that stabilizes the local estimation of the Gaussian statistics. As a consequence, we can exploit the accuracy of local model estimation for general restoration problems, in particular with missing values (e.g. for interpolation or zooming). Figure 1 illustrates the proposed approach, which is described in detail in the following.
II-A Patch degradation model
The observed image is decomposed into overlapping patches of n pixels each. Each patch z ∈ ℝⁿ is considered to be a realization of the random variable Z given by

(2) Z = A Z̃ + N,

where A is a degradation operator, Z̃ is the original patch we seek to estimate and N is an additive noise term, modeled by a Gaussian distribution N ∼ N(0, Σ). Therefore, the distribution of Z given Z̃ = z̃ can be written as

(3) p(z | z̃) ∼ N(A z̃, Σ).
In this noise model, the matrix Σ is only assumed to be diagonal (the noise is uncorrelated). It can represent a constant variance, spatially variable variances, or even variances dependent on the pixel value (to approximate Poisson noise).
This degradation model is deliberately generic. We will see in Section IV that keeping a broad noise model is essential to properly tackle the problem of HDR imaging from a single image. The model also includes the special case of multiplicative noise.
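The three kinds of diagonal noise covariance mentioned above can be written down explicitly; the variance values and patch size below are placeholders for illustration:

```python
import numpy as np

n = 16                                    # patch dimension
patch = np.linspace(10.0, 200.0, n)       # hypothetical clean pixel values

# Constant variance (classical homoscedastic Gaussian noise).
Sigma_const = 10.0 * np.eye(n)

# Spatially varying variances.
Sigma_spatial = np.diag(np.linspace(5.0, 20.0, n))

# Signal-dependent variances, e.g. a Gaussian approximation of Poisson
# noise where the variance equals the (clean) pixel value.
Sigma_poisson = np.diag(patch)
```

All three matrices are diagonal, so they fit the uncorrelated-noise assumption of Model (2).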
II-B Joint Maximum A Posteriori
We assume a Gaussian prior for each patch, with unknown mean μ and covariance matrix C. To simplify calculations, we work with the precision matrix Λ = C⁻¹. As is usual when considering hyperpriors, we assume that the parameters μ and Λ follow a conjugate distribution. In our case, that boils down to assuming a Normal-Wishart prior for the couple (μ, Λ) (the Normal-Wishart distribution is the conjugate prior of a multivariate normal distribution with unknown mean and covariance matrix; W denotes the Wishart distribution [19]):

(4) p(μ, Λ) = N(μ; μ₀, (κΛ)⁻¹) W(Λ; (νΣ₀)⁻¹, ν),

with parameters μ₀ and Σ₀, scale parameter κ > 0 and ν > n − 1 degrees of freedom.
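Under the parameterization used here, where the Wishart scale is (νΣ₀)⁻¹ so that E[Λ] = Σ₀⁻¹, the hyperprior (4) can be sampled with numpy alone, at least for integer ν ≥ n (a sketch, with arbitrary toy hyperparameters):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_normal_wishart(mu0, kappa, Sigma0, nu, rng):
    """Draw (mu, Lambda) from the Normal-Wishart prior of Eq. (4).

    For integer nu >= dim, a Wishart(V, nu) sample with V = (nu*Sigma0)^-1
    can be built as a sum of nu outer products of N(0, V) vectors; then
    mu | Lambda ~ N(mu0, (kappa*Lambda)^-1).
    """
    dim = mu0.size
    V = np.linalg.inv(nu * Sigma0)                 # Wishart scale matrix
    L = np.linalg.cholesky(V)
    G = L @ rng.standard_normal((dim, int(nu)))    # nu draws from N(0, V)
    Lam = G @ G.T                                  # precision sample
    mu = rng.multivariate_normal(mu0, np.linalg.inv(kappa * Lam))
    return mu, Lam

mu, Lam = sample_normal_wishart(np.zeros(4), kappa=10.0,
                                Sigma0=np.eye(4), nu=8, rng=rng)
```

Averaging many such precision samples recovers Σ₀⁻¹, which is the sense in which Σ₀ encodes the prior guess of the patch covariance.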
Now, assume that we observe a group of M similar patches z₁, …, z_M and that we want to recover the restored patches z̃₁, …, z̃_M. If these unknown patches are independent (we rely on the classical independence assumption made in the patch-based literature, even though it does not hold when patches overlap) and follow the same Gaussian model, we can compute the joint maximum a posteriori

(5) argmax p(z̃₁, …, z̃_M, μ, Λ | z₁, …, z_M) = argmax ∏ᵢ p(zᵢ | z̃ᵢ) ∏ᵢ p(z̃ᵢ | μ, Λ) p(μ, Λ).

In this product, the first term is given by the noise model (3), the second one is the Gaussian prior on the set of patches and the third one is the hyperprior (4). In the last equality we omit the explicit dependence on κ and ν in p(μ, Λ), since these parameters are completely determined by the set of similar patches.
II-C Optimality conditions
Computing the joint maximum a posteriori amounts to minimizing the negative log-posterior f(z̃₁, …, z̃_M, μ, Λ) over the set (ℝⁿ)^M × ℝⁿ × S_n⁺⁺(ℝ), with S_n⁺⁺(ℝ) the set of real symmetric positive definite matrices of size n × n.
The function f is biconvex in the pairs of variables ({z̃ᵢ}, μ) and Λ. To minimize this energy for a given set of hyperparameters (μ₀, Σ₀, κ, ν), we use an alternating convex minimization scheme. At each iteration, f is first minimized with respect to ({z̃ᵢ}, μ) with Λ fixed, then vice-versa.
Differentiating f with respect to each variable, we get explicit optimality equations for the minimization scheme. The proofs of the following propositions are straightforward and available in the supplementary material.
Proposition 1.
Assume that Λ is fixed and that the noise covariance Σ does not depend on the z̃ᵢ. The function f is convex on (ℝⁿ)^M × ℝⁿ and attains its minimum at ({ẑ̃ᵢ}, μ̂), given by

(6) μ̂ = (κ μ₀ + Σᵢ ẑ̃ᵢ) / (κ + M),

(7) ẑ̃ᵢ = μ̂ + C Aᵀ (A C Aᵀ + Σ)⁻¹ (zᵢ − A μ̂),

with C = Λ⁻¹.
Proposition 2.
Assume that the variables {z̃ᵢ} and μ are fixed. The function f is convex on S_n⁺⁺(ℝ) and attains its minimum at Λ̂ such that

(8) Λ̂⁻¹ = (ν Σ₀ + κ (μ − μ₀)(μ − μ₀)ᵀ + Σᵢ (z̃ᵢ − μ)(z̃ᵢ − μ)ᵀ) / (ν + M − n).
The expression of ẑ̃ᵢ in (7) is obtained under the hypothesis that the noise covariance matrix Σ does not depend on z̃ᵢ. Under the somewhat weaker hypothesis that the noise and the signal are uncorrelated, this estimator is also the affine estimator that minimizes the Bayes risk (c.f. supplementary material). The uncorrelatedness of the noise and the signal is a reasonable hypothesis in practice. It covers various noise models, including signal-dependent noise whose variance is a function of the underlying intensity, which approximates CMOS and CCD raw data noise [20].
From (6), we find that the MAP estimator of μ is a weighted average of two terms: the mean estimated from the M similar restored patches and the prior mean μ₀. The parameter κ controls the confidence level we have in the prior μ₀. In the same way, we observe from (8) that the MAP estimator of Λ combines the prior covariance Σ₀, the term imposed by the deviation of μ from μ₀, and the empirical covariance of the patches z̃ᵢ.
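A direct numpy transcription of the estimators (6) and (7) makes this interplay concrete. The sizes, the choice C = Id and the single fixed-point pass below are illustrative only, since μ̂ and the ẑ̃ᵢ are coupled:

```python
import numpy as np

rng = np.random.default_rng(2)
n, M = 8, 20                     # patch dimension, number of similar patches

C = np.eye(n)                    # fixed patch covariance C = Lambda^-1
Sigma = 0.5 * np.eye(n)          # noise covariance
A = np.diag((rng.random(n) < 0.7).astype(float))   # diagonal mask operator
mu0, kappa = np.zeros(n), 5.0

Z = rng.standard_normal((M, n))  # toy observed patches z_i, one per row

def restore_patches(Z, A, C, Sigma, mu):
    """Eq. (7): Wiener-type estimate of each patch under the N(mu, C) prior."""
    K = C @ A.T @ np.linalg.inv(A @ C @ A.T + Sigma)
    return mu + (Z - A @ mu) @ K.T

def map_mean(Z_tilde, mu0, kappa):
    """Eq. (6): MAP mean, a weighted average of the restored patches and mu0."""
    return (kappa * mu0 + Z_tilde.sum(axis=0)) / (kappa + Z_tilde.shape[0])

mu = map_mean(restore_patches(Z, A, C, Sigma, mu0), mu0, kappa)
Z_tilde = restore_patches(Z, A, C, Sigma, mu)
```

Two sanity checks follow from the formulas: as κ grows, the MAP mean collapses onto μ₀, and with A = Id and vanishing noise the patches are returned unchanged.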
II-D Alternating convex minimization of f
The previous propositions imply that we can derive an explicit alternating convex minimization scheme for f, presented in Algorithm 1. Starting with a given value of Λ, at each step, μ and the z̃ᵢ are computed according to Equations (6) and (7), then Λ is updated according to (8).
We show in Appendix A the following convergence result for the previous algorithm. The proof adapts the arguments in [21] to our particular case.
Proposition 3.
The sequence of function values f(z̃₁ᵏ, …, z̃_Mᵏ, μᵏ, Λᵏ) converges monotonically as k → ∞. The sequence generated by the alternating minimization scheme has at least one accumulation point. The set of its accumulation points forms a connected and compact set of partial optima and stationary points of f, and they all have the same function value.
In practice, we observe in our experiments that the algorithm always converges after a few iterations.
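The whole of Algorithm 1 fits in a short loop. The sketch below treats the denoising case (A = Id) with toy hyperparameters; the precision update follows the Normal-Wishart MAP expression of Proposition 2:

```python
import numpy as np

rng = np.random.default_rng(3)
n, M = 6, 15
A = np.eye(n)                              # denoising case, for simplicity
Sigma = 0.3 * np.eye(n)                    # noise covariance
mu0, Sigma0 = np.zeros(n), np.eye(n)       # hyperparameters (toy values)
kappa, nu = 5.0, float(M + n)

clean = rng.multivariate_normal(mu0, Sigma0, M)
Z = clean + rng.multivariate_normal(np.zeros(n), Sigma, M)

mu, C = mu0.copy(), Sigma0.copy()          # initialization
for _ in range(10):
    # (i) patches and mean with Lambda fixed, Eqs. (6)-(7)
    K = C @ A.T @ np.linalg.inv(A @ C @ A.T + Sigma)
    Zt = mu + (Z - A @ mu) @ K.T
    mu = (kappa * mu0 + Zt.sum(axis=0)) / (kappa + M)
    # (ii) precision with patches fixed, Eq. (8)
    D = Zt - mu
    S = nu * Sigma0 + kappa * np.outer(mu - mu0, mu - mu0) + D.T @ D
    C = S / (nu + M - n)                   # C = Lambda^-1
```

The update of C stays symmetric positive definite by construction, which keeps every iterate inside the admissible set of Proposition 2.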
II-E Full restoration algorithm
The full restoration algorithm used in our experiments is summarized in Algorithm 2 and illustrated by Figure 1. It alternates between two stages: the minimization of f using Algorithm 1, and the estimation of the hyperparameters (μ₀, Σ₀, κ, ν). In order to estimate these parameters, we rely on an oracle image computed by aggregation of all the patches estimated in the first stage (details are provided in Section III-D).
III Implementation details
III-A Search for similar patches
The similar patches are all the patches within a search window centered at the current patch whose distance to the central patch is less than a given threshold. This threshold is given by a tolerance parameter times the distance to the nearest neighbour (the most similar patch). The search window size, the patch size and the tolerance parameter were kept fixed in all our experiments. The patch comparison is performed in an oracle image (i.e. the result of the previous iteration), so all pixels are known. However, it may be useful to assign different confidence levels to the pixels that were known from the start and to those that were originally missing and then restored. For all the experimental results presented in Section IV, the distance between patches P and Q in the oracle image is computed as

(9) d(P, Q) = Σⱼ wⱼ (P(j) − Q(j))² / Σⱼ wⱼ,

where j indexes the pixels in the patch, and the weight wⱼ takes a large value if pixel j was known from the start and a much smaller one otherwise (originally missing, then restored, pixel) [22]. With this formulation, known pixels are assigned a much higher priority than unknown ones. Variations on these weights could be explored.
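The weighted distance of Eq. (9) is a one-liner; the two weight values below are illustrative, not the ones used in the paper:

```python
import numpy as np

def weighted_patch_distance(p, q, known, w_known=1.0, w_restored=0.01):
    """Weighted distance of Eq. (9) between two oracle-image patches.

    Pixels known from the start get weight w_known; originally missing,
    then restored, pixels get the much smaller w_restored (both weight
    values here are hypothetical placeholders).
    """
    w = np.where(known, w_known, w_restored)
    return float(np.sum(w * (p - q) ** 2) / np.sum(w))

p = np.array([1.0, 2.0, 3.0, 4.0])
q = np.array([1.0, 2.0, 9.0, 4.0])
known = np.array([True, True, False, True])
d = weighted_patch_distance(p, q, known)
```

Here the only differing pixel is a restored one, so the distance stays small; if all pixels were known, the same difference would dominate the distance.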
III-B Optional speed-up by collaborative filtering
The proposed method computes one Gaussian model per image patch according to Equations (6) and (8). In order to reduce the computational cost, we can rely on the collaborative filtering idea previously introduced for patch-based denoising techniques [7, 23]. Based on the hypothesis that similar patches share the same model, we assign the same model to all patches in the set of similar patches (as defined in Section III-A).
The number of similar patches jointly restored depends on the image and on the tolerance parameter, but it is often much smaller than what would result from the patch clustering performed by methods that use global GMMs, such as PLE or EPLL. Performance degradation is observed in practice when using a very large tolerance parameter, showing that mixing more patches than needed is detrimental. The collaborative filtering strategy helps accelerate the algorithm up to a certain point, but a trade-off with performance needs to be considered.
III-C Parameter choices
The four parameters of the Normal-Wishart distribution, namely κ, ν, the prior mean μ₀ and the prior covariance matrix Σ₀, must be set in order to compute μ̂ and Λ̂.
Choice of κ and ν
The computation of μ̂ according to (6) combines the mean estimated from the M similar patches and the prior mean μ₀. The parameter κ is related to the degree of confidence we have in the prior μ₀. Hence, its value should reflect a trade-off between the confidence we have in the accuracy of the prior and the confidence we have in the information provided by the similar patches. The latter improves when both the number M of similar patches and the number of known pixels in the current patch increase. These intuitive insights suggest the following rule to set κ:

(10)

A similar reasoning leads to the same rule for ν,

(11)

where the addition of a constant ensures that the degrees-of-freedom condition required by the Normal-Wishart prior is verified.
This rule is used to obtain the experimental results presented in Section IV, and proved to be a consistently good choice despite its simplicity. However, setting these parameters in a more general setting is not a trivial task and should be the subject of further study. In particular, we could explore a more continuous dependence of κ on the number of similar patches, the number of known pixels, and possibly a third term estimating to what extent the similar patches cover the missing pixels in the current patch.
Setting of μ₀ and Σ₀
III-D Initialization
A good initialization is crucial, since we aim at solving a non-convex problem through an iterative procedure. To initialize the proposed algorithm, we follow the approach proposed by Yu et al. [10] (described in detail in Appendix A of the supplementary material). They propose to initialize the PLE algorithm by learning the GMM covariance matrices from synthetic images of edges with different orientations, as well as from the DCT basis to represent isotropic patterns. As they state, in dictionary learning, the most prominent atoms represent local edges, which are useful to represent and restore contours. Hence, this initialization helps to correctly restore corrupted patches even in quite extreme cases. The oracle of the first iteration of the proposed approach is the output of the first iteration of the PLE algorithm.
III-E Computational complexity
With the original per-patch strategy, the complexity of the algorithm is dominated by step 3 in Algorithm 1, which requires the inversion of an n × n matrix, i.e. O(n³) operations if a Cholesky factorization is used. The total complexity is therefore O(N n³), where N is the total number of patches to be restored. The collaborative filtering strategy reduces this value by a factor that depends on the number of groups of similar patches, which in turn depends on the image contents and on the distance tolerance parameter. The main difference with the complexity of the PLE algorithm is a factor given by the number of groups defined by the collaborative filtering approach and by the number of mixture components evaluated by PLE for each patch. As mentioned by Yu et al. [10], computational complexity can be further reduced in the case of binary masks by removing the zero rows and inverting a matrix of size (1 − h)n × (1 − h)n instead of n × n, where h is the masking ratio. Moreover, the proposed algorithm can be run in parallel in different image subregions, thus allowing for even further acceleration on multiple-core architectures. The complexity comparison to NLB needs to be made in the case where the degradation is additive noise with constant variance (a translation-invariant degradation), which is the task performed by NLB. In that case, the complexity of the proposed approach (without considering collaborative filtering or parallelization, both of which are also used in NLB) is of the same order as that of NLB, up to the number of iterations of Algorithm 1.
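The zero-row trick for binary masks can be checked numerically: with a diagonal Σ, the matrix A C Aᵀ + Σ decouples over kept and masked indices, so the full n × n inversion and the reduced one give the same Wiener gain (sizes and masking ratio below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 16
B = rng.standard_normal((n, n))
C = B @ B.T / n + 0.1 * np.eye(n)          # patch covariance (SPD)
Sigma = np.diag(rng.uniform(0.5, 1.5, n))  # diagonal noise covariance
keep = rng.random(n) < 0.4                 # binary mask, ~60% missing
A = np.diag(keep.astype(float))

# Full-size computation: invert an n x n matrix.
K_full = C @ A.T @ np.linalg.inv(A @ C @ A.T + Sigma)

# Reduced computation: drop the zero rows of A and invert only the
# block indexed by the kept pixels, of size (1-h)n x (1-h)n.
idx = np.flatnonzero(keep)
Ck = C[np.ix_(idx, idx)]
K_red = np.zeros((n, n))
K_red[:, idx] = C[:, idx] @ np.linalg.inv(Ck + Sigma[np.ix_(idx, idx)])
```

Since the inversion cost is cubic in the matrix size, the reduced computation saves roughly a factor (1 − h)⁻³ at high masking rates.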
IV Image Restoration Experiments
In this section we illustrate the ability of the proposed method to solve several image inverse problems. Both synthetic data (i.e., where we have added the degradation artificially) and real data (i.e., issued from a real acquisition process) are used. The considered problems are: interpolation, combined interpolation and denoising, denoising, and zooming. The reported values of peak signal-to-noise ratio (PSNR) are averaged over 10 realizations of each experiment (the variance is below 0.1 for interpolation, and below 0.05 both for combined interpolation and denoising and for denoising only). Similar results are obtained with the structural similarity index (SSIM), which is included in the supplementary material (Appendix B).
Table I. PSNR (dB) results.

Interpolation:

            50% missing                    70% missing                    90% missing
            HBE    PLE    EPLL   EPLE     HBE    PLE    EPLL   EPLE     HBE    PLE    EPLL   EPLE
barbara     39.11  36.93  32.99  35.43    34.69  32.50  27.96  28.77    24.86  23.62  23.30  23.26
boat        34.92  34.32  34.21  33.59    31.37  30.74  30.38  30.26    25.96  25.35  24.72  25.43
traffic     30.17  30.12  30.19  28.86    27.27  27.12  27.13  26.64    22.84  22.34  21.85  22.27

Interpolation & denoising (noise variance 10):

            70% missing                    90% missing
            HBE    PLE    EPLL   EPLE     HBE    PLE    EPLL   EPLE
barbara     33.34  31.99  27.63  27.75    24.57  23.53  23.27  23.20
boat        30.61  30.41  30.15  29.54    25.78  25.45  24.71  25.47
traffic     26.99  26.98  27.05  26.35    22.87  22.43  22.21  22.35

Denoising:

            variance 10            variance 30            variance 50            variance 80
            HBE    NLB    EPLL    HBE    NLB    EPLL    HBE    NLB    EPLL    HBE    NLB    EPLL
barbara     41.26  41.20  40.56   38.40  38.26  37.32   37.13  36.94  35.84   35.96  35.73  34.51
boat        40.05  39.99  39.47   36.71  36.76  36.34   35.41  35.46  35.13   34.30  34.33  34.12
traffic     40.73  40.74  40.55   37.03  36.99  36.86   35.32  35.26  35.20   33.78  33.70  33.72

Zooming (factor 2):

            HBE    PLE    EPLL   EPLE   Lanczos
barbara     38.17  37.11  31.34  36.51  28.01
boat        32.35  31.96  31.95  32.08  29.60
traffic     25.05  24.78  25.17  24.91  21.89
IV-A Synthetic degradation
Interpolation
Random masks with 50%, 70% and 90% missing pixels are applied to the tested ground-truth images. The interpolation performance of the proposed method is compared to that of PLE [10], EPLL [9] and EPLE [25], using the same patch size for all methods. PLE parameters are set as indicated in [10]. We used the EPLL code provided by the authors [24] with default parameters, and the EPLE code available in [25] with the parameters set as specified in the companion demo. The parameters for the proposed method are set according to the rules of Section III-C. The PSNR results are shown in Table I. Figure 2 shows some extracts of the obtained results, the PSNR values for the extracts and the corresponding difference images with respect to the ground truth. The proposed method gives sharper results than the other considered methods. This is especially noticeable in the reconstruction of the texture of the fabric of Barbara's trousers, shown in the first row of Figure 2, or in the strips that appear through the car's window, shown in the second row of the same figure.
Combined interpolation and denoising
For this experiment, the ground-truth images are corrupted with additive Gaussian noise with variance 10 and a random mask with 70% or 90% missing pixels. The parameters of all methods are set as in the previous interpolation-only experiment. Table I summarizes the PSNR values obtained by each method. Figure 3 shows some extracts of the obtained results, the PSNR values for the extracts and the corresponding difference images with respect to the ground truth. Once again, the results show that the proposed approach outperforms the others. Fine structures, such as the mast and the ropes of the ship, as well as textures, as in Barbara's headscarf, are much better preserved.
Denoising
For the denoising task, the proposed approach should perform very similarly to the state-of-the-art denoising algorithm NLB [7]. The following experiments are conducted in order to verify this.
The ground-truth images are corrupted with additive Gaussian noise of variance 10, 30, 50 and 80. The code provided by the authors [26] automatically sets the NLB parameters from the input noise variance and the patch size. For this experiment, there are no unknown pixels to interpolate (the mask A is the identity matrix).
The results of both methods are very similar if HBE is initialized with the output of the first step of NLB [7] (instead of using the initialization described in Section III-D) and if the parameters κ and ν are large enough. In this case, μ₀ and Σ₀ are prioritized in Equations (6) and (8), and both algorithms become almost identical. That is what we observe in practice for large κ and ν values, as demonstrated by the results summarized in Table I. The denoising performance of HBE is degraded for small κ and ν values. This is due to the fact that μ₀ and Σ₀, as well as their counterparts in NLB, are computed from an oracle image resulting from the first restoration step. This restoration includes not only the denoising of each patch, but also an aggregation step that greatly improves the final result. Therefore, the contribution of the first term of (6) to the computation of μ̂ degrades the result compared to using μ₀ only (i.e. using a large κ).
Zooming
In order to evaluate the zooming capacity of the proposed approach, the ground-truth images are downsampled by a factor of 2 (no anti-aliasing filter is used) and the zoomed results are compared to the ground truth. The results are compared with PLE, EPLL, EPLE and Lanczos interpolation. Table I summarizes the obtained PSNR values. Figure 4 shows extracts from the obtained results, the PSNR values for the extracts and the corresponding difference images with respect to the ground truth. Again, HBE yields a sharper reconstruction than the other methods.
IV-B Real data
For this experiment, we use raw images captured with a Canon 400D camera (set to ISO 400 and an exposure time of 1/160 seconds). The main noise sources for CMOS sensors are: the Poisson photon shot noise, which can be approximated by a Gaussian distribution with equal mean and variance; the thermally generated readout noise, which is modeled as additive Gaussian noise; and the spatially varying gain given by the photo response non-uniformity (PRNU) [20, 27]. We thus consider the following noise model for the non-saturated raw pixel value v_p at position p:

(13) v_p ∼ N(g a_p τ C_p + μ_R, g² a_p τ C_p + σ_R²),

where g is the camera gain, a_p models the PRNU factor, τ is the exposure time, C_p is the irradiance reaching pixel p, and μ_R and σ_R² are the readout noise mean and variance. The camera parameters have to be estimated by a calibration procedure [20]. The noise covariance matrix Σ is thus diagonal, with entries that depend on the pixel value v_p.
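Model (13) translates directly into a per-pixel mean and variance. The parameter values below are hypothetical placeholders, not calibrated values for any specific camera:

```python
import numpy as np

def raw_noise_params(irradiance, g=0.87, a=1.0, tau=1.0 / 160.0,
                     mu_R=0.0, sigma_R2=4.0):
    """Mean and variance of a non-saturated raw value under Model (13).

    Gain g, PRNU factor a, exposure time tau, and readout mean mu_R and
    variance sigma_R2 are all hypothetical placeholder values.
    """
    mean = g * a * tau * irradiance + mu_R
    var = g**2 * a * tau * irradiance + sigma_R2
    return mean, var

m_dark, v_dark = raw_noise_params(0.0)      # only readout noise remains
m_lit, v_lit = raw_noise_params(5000.0)     # shot noise dominates
```

This makes explicit why the noise covariance is diagonal but signal-dependent: the variance grows linearly with the irradiance reaching the pixel.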
In order to evaluate the interpolation capacity of the proposed approach, we consider the pixels of the green channel only (i.e. 50% of the pixels in the RGGB Bayer pattern) and interpolate the missing values. We compare the results to those obtained using an adaptation of PLE to images degraded by noise with variable variance (PLEV) [28]. The results of the EPLL and EPLE methods are not presented here, since these methods are not suited to this kind of noise. Figure 6 shows extracts of the obtained results (see Figure 5 for a JPEG version of the raw image showing the location of the extracts). As already observed in the synthetic data experiments, fine details and edges are better preserved. Compare, for example, the reconstruction of the balcony edges and the wall structure in the first row of Figure 6, as well as the structure of the roof and the railing in the second row of the same figure.
IV-C Discussion
In all the previous experiments, the results obtained with HBE outperform or are very close to those obtained by the other evaluated methods. Visually, details are better reconstructed and images are sharper.
We attribute this to the inability of a fixed set of patch classes (19 for PLE) to accurately represent patches that seldom appear in the image, such as edges or textures (as in Barbara's trousers). The fact that methods such as PLE are actually semi-local (classes are estimated on regions [10]) does not solve this issue. On the contrary, a local model estimation such as the one performed by HBE correctly handles those cases.
The performance difference is much more remarkable for the higher masking rates. In such cases, the robustness of the estimation is crucial. Indeed, the proposed method performs the estimation from similar patches in a local window. As the hypothesis of self-similarity is reinforced by considering local neighbourhoods, such a strategy restricts the possible models, thereby making the estimation more robust. Furthermore, the local model estimation, previously shown to be successful at describing patches [7], gives a better reconstruction even when a very large part of the patch is missing.
EPLL uses more mixture components in its GMM than PLE (200 components, learnt from patches of natural images [9]). It was observed in [8] that this strategy is less efficient than PLE for the denoising task. In this work, we also observe that the proposed approach outperforms EPLL, not only in denoising, but also in inpainting and zooming. However, here it is harder to tell whether the improvement is due to the local model estimation performed from similar patches or to the different restoration strategies followed by these methods.
V High dynamic range imaging from a single snapshot
In this section, we apply the proposed framework to generate HDR images from a single shot. HDR imaging aims at reproducing an extended dynamic range of luminosity compared to what can be captured using a standard digital camera, which is often not enough to produce an accurate representation of real scenes. In the case of a static scene and a static camera, the combination of multiple images with different exposure levels is a simple and efficient solution [29, 30, 27]. However, several problems arise when either the camera or the elements in the scene move [31, 32].
An alternative to the HDR from multiple frames is to use a single image with spatially varying pixel exposures (SVE) [18]. An optical mask with spatially varying transmittance is placed adjacent to a conventional image sensor, thus controlling the amount of light that reaches each pixel (see Figure 7) [18, 33, 34].
The greatest advantage of this acquisition method is that it avoids the need for image alignment and motion estimation. Another advantage is that saturated pixels are not organized in large regions. Indeed, some recent multi-image methods tackle motion problems by taking a reference image and then estimating motion or reconstructing the image relative to this reference [35, 31]. A problem encountered by these approaches is the need to inpaint very large saturated and under-exposed regions in the reference frame. The SVE acquisition strategy avoids this problem since, in general, all scene regions are sampled by at least one of the exposures.
Taking advantage of the ability of the proposed framework to simultaneously estimate missing pixels and denoise wellexposed ones, we propose a novel approach to generate HDR images from a single shot acquired with spatially varying pixel exposures. The proposed approach shows significant improvements over existing methods.
V-A Spatially varying exposure acquisition model
An optical mask with spatially varying transmittance is placed adjacent to a conventional image sensor to give different exposure levels to the pixels. This optical mask does not change the acquisition process of the sensor. Hence, the noise model (13) can be adapted to the SVE acquisition by including the per-pixel SVE gain o_p (some noise sources not modeled here, such as blooming, might have an impact on the SVE acquisition strategy and should be considered in a more accurate image model):

(14) v_p ∼ N(g a_p o_p τ C_p + μ_R, g² a_p o_p τ C_p + σ_R²).
In the approach proposed by Nayar and Mitsunaga [18], the varying exposures follow a regular pattern. Motivated by the aliasing problems of regular sampling patterns, Schöberl et al. [36] propose to use spatially varying exposures on a non-regular pattern. Figure 7 shows examples of both acquisition patterns. This observation led us to choose the non-regular pattern in the proposed approach.
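Both exposure layouts are easy to generate; the sensor size and the four gain levels below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(5)
levels = np.array([1.0, 4.0, 16.0, 64.0])   # four hypothetical exposure gains
H = W = 8

# Regular SVE pattern (Nayar-Mitsunaga style): a 2x2 tile of the four
# exposures repeated over the whole sensor.
tile = levels.reshape(2, 2)
regular = np.tile(tile, (H // 2, W // 2))

# Non-regular pattern (Schoberl et al. style): each pixel draws one of
# the four exposures uniformly at random.
non_regular = rng.choice(levels, size=(H, W))
```

The regular pattern samples each exposure on a coarse periodic grid, which is what causes the aliasing the non-regular pattern avoids.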
V-B Hyperprior Bayesian Estimator for Single Shot High Dynamic Range Imaging
V-B1 Problem statement
In order to reconstruct the dynamic range of the scene, we need to solve an inverse problem: we want to estimate the irradiance image C from the SVE image v, knowing the exposure levels of the optical mask and the camera parameters. For this purpose, we map the raw pixel values to the irradiance domain with

(15) x_p = (v_p − μ_R) / (g a_p o_p τ).

We take into account the effect of saturation and under-exposure by introducing the exposure degradation matrix U, a diagonal matrix given by

(16) u_p = 1 if 0 < v_p < v_sat, u_p = 0 otherwise,

with v_sat equal to the pixel saturation value. From (14) and (16), x_p can be modeled as

(17) x_p ∼ N(u_p C_p, (u_p g² a_p o_p τ C_p + σ_R²) / (g a_p o_p τ)²).

Notice that (17) is the distribution of x_p for a given exposure degradation factor u_p, since u_p is itself a random variable that depends on v_p. The exposure degradation factor must be included in (17), since the variance of the over- or under-exposed pixels no longer depends on the irradiance but is only due to the readout noise. From (17) we have

(18) x = U C + n′,

where n′ is zero-mean Gaussian noise with diagonal covariance matrix given by

(19) Σ′_pp = (u_p g² a_p o_p τ C_p + σ_R²) / (g a_p o_p τ)².
Then the problem of irradiance estimation can be stated as retrieving C from x, which implies denoising the well-exposed pixel values (u_p = 1) and estimating the unknown ones (u_p = 0).
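The mapping (15)-(19) is easy to express per pixel. The sketch below (with hypothetical camera parameters and saturation value) returns the irradiance-domain values, the exposure degradation factors of Eq. (16) and the diagonal noise variances of Eq. (19):

```python
import numpy as np

def irradiance_domain(v, o, g=0.87, a=1.0, tau=1.0 / 160.0,
                      mu_R=0.0, sigma_R2=4.0, v_sat=15000.0):
    """Eqs. (15), (16) and (19): map raw SVE values to the irradiance
    domain, with exposure degradation factors and noise variances.

    o is the per-pixel SVE gain; all camera parameters and v_sat are
    hypothetical placeholders, not calibrated values.
    """
    gain = g * a * o * tau
    x = (v - mu_R) / gain                            # Eq. (15)
    u = ((v > 0.0) & (v < v_sat)).astype(float)      # Eq. (16), diagonal of U
    var = (u * g * gain * x + sigma_R2) / gain**2    # Eq. (19), diagonal
    return x, var, u

v = np.array([500.0, 0.0, 15000.0, 2000.0])   # one under-exposed, one saturated
o = np.array([1.0, 4.0, 4.0, 2.0])            # per-pixel SVE gains
x, var, u = irradiance_domain(v, o)
```

For badly exposed pixels only the readout term σ_R²/(g a o τ)² remains, which is exactly why those pixels carry no usable irradiance information and must be estimated from their neighbours.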
V-B2 Proposed solution
V-C Experiments
The proposed reconstruction method was thoroughly tested on several synthetic and real-data examples. A brief summary of the results is presented in this section.
V-C1 Synthetic data
Sample images are generated according to Model (18), using the HDR image in Figure 7 as the ground truth. Both a random and a regular pattern with four equiprobable exposure levels are simulated. The camera parameters are those of a Canon 7D camera set to ISO 200 [27].
Figure 7 shows extracts of the results obtained by the proposed method, by PLEV [28] (basically an adaptation of PLE to the same single-image framework) and by Schöberl et al. [36] for the random pattern, and by Nayar and Mitsunaga [18] using the regular pattern. The percentage of unknown pixels in the considered extracts is 50% (it is nearly the same for both the regular and non-regular patterns). Table II shows the PSNR values obtained on each extract marked in Figure 7. The proposed method manages to correctly reconstruct the irradiance at the unknown pixels. Moreover, its denoising performance is much better than that of Schöberl et al. and Nayar and Mitsunaga, and its results are sharper than those of PLEV.
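For reference, PSNR figures such as those in Table II can be computed with a standard routine like the one below; the choice of peak value for HDR irradiance maps is an assumption on our part, not something specified in the text.

```python
import numpy as np

def psnr(ref, est, peak=None):
    """PSNR in dB between a ground-truth image and an estimate.
    For HDR irradiance maps, `peak` can be set to the ground-truth maximum."""
    if peak is None:
        peak = ref.max()
    mse = np.mean((ref - est) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.linspace(0.0, 1.0, 16).reshape(4, 4)   # toy "ground truth"
est = ref + 0.01                                 # toy estimate with uniform error
print(round(psnr(ref, est), 1))                  # prints 40.0
```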
V-C2 Real data
The feasibility of the SVE random pattern has been shown in [34] and that of the SVE regular pattern in [33]. Nevertheless, these acquisition systems are not yet available for general use.^{5}^{5}5 While writing the last version of this article, the authors became aware of Sony’s latest sensor, the IMX378. This sensor has a special mode called SME-HDR, which is a variation of the SVE acquisition principle. Although this sensor was adopted by the Google Pixel smartphone in 2016, the special SME-HDR mode is never activated by the phone, according to experts from the company DxO, and we found no way to activate it and access the raw image. However, as stated in Section V-A, the only variation between the classical and the SVE acquisition is the optical filter. Hence, the noise at a pixel captured using SVE with a given optical gain factor and exposure time should be very close to that of a pixel captured with a classical camera using the corresponding exposure time. We take advantage of this fact to evaluate the reconstruction performance of the proposed approach on real data.
For this purpose, we generate an SVE image from four raw images acquired with different exposure times. The four different exposure times simulate four different filters of the SVE optical mask. The value at each position is chosen at random among the four values available at that position. Notice that the Bayer pattern is preserved by construction. The images are acquired using a remotely controlled camera and a tripod so as to be perfectly aligned. This protocol does not allow us to capture scenes with moving objects. Let us emphasize, however, that with a real SVE device this, as well as camera motion, would be a non-issue.
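The composition protocol described above can be sketched as follows; frame values and sizes are toy placeholders. The per-pixel random choice among aligned frames is what preserves the Bayer pattern, since all four frames share the same color filter array.

```python
import numpy as np

def sve_from_exposures(raws, seed=0):
    """Compose an SVE-like raw image by picking, at each pixel, the value
    from one of four aligned raw frames taken with different exposure times.
    Since all frames share the same Bayer CFA, the mosaic is preserved."""
    raws = np.stack(raws)                      # shape (4, h, w)
    rng = np.random.default_rng(seed)
    idx = rng.integers(raws.shape[0], size=raws.shape[1:])
    sve = np.take_along_axis(raws, idx[None], axis=0)[0]
    return sve, idx                            # idx records each pixel's exposure

# Toy aligned frames, one constant image per simulated exposure time.
frames = [np.full((4, 4), t) for t in (1.0, 2.0, 4.0, 8.0)]
sve, idx = sve_from_exposures(frames)
```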
Figures 8 and 9 show the results obtained from two real scenes, together with the masks of well-exposed (white) and unknown (black) pixels (the SVE raw images are included in Appendix B in the supplementary material). Recall that among the unknown pixels, some are saturated and some are under-exposed. Square patches of size 6 and 8 were used for the examples in Figure 9 and Figure 8, respectively. Demosaicing [37] and tone mapping [38] are applied for display purposes.
We compare the results to those obtained by PLEV [28]. A comparison against the methods by Nayar and Mitsunaga and Schöberl et al. is not presented since they do not specify how to treat raw images with a Bayer pattern. The proposed method manages to correctly reconstruct the unknown pixels even in extreme conditions where more than of the pixels are missing, as for example the last extract in Figure 9.
These examples show the suitability of the proposed approach to reconstruct the irradiance information in both very dark and bright regions simultaneously. See for instance the example in Figure 9, where the dark interior of the building (which can be seen through the windows) and the highly illuminated part of another building are both correctly reconstructed (see the electronic version of the article for better visualization).
VI Conclusions
In this work we have presented a novel image restoration framework. It retains the benefits of local patch characterization (which was key to the success of NLB as a state-of-the-art denoising method), but extends its use to more general restoration problems where the linear degradation operator is diagonal, by combining local estimation with Bayesian restoration based on hyperpriors. This includes problems such as zooming, inpainting and interpolation, which are thus all set under the same framework. It does not include image deblurring or deconvolution, since there the degradation operator is no longer diagonal. Correctly addressing deconvolution with large kernels using patch-based approaches and Gaussian prior models is a major challenge that will be the subject of future work.
We have presented a large series of experiments on both synthetic and real data that confirm the robustness of the proposed strategy based on hyperpriors. These experiments show that for a wide range of image restoration problems HBE outperforms several state-of-the-art restoration methods.
This work opens several perspectives. The first concerns the relevance of the Gaussian patch model and its relation to the underlying manifold of image patches. While this linear approximation has proven successful for image restoration, its full relevance in other areas remains to be explored, especially in domains requiring the comparison of image patches. Another important related question is that of estimating the degradation model for images jointly degraded by noise, missing pixels, blur, etc. Restoration approaches generally rely on precise knowledge of this model and of its parameters. In practice, however, we often deal with images for which the acquisition process is unknown, and which have possibly been affected by post-treatments. In such cases, blind restoration remains an unsolved challenge.
Finally, we have presented a novel application of the proposed general framework to the generation of HDR images from a single spatially varying exposure (SVE) snapshot. The SVE acquisition strategy allows the creation of HDR images from a single shot without the drawbacks of multi-image approaches, such as the need for global alignment and motion estimation to avoid ghosting problems. The proposed method manages to simultaneously denoise and reconstruct the missing pixels, even in the presence of (possibly complex) motion, improving on the results obtained by existing methods. Examples with real data acquired in conditions very similar to those of SVE acquisition show the capabilities of the proposed approach.
Appendix A Alternate minimization scheme convergence
We study in the following the convergence of the alternate minimization scheme in Algorithm 1. To show the main convergence result, we need the following lemma.
Lemma 1.
The function is coercive on .
Proof.
We need to show that
Now, if and only if or or .
The matrix being positive-definite, the terms and are both positive. Thus
Now, this function of is convex and coercive on , which implies that as soon as . It also follows that the previous function of has a global minimum that we denote by . We can now write
and this function of clearly tends towards as soon as or . ∎
We now show the main convergence result for our alternate minimization algorithm. The proof adapts the arguments in [21] to our case.
Proposition 3.
The sequence converges monotonically when . The sequence generated by the alternate minimization scheme has at least one accumulation point. The set of its accumulation points forms a connected and compact set of partial optima and stationary points of , all having the same function value.
Proof.
The sequence obviously decreases at each step by construction. The coercivity and continuity of imply that this sequence is also bounded from below, and thus converges. The convergence of implies that the sequence is bounded. It follows that it has at least one accumulation point and that there exists a strictly increasing sequence of integers such that converges to .
Now, we can show that such an accumulation point is a partial optimum of , i.e. that attains its minimum at and attains its minimum at . By construction,
which implies by continuity of that
(20) 
Let us denote with
The alternate minimization scheme consists in updating at each iteration. From Equations (6), (7), and (8), we see that is explicit and continuous. Since is strictly increasing, for each , . The sequence decreases, so
Therefore, as , since is continuous, it follows that
Now, writing , we get
We can conclude that all these terms are equal and in particular
(21) 
From (20) and (21) we deduce that the accumulation point is a partial optimum of and since is differentiable, it is also a stationary point of . Moreover, since is strictly convex and has a unique minimum, it follows from (21) that . As a consequence, . Therefore, the accumulation point (and actually any accumulation point of the sequence) is also a fixed point of function .
We have shown that accumulation points of the sequence are partial optima of and fixed points of the function . The set of accumulation points is obviously compact. Let us show that it is also connected. First, observe that, whatever the norm , the sequence converges to when . If this were not the case, it would be possible to extract a subsequence converging to an accumulation point while converges to a different accumulation point , but this is impossible since would also tend toward . The sequence being bounded and such that converges to , the set of its accumulation points is connected (see [39]). The fact that all accumulation points have the same function value is obvious since the sequence decreases. ∎
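The monotone decrease established in Proposition 3 can be observed numerically on a toy bi-convex objective; this is only an illustration of the alternate minimization principle, not the paper's actual functional, and the function below is an arbitrary example chosen for its closed-form partial minimizers.

```python
import numpy as np

# Toy illustration of alternate minimization: f(x, y) = (x*y - 1)^2
# + 0.1*(x^2 + y^2) is minimized over one variable at a time. Exact block
# minimization forces the objective to decrease monotonically.
def f(x, y):
    return (x * y - 1.0) ** 2 + 0.1 * (x ** 2 + y ** 2)

def argmin_x(y):
    # closed-form partial minimizer: d/dx [(x*y - 1)^2 + 0.1*x^2] = 0
    return y / (y ** 2 + 0.1)

x, y = 3.0, 0.5
values = [f(x, y)]
for _ in range(20):
    x = argmin_x(y)        # minimize over the first block
    y = argmin_x(x)        # by symmetry, same formula for the second block
    values.append(f(x, y))

print(bool(np.all(np.diff(values) <= 1e-12)))  # prints True
```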
Acknowledgement
We would like to thank the reviewers for their thorough and fruitful contributions, as well as Mila Nikolova and Alasdair Newson for their insightful comments, and the authors of [26, 25, 24] for kindly providing their code. This work has been partially funded by the French Research Agency (ANR) under grant no. ANR-14-CE27-001 (MIRIAM), and by the “Investissement d’avenir” project, reference ANR-11-LABX-0056-LMH.
References
 [1] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D: Nonlinear Phenomena, vol. 60, no. 1, pp. 259–268, 1992.
 [2] J. Portilla, V. Strela, M. J. Wainwright, and E. P. Simoncelli, “Image denoising using scale mixtures of Gaussians in the wavelet domain,” IEEE Transactions on Image Processing, vol. 12, no. 11, pp. 1338–1351, 2003.
 [3] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Transactions on Image Processing, vol. 15, no. 12, pp. 3736–3745, 2006.
 [4] A. Buades, B. Coll, and J. M. Morel, “A Review of Image Denoising Algorithms, with a New One,” SIAM Multiscale Modeling & Simulation, vol. 4, no. 2, pp. 490–530, 2005.
 [5] S. Lyu and E. Simoncelli, “Modeling Multiscale Subbands of Photographic Images with Fields of Gaussian Scale Mixtures,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 4, pp. 693–706, 2009.
 [6] P. Chatterjee and P. Milanfar, “PatchBased NearOptimal Image Denoising,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1635–1649, 2012.
 [7] M. Lebrun, A. Buades, and J. Morel, “A nonlocal Bayesian image denoising algorithm,” SIAM Journal on Imaging Sciences, vol. 6, no. 3, pp. 1665–1688, 2013.
 [8] Y.Q. Wang and J.M. Morel, “SURE Guided Gaussian Mixture Image Denoising,” SIAM Journal on Imaging Sciences, vol. 6, no. 2, pp. 999–1034, 2013.

 [9] D. Zoran and Y. Weiss, “From learning models of natural image patches to whole image restoration,” in Proceedings of IEEE International Conference on Computer Vision (ICCV), 2011, pp. 479–486.
 [10] G. Yu, G. Sapiro, and S. Mallat, “Solving inverse problems with piecewise linear estimators: From Gaussian mixture models to structured sparsity,” IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2481–2499, 2012.
 [11] M. Lebrun, M. Colom, A. Buades, and J. Morel, “Secrets of image denoising cuisine,” Acta Numerica, vol. 21, no. 1, pp. 475–576, 2012.
 [12] Y.Q. Wang, “The Implementation of SURE Guided Piecewise Linear Image Denoising,” Image Processing On Line, vol. 3, pp. 43–67, 2013.
 [13] A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin, Bayesian data analysis. Chapman & Hall/CRC Boca Raton, FL, USA, 2014, vol. 2.
 [14] R. Molina, “On the hierarchical Bayesian approach to image restoration: applications to astronomical images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 11, pp. 1122–1128, 1994.
 [15] R. Molina, A. K. Katsaggelos, and J. Mateos, “Bayesian and regularization methods for hyperparameter estimation in image restoration,” IEEE Transactions on Image Processing, vol. 8, no. 2, pp. 231–246, 1999.
 [16] N. Dobigeon, J.-Y. Tourneret, and C.-I. Chang, “Semi-supervised linear spectral unmixing using a hierarchical Bayesian model for hyperspectral imagery,” IEEE Transactions on Signal Processing, vol. 56, no. 7, pp. 2684–2695, 2008.
 [17] F. Orieux, J. Giovannelli, and T. Rodet, “Bayesian estimation of regularization and point spread function parameters for Wiener–Hunt deconvolution,” J. Opt. Soc. Am. A, vol. 27, no. 7, pp. 1593–1607, Jul 2010.

 [18] S. Nayar and T. Mitsunaga, “High Dynamic Range Imaging: Spatially Varying Pixel Exposures,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, Jun 2000.