Non-Uniform Blind Deblurring with a Spatially-Adaptive Sparse Prior

06/17/2013 · by Haichao Zhang, et al.

Typical blur from camera shake often deviates from the standard uniform convolutional model, in part because of problematic rotations which create greater blurring away from some unknown center point. Consequently, successful blind deconvolution requires the estimation of a spatially-varying or non-uniform blur operator. Using ideas from Bayesian inference and convex analysis, this paper derives a non-uniform blind deblurring algorithm with several desirable, yet previously-unexplored attributes. The underlying objective function includes a spatially adaptive penalty which couples the latent sharp image, non-uniform blur operator, and noise level together. This coupling allows the penalty to automatically adjust its shape based on the estimated degree of local blur and image structure such that regions with large blur or few prominent edges are discounted. Remaining regions with modest blur and revealing edges therefore dominate the overall estimation process without explicitly incorporating structure-selection heuristics. The algorithm can be implemented using a majorization-minimization strategy that is virtually parameter free. Detailed theoretical analysis and empirical comparisons on real images demonstrate the effectiveness of the proposed method.


I Introduction

Image blur is an undesirable degradation that often accompanies the image formation process and may arise, for example, because of camera shake during acquisition. Blind image deblurring strategies aim to recover a sharp image from only a blurry, compromised observation. Extensive efforts have been devoted to the uniform blur (shift-invariant) case, which can be described with the convolutional observation model

y = k ⊗ x + n,    (1)

where ⊗ denotes 2D convolution, x is the unknown sharp image, y is the observed blurry image, k is the unknown blur kernel (or point spread function), and n is a zero-mean Gaussian noise term with covariance λI [1, 2, 3, 4, 5, 6, 7, 8]. Unfortunately, many real-world photographs contain blur effects that vary across the image plane, such as when unknown rotations are introduced by camera shake [3].
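
As a point of reference, the forward model in (1) is straightforward to simulate. The short sketch below generates a synthetic blurry observation from a sharp image and a known kernel; it illustrates only the observation model, not the proposed estimation algorithm, and the image, kernel, and noise level are arbitrary placeholders.

import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
x = rng.random((64, 64))                 # placeholder sharp image
k = np.zeros((9, 9)); k[4, :] = 1.0      # placeholder horizontal motion kernel
k /= k.sum()                             # blur kernels are normalized to sum to one
lam = 1e-3                               # noise variance (covariance lam * I)
y = fftconvolve(x, k, mode="same") + np.sqrt(lam) * rng.standard_normal(x.shape)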

More recently, algorithms have been generalized to explicitly handle some degree of non-uniform blur using the more general observation model

y = Hx + n,    (2)

where now (with some abuse of notation) x and y represent the vectorized sharp and blurry images respectively, and each column of the blur operator H contains the spatially-varying effective blur kernel at the corresponding pixel site [9, 10, 11, 12, 13, 14, 15, 16, 17]. Note that the original uniform blur model is recovered when H is forced to adopt a simple Toeplitz structure. In general, non-uniform blur may arise under several different contexts. This paper will focus on the blind removal of non-uniform blur caused by general camera shake (as opposed to blur from object motion) using only a single image, with no additional hardware assistance.

While existing algorithms for addressing non-uniform camera shake have displayed a measure of success, several important limitations remain. First, some methods require either additional specialized hardware such as high-speed video capture [18] or inertial measurement sensors [19] for estimating motion, or else multiple images of the same scene [14]. Secondly, even the algorithms that operate given only data from a single image typically rely on carefully engineered initializations, heuristics, and trade-off parameters for selecting salient image structure or edges, in part to avoid undesirable degenerate, no-blur solutions [10, 11, 12, 13]. Consequently, enhancements and rigorous analysis may be problematic. To address these shortcomings, we present an alternative blind deblurring algorithm built upon a simple, closed-form cost function that automatically discounts regions of the image that contain little information about the blur operator without introducing any additional salient structure selection steps. This transparency leads to a nearly parameter free algorithm based upon a unique, adaptive sparsity penalty and provides theoretical arguments regarding how to robustly handle non-uniform degradations. An example of estimated non-uniform or spatially-varying blur kernels is shown in Figure 1.

Fig. 1: Non-uniform blur example. Left: A blurry photo from [11]. Right: Estimated spatially-varying blur kernel array obtained using proposed method. The resulting deblurred image is shown later in Figure 6.

The rest of the paper is structured as follows. Section II briefly reviews relevant existing work on blind deblurring. Section III then introduces the proposed non-uniform blind deblurring model, while theoretical justification and analyses are provided in Section IV. Experimental comparisons with state-of-the-art methods are carried out in Section V followed by conclusions in Section VI.

II Related Work

Perhaps the most direct way of handling non-uniform blur is to simply partition the image into different regions and then learn a separate, uniform blur kernel for each region, possibly with an additional weighting function for smoothing the boundaries between two adjacent kernels. The resulting algorithm admits an efficient implementation called efficient filter flow (EFF) [20, 21] and has been adopted extensively [22, 11, 15, 16, 17]. The downside of this type of model is that the geometric relationships between the blur kernels of different regions, derived from the physical motion path of the camera, are ignored.
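
To make the patch-based strategy concrete, the following sketch blurs each of two overlapping image regions with its own kernel and blends the results with smooth window weights. This is only a schematic rendering of the EFF-style model described above; the patch layout, windows, and kernels are illustrative choices rather than the implementation of [20, 21].

import numpy as np
from scipy.signal import fftconvolve

def eff_blur(x, kernels, windows):
    # Blur image x with region-wise kernels blended by window weights.
    # The nonnegative windows are assumed to sum to one at every pixel.
    out = np.zeros_like(x)
    for k, w in zip(kernels, windows):
        out += w * fftconvolve(x, k, mode="same")
    return out

# Minimal usage: two vertical half-planes with a smooth horizontal transition.
h, wd = 64, 64
x = np.random.default_rng(1).random((h, wd))
ramp = np.clip(np.linspace(-1.0, 2.0, wd), 0.0, 1.0)[None, :] * np.ones((h, 1))
k1 = np.zeros((7, 7)); k1[3, :] = 1.0 / 7    # horizontal streak kernel
k2 = np.zeros((7, 7)); k2[:, 3] = 1.0 / 7    # vertical streak kernel
y = eff_blur(x, [k1, k2], [1.0 - ramp, ramp])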

In contrast, to explicitly account for camera motion, the projective motion path (PMP) model [18] treats a blurry image as the weighted summation of projectively transformed sharp images, leading to the revised observation model

y = Σ_{j=1}^{J} w_j P_j x + n,    (3)

where P_j is the j-th projection or homography operator (a combination of rotations and translations), j = 1, …, J, and w_j is the corresponding combination weight representing the proportion of time spent at that particular camera pose during exposure. The uniform convolutional model can be obtained by restricting the general projection operators to be translations. In this regard, (3) represents a more general model that has been used in many recent non-uniform deblurring efforts [18, 9, 10, 13, 14]. PMP also retains the bilinear property of uniform convolution, meaning that

Hx = Σ_{j=1}^{J} w_j P_j x = Dw,    (4)

where H ≜ Σ_j w_j P_j and D ≜ [P_1 x, …, P_J x] is a matrix whose columns are the transformed sharp images.
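
The bilinear structure in (4) is easy to check numerically: stacking the transformed images P_j x as columns of D turns Hx into the matrix–vector product Dw. The sketch below uses small in-plane rotations generated with scipy.ndimage as stand-in projection operators; the angles and weights are arbitrary illustrative choices.

import numpy as np
from scipy.ndimage import rotate

x = np.random.default_rng(2).random((64, 64))     # placeholder sharp image
angles = np.linspace(-2.0, 2.0, 9)                # candidate camera poses (degrees)
w = np.ones(len(angles)) / len(angles)            # pose weights, summing to one

# Columns of D are the vectorized transformed images P_j x.
D = np.stack([rotate(x, a, reshape=False, order=1).ravel() for a in angles], axis=1)
y = (D @ w).reshape(x.shape)                      # Hx = Dw (noise-free blurry image)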

The disadvantage of PMP is that it typically leads to inefficient algorithms, because evaluating the matrix–vector product Hx requires generating many expensive intermediate transformed images. However, EFF can be combined with the PMP model by introducing a set of basis images efficiently generated by transforming a grid of delta peak images [12]. The computational cost can be further reduced by using an active set for pruning out the projection operators with small responses [13]. Furthermore, while the projective transformations (homographies) can generally involve six degrees of freedom (three for rotation and three for translation), recent work has demonstrated the effectiveness of using lower-dimensional restricted forms. For example, a 3D rotational camera motion model (i.e., using only roll, pitch, and yaw, with no translations) is considered in [9]. Likewise, a 3D camera motion model with in-plane translations and rotations has been used successfully in several other deblurring algorithms [10, 12, 13]. These two approximations display similar performance for sufficiently long focal lengths due to the rotation-translation ambiguity in this setting [10].

III A New Non-Uniform Deblurring Model

Following previous work [1, 23], we work in the derivative domain of images for ease of modeling and better performance, meaning that x and y will henceforth denote the lexicographically ordered sharp and blurry image derivatives respectively. (The derivative filters used in this work are standard first-order horizontal and vertical filters; other choices are also possible.) We will now derive a new non-uniform deblurring cost function followed by a majorization-minimization algorithm.

III-A Cost Function Derivation

The observation model (3) is equivalent to the likelihood function

p(y | x, w) ∝ exp[ −(1/(2λ)) ‖y − Hx‖²₂ ].    (5)

Maximum likelihood estimation of x and w using (5) is clearly ill-posed, and so further regularization is required to constrain the solution space. For this purpose, we adopt a sparse prior on x (in the gradient domain) as advocated in [1, 9]. We assume the factorial form p(x) = ∏_i p(x_i) for this prior, where

p(x_i) = max_{γ_i ≥ 0} 𝒩(x_i; 0, γ_i) exp[ −f(γ_i) ],    (6)

which represents a weighted maximization over zero-mean Gaussians with different variances γ_i. Here f is some non-negative energy function, with different selections producing different priors on x. While it has been shown in [24] that any prior expressible in the form of (6) will be super-Gaussian (sparsity promoting), we will rely on the special case where f is constant for our model. This selection has been advocated in other applications of sparse estimation [25], has the advantage of being parameter free, and leads to a particularly compelling algorithm as will be shown below.

The hyperparameter variances γ ≜ [γ_1, …, γ_N]^⊤ provide a convenient way of implementing several different estimation strategies [24]. For example, perhaps the most straightforward is a form of MAP estimation given by

max_{x; γ ≥ 0; w ≥ 0}  p(y | x, w) ∏_i 𝒩(x_i; 0, γ_i) exp[ −f(γ_i) ],    (7)

where simple update rules are available via coordinate ascent over x, γ, and w (a prior can also be included on w or λ if desired). However, recently it has been argued that an alternative estimation procedure may be preferred for canonical sparse linear inverse problems [25]. The basic idea, which naturally extends to the blind deconvolution problem, is to first integrate out x, and then optimize over γ and w, as well as the noise level λ. The final latent sharp image can then be recovered using the estimated kernel and noise level along with standard non-blind deblurring algorithms. Later we will provide rigorous, independent rationalization for why the objective function produced through this process is ultimately superior to standard MAP.
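
The integration step is the standard Gaussian marginalization identity, stated here for completeness under the zero-mean Gaussian representation of the prior with the variances γ held fixed:

∫ 𝒩(y; Hx, λI) ∏_i 𝒩(x_i; 0, γ_i) dx = 𝒩(y; 0, λI + H Γ H^⊤),    Γ ≜ diag[γ],

so that, up to an additive constant, −2 log 𝒩(y; 0, λI + H Γ H^⊤) = y^⊤ (λI + H Γ H^⊤)^{−1} y + log |λI + H Γ H^⊤|, which is exactly the quantity minimized below.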

Mathematically, this alternative estimation scheme requires that we solve

min_{γ ≥ 0; w ≥ 0; λ ≥ 0}  y^⊤ Σ_y^{−1} y + log |Σ_y|,    (8)

where Σ_y ≜ λI + H Γ H^⊤ and Γ ≜ diag[γ]. While optimizing (8) is possible using various general techniques such as the EM algorithm, it is computationally expensive in part because of the high-dimensional determinants involved with realistic-sized images. Therefore, we instead minimize a convenient upper bound allowing us to circumvent this issue. Specifically, using standard determinant identities we have

log |Σ_y| ≤ Σ_i log( λ + γ_i ‖h̄_i‖²₂ ) + (M − N) log λ,    (9)

where M and N denote the number of elements in y and x respectively. Here h̄_i denotes the i-th column of H and represents the local blur kernel vector associated with pixel location i in the image plane. The local kernel can be calculated by

h̄_i = Σ_{j=1}^{J} w_j P_j δ_i = B_i w,    (10)

where δ_i denotes an all-zero image with a 1 at site i, and B_i ≜ [P_1 δ_i, …, P_J δ_i]. Consequently we have ‖h̄_i‖²₂ = ‖B_i w‖²₂ for the norm embedded in (9). The use of this diagonal approximation will not only make the proposed model computationally tractable, but it will also lead to an effective deblurring algorithm, as will be verified by the extensive experimental results in Section V.
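
The local kernels and the corresponding ratios ρ_i = λ/‖h̄_i‖²₂ used below are simple to compute by pushing delta images through the forward blur model, as in the sketch here (again with in-plane rotations as illustrative projection operators; the pixel sites, poses, and noise level are placeholders).

import numpy as np
from scipy.ndimage import rotate

shape = (64, 64)
angles = np.linspace(-2.0, 2.0, 9)         # candidate poses (degrees)
w = np.ones(len(angles)) / len(angles)     # current pose weights
lam = 1e-3                                 # current noise level estimate

def local_kernel(site, shape, angles, w):
    # Column of H at `site`: the blur kernel seen locally at that pixel,
    # obtained by pushing a delta image through the weighted projections.
    delta = np.zeros(shape)
    delta[site] = 1.0
    return sum(wj * rotate(delta, a, reshape=False, order=1)
               for wj, a in zip(w, angles))

for site in [(32, 32), (5, 5), (60, 20)]:
    h_bar = local_kernel(site, shape, angles, w)
    print(site, lam / np.sum(h_bar ** 2))  # rho_i = lam / ||h_i||_2^2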

While optimizing (8) using the bound from (9) can be justified in part using Bayesian-inspired arguments, the resulting cost function in γ, w, and λ is far less intuitive than the standard penalized regression models in x that are typically employed for blind deblurring. However, using the framework from [25], it can be shown that the kernel estimate obtained by this process is formally equivalent to the one obtained via

min_{x; w ≥ 0; λ ≥ 0}  (1/λ) ‖y − Hx‖²₂ + g(x, w, λ),    (11)

where

g(x, w, λ) ≜ Σ_i [ g(x_i, ρ_i) + log ‖h̄_i‖²₂ ]    (12)

and

ρ_i ≜ λ / ‖h̄_i‖²₂.    (13)

The optimization from (11) closely resembles a standard penalized regression (or equivalently MAP) problem used for blind deblurring. The primary distinction is the penalty term g(x, w, λ), which jointly regularizes x, w, and λ; the pixel-wise factor g(x_i, ρ_i) is characterized explicitly in (15) below. We discuss a tractable algorithm for optimization that accounts for this intrinsic coupling in Section III-B, followed by analysis in Section IV.

1:  Input: a blurry image y
2:  Initialize: blur parameter vector w, noise level λ, and latent variables γ
3:  While the stopping criterion is not satisfied, do
  • Image Update: solve the weighted least squares problem for x (see Appendix A)
  • Latent Representation Update: update each γ_i in closed form (see Appendix A)
  • Blur Update: solve the constrained quadratic program for w (see Appendix A)
  • Noise Level Update: update λ in closed form (see Appendix A)
4:  End
Algorithm 1 Non-Uniform Blind Deblurring.

III-B Minimization Algorithm

The proposed practical blind deblurring algorithm simply involves solving (11). This can be accomplished by instead minimizing a convenient upper bound defined as

L(x, w, λ, γ) ≜ (1/λ) ‖y − Hx‖²₂ + Σ_i [ x_i²/γ_i + log(ρ_i + γ_i) + log ‖h̄_i‖²₂ ],    (14)

where γ ≜ [γ_1, …, γ_N]^⊤ ≥ 0 is a vector of latent variables controlling the shape of the bound. The form of (14) is motivated by the fact that the proposed penalty function satisfies

g(x_i, ρ_i) = min_{γ_i ≥ 0} [ x_i²/γ_i + log(ρ_i + γ_i) ].    (15)

This expression can be shown by optimizing over γ_i, plugging in the resulting value, which can be obtained in closed form, and then simplifying. From (15) it then follows that

(1/λ) ‖y − Hx‖²₂ + g(x, w, λ)  ≤  L(x, w, λ, γ)    (16)

for all γ ≥ 0, with equality when each γ_i solves (15). Consequently we can solve (11) iteratively by minimizing (14) in an alternating fashion over x, γ, w, and λ. This majorization-minimization technique [26, 25] has convergence properties similar to those of the EM algorithm. The resulting procedure is summarized in Algorithm 1. The details of each constituent subproblem are derived in Appendix A.

Algorithm 1 is very straightforward. The image and blur are updated by solving simple quadratic minimization problems. The update rules for the latent variables and the noise level are also minimally complex.
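
To make the overall control flow concrete, the alternating structure of Algorithm 1 can be written as a short generic driver. The four update callables are hypothetical placeholders standing in for the subproblem solutions of Appendix A, so this is a structural sketch rather than the actual implementation.

def mm_blind_deblur(y, init, update_image, update_latent, update_blur, update_noise,
                    n_iters=30):
    # Majorization-minimization loop mirroring Algorithm 1.
    #   y        : observed blurry image (gradient domain)
    #   init     : dict with initial 'w' (blur parameters), 'lam' (noise level),
    #              'gamma' (latent variables), and 'x' (latent image)
    #   update_* : callables implementing the Appendix A subproblems (placeholders).
    state = dict(init)
    for _ in range(n_iters):
        state["x"] = update_image(y, state)        # weighted least squares
        state["gamma"] = update_latent(y, state)   # latent representation update
        state["w"] = update_blur(y, state)         # constrained quadratic program
        state["lam"] = update_noise(y, state)      # noise level update
    return state

In practice each pass of this loop is embedded in the multi-scale, coarse-to-fine scheme described below.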

For simplicity in practice, we have only used projection operators involving in-plane translations and rotations, similar to [10], for modeling the camera shake, and we use the EFF model [11] for reducing the computational expense. We have also incorporated a technique similar to the one used in [13], whereby irrelevant projection operators are pruned out while new ones are added by sampling around the remaining projections using a Gaussian distribution with a small variance. Note that this heuristic is only for reducing the computational complexity; using the fully sampled basis set would generate the best results. Also, a standard multi-scale estimation scheme is incorporated, consistent with most recent blind deblurring work [1, 23, 9, 12].

Finally, we emphasize that Algorithm 1 only provides an estimate of x in the gradient domain. Consequently, consistent with other methods, we use the estimated blur parameters in a final non-blind deconvolution step to recover the latent sharp image.

IV Theoretical Analysis

The proposed blind deblurring strategy involves simply minimizing (11); no additional steps for structure or salient edge detection are required, unlike other state-of-the-art approaches. This section will examine theoretical properties of the penalty function g(x, w, λ) embedded in (11) that ultimately allow such a simple algorithm to succeed. We note that, unlike prototypical penalized sparse regression models used for blind deblurring, g is non-separable with respect to the image x and the blur parameters w, meaning that it cannot be decomposed as a sum of a function of x and a function of w alone. Moreover, it also depends on the noise level λ, a novel dependency with important consequences as shown below.

To address these distinctions, we will examine g from two complementary perspectives. First, we will treat it as a function of x parameterized by w and λ, and then subsequently we will treat it as a function of w parameterized by x and λ. This will ultimately serve to demonstrate that the intrinsic coupling is highly advantageous over any separable combination of individual penalties on x and w.

IV-A The Effective Penalty on x

For analysis purposes we first introduce the definition of relative concavity [27].

Definition 1 (Relative Concavity)

Let ν be a strictly increasing function on [a, b]. The function φ is concave relative to ν on the interval [a, b] if and only if φ(z₂) ≤ φ(z₁) + (φ′(z₁)/ν′(z₁)) [ν(z₂) − ν(z₁)] holds for all z₁, z₂ ∈ [a, b].

We will use φ ≺ ν to denote that φ is concave relative to ν on [a, b]. This can be understood as a natural generalization of the traditional notion of concavity, in that a concave function is equivalently concave relative to a linear function per Definition 1. In general, if φ ≺ ν, then when φ and ν are set to have the same functional value and the same slope at any given point (i.e., by an affine transformation of ν), φ lies completely under ν.

Now consider the pixel-wise penalty function ψ(x, ρ) defined as

(17)

It follows then that

(18)

where, as before, ρ_i ≜ λ/‖h̄_i‖²₂. The second summation in (18) is independent of x, so here we will focus on the first term, which suggests that the shape of the penalty over the image depends only on this ratio of the noise level to the squared norm of the local kernel. This leads to some desirable properties relevant to blind deblurring. Figure 2 shows how ψ varies its shape with ρ.

Fig. 2: (a) A 1D example of the coupled penalty ψ (normalized) for different values of ρ, with a standard norm penalty included for comparison. (b) A 2D example surface plot of the coupled penalty function.
Theorem 1 (Spatially-Adaptive Sparsity)

The proposed penalty function ψ(x, ρ) satisfies:

  1. ψ(x, ρ) is a concave, non-decreasing function of |x| for all ρ > 0.

  2. If ρ₁ < ρ₂, then ψ(·, ρ₁) ≺ ψ(·, ρ₂), and the slope of ψ(·, ρ₁) with respect to |x| is everywhere at least as large as that of ψ(·, ρ₂).

The proof has been deferred to Appendix B. The first property of Theorem 1 implies that the derived penalty function favors solutions with some entries of x exactly equal to zero. (When w and λ are fixed, optimization of x using (11) falls into the canonical form min_x ‖y − Hx‖²₂ + λ Σ_i h(|x_i|), where h is a concave, non-decreasing function by virtue of Theorem 1. Such problems will provably have some elements of x equal to zero if H is overcomplete and/or if λ is sufficiently large [28, 25].) In this sense it is similar to more traditional penalty functions based on the ℓ0 pseudo-norm or other related sparsity measures. Consequently, the more unique attributes of the proposed penalty stem from the second property of Theorem 1, which leads to a desirable spatially-adaptive form of sparsity.
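
The shape adaptation formalized by Theorem 1 can also be inspected numerically. A minimal sketch, assuming the variational representation of the pixel-wise penalty given in (15) and an arbitrary grid of evaluation points:

import numpy as np

def penalty(x, rho, gammas=np.logspace(-6, 6, 2000)):
    # Evaluate g(x, rho) = min_{gamma >= 0} x^2/gamma + log(rho + gamma)
    # by brute-force minimization over a dense grid of gamma values.
    x = np.atleast_1d(np.asarray(x, dtype=float))
    vals = x[:, None] ** 2 / gammas[None, :] + np.log(rho + gammas[None, :])
    return vals.min(axis=1)

xs = np.linspace(0.0, 5.0, 200)
for rho in [1e-4, 1e-2, 1.0]:
    g = penalty(xs, rho)
    print(rho, g[1] - g[0], g[-1] - g[-2])  # slope near zero vs. slope in the tail

Smaller values of ρ produce a much steeper rise near zero relative to the tail (strong sparsity promotion), while larger values of ρ flatten this difference, consistent with the spatially adaptive behavior described next.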

To understand the significance of these properties, it helps to review an important practical consideration involved when designing robust deblurring systems. First, blind deconvolution algorithms applied to deblurring are heavily dependent on some form of stagewise coarse-to-fine approach, whereby the blur operator is repeatedly re-estimated at successively higher resolutions. At each stage, a lower-resolution version is used to initialize the estimate at the next higher resolution. One way to implement this approach is to initially use large values of λ such that only dominant, primarily low-frequency image structures dictate the optimization [3]. During subsequent iterations, as the blur operator begins to reflect the correct coarse shape, λ can be gradually reduced to allow the recovery of more detailed, fine structures.

A highly sparse (concave) prior can ultimately be more effective in differentiating sharp images and fine structures than a convex one. Detailed supporting evidence for this claim can be found in [1, 29, 30, 31]. However, if such a prior is applied at the initial stages of estimation, the iterations are likely to become trapped at suboptimal local minima, of which there will always be a combinatorial number. Moreover, in the early stages, the effective noise level is actually high due to errors contained in the estimated blur kernel, and exceedingly sparse image penalties are likely to produce unstable solutions.

Theorem 1 implies that the proposed method may implicitly avoid these problems by initializing with a large λ (and therefore a large ρ_i), such that the penalty function is initially nearly convex at all pixels i. As the iterations proceed and coarse structures are resolved, the effective noise level (or modeling error) reduces, along with the estimated λ. Consequently, later when fine structures need to be resolved, the penalty function becomes less convex as each ρ_i is automatically reduced by the learning process, but the risk of local minima and instability is ameliorated by the fact that we are likely to be already in the neighborhood of a desirable basin of attraction.

The form of image penalty adaptation just described occurs globally across all pixels. However, a more interesting and nuanced shape adaptation occurs regionally based on differences in the local blur estimate h̄_i, which also affects the pixel-wise parameter ρ_i. Recall that h̄_i can be viewed as the local blur kernel around pixel i, meaning that in this local region the blurry image can be roughly modeled as y ≈ h̄_i ⊗ x, where ⊗ denotes the standard 2D convolution. Given the simplex constraint on w commonly assumed for blind deblurring, it can be shown that 1/ℓ ≤ ‖h̄_i‖²₂ ≤ 1, where ℓ is the maximum number of pixels in any local kernel. The upper bound is achieved when the local kernel is a delta solution, meaning only one nonzero element and therefore minimal blur. This scenario produces the highest relative concavity (i.e., sparsity) by virtue of Theorem 1, since ρ_i will be minimized. Such a high degree of sparsity is warranted here because there is little risk of local minima in regions with such a simple kernel, and the added concavity can help to differentiate the small-scale structures necessary for obtaining a globally reasonable solution. Note also that if we estimate the correct w based on a few such local regions, then the overall deblurred image will be sharp to the extent that our forward model is correct.
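
For completeness, the norm bound quoted above follows from two elementary inequalities. Assuming a nonnegative local kernel whose entries sum to one and which has at most ℓ nonzero entries,

‖h̄_i‖²₂ = Σ_u h̄²_{i,u} ≤ ( Σ_u h̄_{i,u} )² = 1,    and    1 = ( Σ_u h̄_{i,u} )² ≤ ℓ · Σ_u h̄²_{i,u} = ℓ ‖h̄_i‖²₂,

the latter by the Cauchy–Schwarz inequality, giving 1/ℓ ≤ ‖h̄_i‖²₂ ≤ 1 as claimed.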

In contrast, the lower bound on ‖h̄_i‖²₂ occurs when every element of h̄_i has an equal value. Now ρ_i is maximized and the relative concavity is minimal, meaning that ψ is the closest to being convex. Again, this represents a desirable tuning mechanism. A uniformly distributed h̄_i indicates maximal blur, and therefore a higher risk of local minima. Moreover, in such regions, only dominant edges/structures will remain, and so a nearly convex penalty is sufficient for disambiguation of the residual coarse details. Furthermore, because of property two, not only is ψ nearly convex, but its slope is also minimal, meaning that its influence on the overall cost function is also minimized. Consequently, regions with smaller local blur kernels and significant edges will automatically dominate the image penalty, while flat regions or areas with large blur will be discounted. Importantly, this spatially-adaptive sparsity occurs without the need for additional structure selection measures, meaning carefully engineered heuristics designed to locate prominent edges such that good global solutions can be found with minimally non-convex image penalties [2, 4, 5, 11, 12].

Figure 3 presents example deblurring results on the real-world Elephant image from [11] both with and without the described spatially-adaptive sparsity mechanism. We also display the corresponding image of ρ_i values in Figure 4, which determines which regions of the estimated sharp image will have the greatest impact on the cost function in the spatially-adaptive case. For purely uniform blur, this ρ-map would be constant, neglecting small boundary effects, while for rotational blur, it would be smallest at the rotation center and larger on the periphery. The learned ρ-map from a real image is refined across the coarse-to-fine hierarchy and reflects a combination of rotations and translations, modulating the relative concavity according to the inverse of the estimated local kernel spread. Importantly, if we remove this spatial adaptation and instead substitute a fixed, spatially uniform penalty for all pixels i, the performance degrades as shown in Figure 3.

Fig. 3: Effectiveness of spatially adaptive sparsity. The deblurred images and estimated kernel maps obtained using the proposed spatially adaptive sparsity (left) and standard spatially non-adaptive sparsity (right) for the Elephant image shown in Figure 6.
Fig. 4: Map of ρ_i as estimated by the proposed algorithm at different resolutions using the blurry Elephant image shown in Figure 6. This map reflects the degree of local blurring, which ultimately controls the shape of the image penalty.

IV-B The Effective Penalty on w

We may also consider g(x, w, λ) as a function of w, with shape modulated by x and λ as well as the bases B_i, leading to an interesting, complementary perspective. With this intent in mind, we define

(19)

where ‖w‖²_{B_i} ≜ w^⊤ B_i^⊤ B_i w denotes the weighted quadratic norm (and so it follows that ‖h̄_i‖²₂ = ‖w‖²_{B_i}). By definition we then have

(20)

Note that because many x_i may equal zero (in regions with zero gradient), the shape parameters here must be defined differently from the previous section. Moreover, while certain symmetries exist between the effective penalties on x and w, the analysis and interpretations turn out to be significantly divergent.

Interestingly, this penalty is completely blind to image regions determined to be relatively flat. More specifically, if a gradient x_i is zero, then its corresponding term is zero and contributes no penalty on w. Consequently, as estimation proceeds and coarse image gradient estimates become available, the blur operator penalty is increasingly dominated by edges and structured image areas. However, it is important to examine how the shape of this penalty changes depending on where and how these edges are distributed relative to the local blurring dictated by each B_i.

Simply put, if the majority of large gradients occur near the center of some rotations, then the penalty will provably become nearly flat. This occurs because the corresponding B_i for such regions will be nearly a zero matrix with a single row of ones (and if location i is directly at the rotation center, it will be exactly so). Given the simplex constraint on w, any feasible w will then necessarily produce almost the same value, and hence the penalty will be relatively flat. This is consistent with the intuition that minimal kernel regularization is required when there is limited blurring of the primary edges.

In contrast, when the majority of significant gradients occur in areas with large local blurring (arising from a combination of translations and distant rotations), the kernel penalty will impose strong quadratic regularization on w. This can be explained by noting that B_i^⊤ B_i will be approximately an identity matrix for translations (ignoring boundary effects) and for distant rotations (which behave like translations far from the rotation center). This is also a desirable consequence, since a relatively diffuse blurring operator will be needed to resolve such edges.

Thus ultimately, the penalty on w transitions between a form of quadratic regularizer, which favors many nonzero elements of w suitable for characterizing larger blur, and no penalty at all (within the specified constraint set). Moreover, this adaptive regularization is processed using a non-linearity in x such that the data-fit and kernel penalties are properly balanced. By this we mean that if the image gradients are scaled by some factor α (i.e., x → αx), then the corresponding solution of

(21)

is simply rescaled accordingly. Because (21) represents the cost function from (11) with x fixed, this form of invariance helps to explain why the proposed algorithm is largely devoid of the trade-off parameters that are typically used to calibrate the kernel penalty. Note that both the nonlinearity with respect to the norms ‖w‖_{B_i} and the λ-dependency in each ρ_i contribute to this invariance, while simultaneously maintaining an integrated cost function over both x and w (which is easily shown to be globally scale invariant as well).

V Experiments

This section compares the proposed method with several state-of-the-art algorithms for both uniform and non-uniform blind deblurring using real-world images.

V-A Uniform Deblurring

Any non-uniform deblurring approach should naturally reduce to an effective uniform algorithm when the blur transformations are simply in-plane translations. We first evaluate our algorithm in the uniform case, where existing benchmarks facilitate quantitative comparisons with state-of-the-art methods. For this purpose we reproduce the experiments from [23] using the benchmark test data from [3] (available at http://www.wisdom.weizmann.ac.il/~levina/papers/LevinEtalCVPR09Data.rar), which consists of 4 base images of size 255 × 255 and 8 different blurring effects, leading to a total of 32 blurry images. Ground truth blur kernels were estimated by recording the trace of focal reference points on the boundaries of the sharp images. The kernel sizes range from 13 × 13 to 27 × 27. We compare the proposed method, restricted to in-plane translations, with the algorithms of Shan et al. [2], Xu et al. [5], Cho et al. [4], Fergus et al. [1] and Levin et al. [23].

The SSD (sum of squared differences) metric defined in [3] is used for measuring the error between the estimated and ground-truth images. To normalize for the fact that a harder kernel gives a larger image reconstruction error even when the true kernel is known (because the corresponding non-blind deconvolution problem is also harder), the SSD ratio between the image deconvolved with the estimated kernel and the image deconvolved with the ground-truth kernel is used as the final evaluation measure. The cumulative histogram of the error ratios is shown in Figure 5; the height of each bar indicates the percentage of images having an error ratio below the corresponding level, so higher bars indicate better performance. The results reveal that the proposed method significantly outperforms the existing methods on this uniform deblurring benchmark.
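
The evaluation protocol just described amounts to the following computation; this is a schematic sketch in which the non-blind deconvolution routine and the error-ratio thresholds are placeholders rather than the exact settings of [3].

import numpy as np

def ssd(a, b):
    # Sum of squared differences between two images.
    return float(np.sum((a - b) ** 2))

def error_ratio(x_true, x_from_estimated_kernel, x_from_true_kernel):
    # SSD ratio: reconstruction error with the estimated kernel relative to
    # the error obtained when deconvolving with the ground-truth kernel.
    return ssd(x_from_estimated_kernel, x_true) / ssd(x_from_true_kernel, x_true)

def cumulative_histogram(ratios, thresholds=(1.5, 2.0, 2.5, 3.0, 3.5, 4.0)):
    # Fraction of test images whose error ratio falls below each threshold.
    r = np.asarray(ratios, dtype=float)
    return {t: float(np.mean(r <= t)) for t in thresholds}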

Fig. 5: Evaluation of uniform deblurring results using cumulative histogram of the deconvolution error ratios across 32 test examples from [3]. The height of the bar indicates the percentage of images having error ratio below that level. Higher bars indicate better performance.
Fig. 6: Non-uniform deblurring comparisons with Harmeling et al. [11] and Hirsch et al. [12] using the three real-world test images Butchershop, Vintage-car, and Elephant provided in [11]. Additionally, ground-truth local blur kernels associated with each of these deblurring results are shown in Figure 7 below.
Fig. 7: Non-uniform kernel estimation comparisons associated with the Butchershop, Vintage-car, and Elephant images from Figure 6 above. Note that kernels have been resized for display purposes.

V-B Non-uniform Deblurring

For non-uniform deblurring, quantitative comparisons are much more difficult because of limited benchmark data with available ground truth. Moreover, because source code for most state-of-the-art non-uniform algorithms is not available, it is not feasible to even qualitatively compare all methods across a wide range of images. Consequently, the only feasible alternative is simply to visually compare our algorithm, using images contained in previously published papers, against the deblurring results presented in those papers. In this context, a successful non-uniform blind deblurring algorithm is one that consistently performs comparably to or better than all existing algorithms on the respective images where those algorithms have been previously tested. The results in this section strongly suggest that the proposed approach is such an algorithm, even without any effort to optimize the non-blind deconvolution step (which is required after kernel estimation, as mentioned previously).

Comparisons with Harmeling et al. [11] and Hirsch et al. [12]: Figure 6 displays deblurring comparisons based on the Butchershop, Vintage-car, and Elephant images provided in [11]. Overall, the proposed algorithm typically reveals more fine details than the other methods, despite its simplicity and lack of salient structure selection heuristics or trade-off parameters. (Results throughout this section are best viewed electronically with zooming.) Note that for these three images, ground-truth blur kernels were independently estimated using a special capturing process; see [11] for more details. As shown in Figure 7, where arrays of local kernels are displayed for visualization, the estimated blur kernel patterns obtained from our algorithm are generally better matched to the ground truth than those of the other methods, a result that compensates for any differences in the non-blind step.

Comparisons with Whyte et al. [9] and Hirsch et al. [12]: We further evaluate our algorithm using the Pantheon and Statue images from [9]. Results are shown in Figure 8. On the Pantheon example, the deblurred image from Whyte et al. exhibits significant ringing artifacts, while the result from Hirsch et al. appears to suffer from some chromatic distortions, as indicated by the dome area of the Pantheon; our result, in contrast, is considerably cleaner, with very few artifacts or chromatic distortions. On the Statue image, the result of Whyte et al. is generated using the blurry image paired with an additional noisy image of the same scene captured with a shorter exposure time. Our method and that of Hirsch et al., without the benefit of such additional image data, nonetheless generate deblurring results of comparable quality.

Comparisons with Gupta et al. [10] and Hirsch et al. [12]: We next experiment using the test images Magazines and Building from [10], which contain large, challenging rotational blur effects. Figure 9 reveals that our result contains fewer artifacts and more fine details relative to Gupta et al., and is comparable to Hirsch et al. on the Magazines image. Note that Hirsch et al. do not provide a deblurring result for the Building image.

Fig. 8: Non-uniform deblurring comparisons with Whyte et al. [9] and Hirsch et al. [12] on the real-world images Pantheon and Statue from [9].
Fig. 9: Non-uniform deblurring comparisons with Gupta et al.  [10] and Hirsch et al.  [12] on the real-world images Magazines and Building from [10]. Note that Hirsch et al. do not provide a deblurring result for the Building image.
Fig. 10: Non-uniform deblurring comparisons with Joshi et al. [19] and Harmeling et al. [11] using the real-world images Porsche and Sculpture provided in [19]. Note that Harmeling et al. do not provide a deblurring result for the Sculpture image.

Comparisons with Joshi et al. [19] and Harmeling et al. [11]: Joshi et al. present a deblurring algorithm that relies upon additional hardware for estimating camera motion [19]. However, even without this hardware assistance, our algorithm still produces sharper estimates of the Porsche and Sculpture images from [19], with fewer ringing artifacts and higher-resolution details. See Figure 10 for the results, where Harmeling et al. have also produced a result for the Porsche image.

Comparison with Cho et al. [14]: Finally, we evaluate deblurring results using the Antefix and Doll images from [14]. The method of Cho et al. requires two blurry images of the same scene as input while we ran our algorithm using only the first blurry image in each test pair. Despite this significant disadvantage, our method still produces higher quality estimates in both cases. The results are shown in Figure 11.

Fig. 11: Non-uniform deblurring comparisons with Cho et al. [14] on the real-world images Antefix and Doll from [14]. Note that the method of Cho et al. requires two blurry images as input while we ran our algorithm using only the first blurry image in each test pair.

VI Conclusion

This paper presents a strikingly simple yet effective method for non-uniform camera shake removal based upon a principled, transparent cost function that is open to analysis and further extensions/refinements. Moreover, both theoretical and extensive empirical evidence are provided demonstrating the efficacy of the adaptive approach to sparse regularization which emerges from our model. Extending the current framework to handle multiple images and video represents a worthwhile topic for future research.

Appendix A Derivation of the Algorithm

Blind deblurring is achieved by minimizing the cost function from (11). This can be accomplished by minimizing a rigorous upper bound defined as

(22)

which is obtained by using the fact that

(23)

This expression can be shown by optimizing over γ_i, plugging in the resulting value, which can be obtained in closed form, and then simplifying. The bound (22) can be iteratively minimized by optimizing over x, γ, w, and λ, with convergence properties similar to those of the EM algorithm. The resulting procedure is summarized in Algorithm 1. We now detail each constituent subproblem.

x-subproblem:

With the other variables fixed, the latent image x is estimated via weighted least squares, giving

(24)

where the diagonal weighting is determined by the current values of γ and λ. This can be computed efficiently using EFF and fast Fourier transforms [11].
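
Assuming the image subproblem takes the weighted least squares form suggested by the bound in (14), it reduces to a large ridge-regression system that can be solved matrix-free. The sketch below uses conjugate gradients with a generic linear operator standing in for the blur; it is an illustration under that assumption, not the EFF/FFT implementation referenced above.

import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def update_image(y, H, gamma, lam):
    # Solve min_x (1/lam) ||y - H x||^2 + sum_i x_i^2 / gamma_i via the
    # normal equations (H^T H + lam * diag(1/gamma)) x = H^T y.
    # H is assumed to be a scipy LinearOperator providing matvec and rmatvec.
    n = gamma.size
    d = lam / np.maximum(gamma, 1e-12)               # diagonal regularizer
    A = LinearOperator((n, n), dtype=float,
                       matvec=lambda v: H.rmatvec(H.matvec(v)) + d * v)
    x, _ = cg(A, H.rmatvec(y), maxiter=200)
    return x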

γ-subproblem:

The optimization is separable across the latent variables, so each γ_i can be solved for independently via

(25)

We can rewrite (25) equivalently as

(26)

where the irrelevant term has been omitted. As no closed form solution is available for (26), we instead use principles from convex analysis to form the strict upper bound

(27)

where the concave conjugate of the corresponding concave function appears. It can be shown that equality in (27) is achieved when

(28)

Substituting (27) into (26), we obtain the revised subproblem

(29)

which admits the closed-form optimal solution

(30)

w-subproblem:

Isolating the w-dependent terms produces the quadratic minimization problem

(31)

Because again there is no closed-form solution, we resort to similar bounding techniques as used above, incorporating the bound

(32)

where the concave conjugate of the corresponding concave function again appears. Similar to before, equality is achieved with

(33)

Plugging (32) into (31), we obtain the minimization problem

(34)

which can be solved efficiently using standard convex programming techniques.
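
For reference, a quadratic objective over the simplex of pose weights can be handled with an off-the-shelf solver. In the sketch below the quadratic term Q (assumed symmetric) and linear term b are placeholders for whatever the bound in (32) produces, which is not reproduced here.

import numpy as np
from scipy.optimize import minimize

def update_blur(Q, b, w0):
    # Minimize 0.5 * w^T Q w + b^T w subject to w >= 0 and sum(w) = 1.
    n = w0.size
    res = minimize(lambda w: 0.5 * w @ Q @ w + b @ w, w0,
                   jac=lambda w: Q @ w + b,
                   bounds=[(0.0, None)] * n,
                   constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
                   method="SLSQP")
    return res.x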

λ-subproblem:

Finally, the update rule for the noise level λ can be obtained through a similar analysis. Omitting the terms irrelevant to λ, we must solve

(35)

where M is the dimensionality of y and we have added a small constant ε to the quadratic data-fit term to prevent it from ever going to exactly zero. As before there is no closed-form solution, so we invoke the bound

(36)

where the relevant concave conjugate is again employed. Equality is achieved with

(37)

Plugging (36) into (35), we obtain the problem

(38)

leading to the closed-form noise level update

(39)

Note that the resulting λ has a lower bound determined by the constant ε. Thus we may set ε so as to reflect some expectation regarding the minimum possible amount of noise or modeling error. In practice we simply choose a single small value of ε for all experiments.

Appendix B Proof of Theorem 1

For the first property, it is useful to re-express ψ(x, ρ) using the equivalent variational form

ψ(x, ρ) = min_{γ ≥ 0} [ x²/γ + log(ρ + γ) ],    (40)

which can be verified straightforwardly by calculating the minimizing γ and plugging it back into (40). As the log term is a concave, non-decreasing function of γ, we can always express it as

(41)

where the concave conjugate [32] of this log term appears. Therefore, it follows that

(42)

Optimizing over γ for fixed x and ρ, the optimal solution is

(43)

Plugging this result into (42) gives

(44)

This implies that ψ can be expressed as a minimum over upper-bounding affine functions, with different values of γ implying different slopes. Any function expressible in this form is necessarily concave, and also non-decreasing since the slopes involved are non-negative [32].

For the second property, we first define

(45)

Using results from convex analysis and conjugate duality, it can be shown that the minimizing variable in (45) represents the gradient of ψ with respect to x². Assuming ρ₁ < ρ₂, the minimizing values associated with ρ₁ and ρ₂ will then always be ordered accordingly, implying a corresponding ordering of the slopes of ψ(·, ρ₁) and ψ(·, ρ₂). This occurs because

The last term, which is monotonically increasing toward zero, implies that there is always an extra monotonically increasing penalty on the latent variable when the larger value ρ₂ is used. Since we are dealing with continuous functions here, the minimizing value will therefore necessarily be smaller, and hence the corresponding slope smaller, at any given point. From (45) and the gradient relationship above, we can readily compute the expression for the slope as

(46)

Given that ρ₁ < ρ₂ by assumption, we therefore have that the slope associated with ρ₁ is at least as large as that associated with ρ₂ at every point.

Furthermore, we want to show that ψ(·, ρ₁) ≺ ψ(·, ρ₂) given ρ₁ < ρ₂. For this purpose it is sufficient to show that the ratio of the slope of ψ(·, ρ₂) to the slope of ψ(·, ρ₁) is an increasing function of x, which represents an equivalent condition for relative concavity to the one given by Definition 1 [27].

From (45) and (46), we can compute the explicit expression for the slope of ψ(·, ρ₁) as

(47)

Using (47) it is also straightforward to derive the analogous expression for ρ₂ as

(48)

We must then show that

(49)

is an increasing function of x. By neglecting irrelevant additive and multiplicative factors (and recalling that ρ > 0 by definition), this is equivalent to showing that

(50)

is a decreasing function of x. It is easy to check that

(51)

Therefore, (50) is a decreasing function of x, implying that (49) is an increasing function of x, completing the proof.

References

  • [1] Rob Fergus, Barun Singh, Aaron Hertzmann, Sam T. Roweis, and William T. Freeman, “Removing camera shake from a single photograph,” in SIGGRAPH, 2006.
  • [2] Qi Shan, Jiaya Jia, and Aseem Agarwala, “High-quality motion deblurring from a single image,” in SIGGRAPH, 2008.
  • [3] Anat Levin, Yair Weiss, Frédo Durand, and William T. Freeman, “Understanding blind deconvolution algorithms,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 12, pp. 2354–2367, 2011.
  • [4] Sunghyun Cho and Seungyong Lee, “Fast motion deblurring,” in SIGGRAPH ASIA, 2009.
  • [5] Li Xu and Jiaya Jia, “Two-phase kernel estimation for robust motion deblurring,” in ECCV, 2010.
  • [6] Dilip Krishnan, Terence Tay, and Rob Fergus, “Blind deconvolution using a normalized sparsity measure,” in CVPR, 2011.
  • [7] Yilun Wang, Junfeng Yang, Wotao Yin, and Yin Zhang, “A new alternating minimization algorithm for total variation image reconstruction,” SIAM J. Imaging Sciences, vol. 1, no. 3, pp. 248–272, 2008.
  • [8] Amir Beck and Marc Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
  • [9] Oliver Whyte, Josef Sivic, Andrew Zisserman, and Jean Ponce, “Non-uniform deblurring for shaken images,” in CVPR, 2010.
  • [10] Ankit Gupta, Neel Joshi, C. Lawrence Zitnick, Michael Cohen, and Brian Curless, “Single image deblurring using motion density functions,” in ECCV, 2010.
  • [11] Stefan Harmeling, Michael Hirsch, and Bernhard Schölkopf, “Space-variant single-image blind deconvolution for removing camera shake,” in NIPS, 2010.
  • [12] Michael Hirsch, Christian J. Schuler, Stefan Harmeling, and Bernhard Schölkopf, “Fast removal of non-uniform camera shake,” in ICCV, 2011.
  • [13] Zhe Hu and Ming-Hsuan Yang, “Fast non-uniform deblurring using constrained camera pose subspace,” in BMVC, 2012.
  • [14] Sunghyun Cho, Hojin Cho, Yu-Wing Tai, and Seungyong Lee, “Registration based non-uniform motion deblurring,” Comput. Graph. Forum, vol. 31, no. 7-2, pp. 2183–2192, 2012.
  • [15] Li Xu and Jiaya Jia, “Depth-aware motion deblurring,” in ICCP, 2012.
  • [16] Michal Sorel and Filip Sroubek, Image Restoration: Fundamentals and Advances, CRC Press, 2012.
  • [17] Hui Ji and Kang Wang, “A two-stage approach to blind spatially-varying motion deblurring,” in CVPR, 2012.
  • [18] Yu-Wing Tai, Ping Tan, and Michael S. Brown, “Richardson-Lucy deblurring for scenes under a projective motion path,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 8, pp. 1603–1618, 2011.
  • [19] Neel Joshi, Sing Bing Kang, C. Lawrence Zitnick, and Richard Szeliski, “Image deblurring using inertial measurement sensors,” in ACM SIGGRAPH, 2010.
  • [20] Steven M. Seitz and Simon Baker, “Filter flow,” in ICCV, 2009.
  • [21] Michael Hirsch, Suvrit Sra, Bernhard Schölkopf, and Stefan Harmeling, “Efficient filter flow for space-variant multiframe blind deconvolution,” in CVPR, 2010.
  • [22] James G. Nagy and Dianne P. O’Leary, “Restoring images degraded by spatially variant blur,” SIAM J. Sci. Comput., vol. 19, no. 4, pp. 1063–1082, 1998.
  • [23] Anat Levin, Yair Weiss, Frédo Durand, and William T. Freeman, “Efficient marginal likelihood optimization in blind deconvolution,” in CVPR, 2011.
  • [24] J. A. Palmer, D. P. Wipf, K. Kreutz-Delgado, and B. D. Rao, “Variational EM algorithms for non-Gaussian latent variable models,” in NIPS, 2006.
  • [25] D. P. Wipf, B. D. Rao, and S. S. Nagarajan, “Latent variable Bayesian models for promoting sparsity,” IEEE Trans. Information Theory, vol. 57, no. 9, pp. 6236–6255, 2011.
  • [26] Alan L. Yuille and Anand Rangarajan, “The concave-convex procedure (CCCP),” in NIPS, 2001, pp. 1033–1040.
  • [27] J. A. Palmer, “Relative convexity,” Technical report, UCSD, 2003.
  • [28] Bhaskar D. Rao, Kjersti Engan, Shane F. Cotter, Jason A. Palmer, and Kenneth Kreutz-Delgado, “Subset selection in noise based on diversity measure minimization,” IEEE Trans. Signal Processing, vol. 51, no. 3, pp. 760–770, 2003.
  • [29] A. Levin, R. Fergus, F. Durand, and W. T. Freeman, “Deconvolution using natural image priors,” Tech. Rep., MIT, 2007.
  • [30] Dilip Krishnan and Rob Fergus, “Fast image deconvolution using hyper-Laplacian priors,” in NIPS, 2009.
  • [31] Taeg Sang Cho, C. Lawrence Zitnick, Neel Joshi, Sing Bing Kang, Rick Szeliski, and William T. Freeman, “Image restoration by matching gradient distributions,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 4, pp. 683–694, 2012.
  • [32] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004.