Understanding and Eliminating the Large-kernel Effect in Blind Deconvolution

06/06/2017 · Li Si-Yao et al. · Beijing Normal University

Blind deconvolution consists of recovering a clear version of an observed blurry image without specific knowledge of the degradation kernel. The kernel size, however, is a required hyper-parameter that defines the range of the support domain. In this study, we show experimentally and theoretically how large kernel sizes introduce noise into regions of the kernel that should be zero and thereby yield inferior results. We explain this effect by demonstrating that sizeable kernels lower the least-squares cost in optimization. We also prove that this effect persists with probability one for noisy images. Using 1D simulations, we quantify how the error of the estimated kernel grows with its size. To eliminate this effect, we propose a low-rank based penalty that reflects the structural information of the kernel. Compared to the generic ℓ_α penalty, our penalty responds to even a small amount of random noise in the kernel. Our regularization reduces the noise and efficiently enhances the success rate for large kernel sizes. We also compare our method to state-of-the-art approaches and test it on real-world images.


1 Introduction

Blind deconvolution is a fundamental problem in low-level vision and continues to draw research attention [22, 20, 15, 14, 21]. Given a blurry image $y$, blind deconvolution aims to recover a clear version $x$, for which it is crucial to first estimate the blur kernel $k$ successfully. Formally, the degradation of image blur is modeled as

$$y = k \otimes x + n, \qquad (1)$$

where $y$ and $x$ are of size $H \times W$, $k$ is of size $h \times w$, $\otimes$ is the 2D convolution operator, and $n$ is usually assumed to be random Gaussian noise. Blind deconvolution needs to jointly estimate the blur kernel $k$ and recover the clear image $x$.
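For illustration only, here is a minimal Python sketch of the degradation model (1); the Gaussian kernel, noise level and random image below are placeholders rather than the data used in the paper:

```python
import numpy as np
from scipy.signal import convolve2d

def blur(x, k, noise_std=0.01, seed=0):
    """Synthesize a blurry observation y = k (*) x + n, as in Eq. (1)."""
    rng = np.random.default_rng(seed)
    y = convolve2d(x, k, mode="same", boundary="symm")   # 2D convolution k (*) x
    return y + noise_std * rng.standard_normal(x.shape)  # additive Gaussian noise n

# Toy example: a 9x9 Gaussian kernel applied to a random "image".
h = 9
ax = np.arange(h) - h // 2
k = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * 2.0 ** 2))
k /= k.sum()                                   # kernels are non-negative and sum to one
x = np.random.default_rng(1).random((64, 64))
y = blur(x, k)
```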

The most successful blind deconvolution methods are based on the maximum-a-posteriori (MAP) framework. MAP jointly estimates $x$ and $k$ by maximizing the posterior $p(x, k \mid y)$, which can be reformulated as an optimization over regularized least squares [2],

$$\min_{x,k} \|k \otimes x - y\|_2^2 + \lambda \rho_x(x) + \gamma \rho_k(k), \qquad (2)$$

where $\rho_x$ and $\rho_k$ are prior functions designed to prefer a sharp image and an ideal kernel, respectively. The optimization problem in Eqn. (2) is not trivial to solve; instead, it is usually addressed by alternating steps,

$$x \leftarrow \arg\min_x \|k \otimes x - y\|_2^2 + \lambda \rho_x(x) \qquad (3)$$

and

$$k \leftarrow \arg\min_k \|k \otimes x - y\|_2^2 + \gamma \rho_k(k). \qquad (4)$$
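The following 1D Python sketch shows only the structure of the alternation (3)-(4); it assumes a quadratic gradient prior for $\rho_x$, no kernel prior $\rho_k$, and plain least-squares solvers, which differ from the choices used later in the paper:

```python
import numpy as np

def conv_mat(v, n_cols):
    """Matrix C with C @ u = np.convolve(v, u, mode='same') for u of length n_cols."""
    return np.column_stack([np.convolve(v, e, mode="same") for e in np.eye(n_cols)])

def alternate_blind_1d(y, ksize, iters=30, lam=1e-2):
    """Sketch of the alternating x-step (3) and k-step (4) in 1D."""
    m = len(y)
    D = np.diff(np.eye(m), axis=0)                 # finite differences for the gradient prior
    x = y.copy()                                   # initialize the latent signal with y
    k = np.zeros(ksize); k[ksize // 2] = 1.0       # start from a delta kernel
    for _ in range(iters):
        # x-step (3): min_x ||conv(x, k) - y||^2 + lam * ||D x||^2
        A = conv_mat(k, m)
        x = np.linalg.lstsq(np.vstack([A, np.sqrt(lam) * D]),
                            np.concatenate([y, np.zeros(m - 1)]), rcond=None)[0]
        # k-step (4), here without a kernel prior: min_k ||conv(x, k) - y||^2
        k = np.linalg.lstsq(conv_mat(x, ksize), y, rcond=None)[0]
        k = np.clip(k, 0, None); k /= max(k.sum(), 1e-12)   # non-negativity, sum-to-one
    return x, k
```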

In most blind deconvolution methods, the kernel size is a hyper-parameter that must be set manually. The ideal choice is the ground-truth size, which exactly constrains the support domain, but this size is unavailable in practical applications and therefore requires hand-crafted tuning.

On one hand, a kernel size smaller than the ground truth cannot provide a sufficient support domain for the estimated blur kernel. Therefore, the kernel size in existing methods is usually pre-defined as a large value to guarantee an adequate support domain.

[Figure 1 panels: ground-truth size = 23; estimated kernels at size = 23 (err = 1.9), size = 47 (err = 5.9), size = 69 (err = 69.9).]

Figure 1: Large kernels produce inferior results. (a) Numerical errors versus kernel size. (b) Blurry image and ground-truth kernel. (c-e) Deblurred results: the first row of (c-e) shows the restored images and the corresponding estimated kernels; the second row shows the support domains of the estimated kernels, where adjacent positive pixels are colored identically and zeros are white. In this experiment, we omitted the kernel regularization; hence, the k-step reduces to a bare least-squares optimization. We also avoided the multi-scaling scheme and thresholding in this experiment. Parameters that performed well at the ground-truth size were kept identical for the larger sizes.

On the other hand, as shown in Figure 1, oversized kernels are very likely to introduce estimation errors and hence lead to unreasonable results. We name this phenomenon the larger-kernel effect. This interesting fact was first mentioned by Fergus et al. [9]. Cho and Lee [4] later showed a similar result, namely that the residual cost of (2) increases with over-estimated kernel size. However, this annoying phenomenon has not yet been well analyzed or studied. Note that most MAP-based blind deconvolution algorithms adopt a trial-and-error strategy to tune the kernel size, so the larger-kernel effect is a very common problem.

In this paper, we first explore the mechanism of the larger-kernel effect and then propose a novel low-rank-based regularization to relieve this adverse effect. Theoretically, we analyze the mechanism by which an over-estimated kernel size introduces kernel estimation error. Specifically, we reformulate the convolutions in (3) and (4) as linear transformations and analyze how their properties depend on the kernel size. We show that for $x$ drawn from sparse distributions, the larger-kernel effect remains with probability one. We also conduct simulation experiments to show that the kernel error is expected to increase with kernel size even without noise $n$. Furthermore, we seek a proper regularization to suppress noise in large kernels. By exploiting the low-rank property of blur kernels, we propose a low-rank regularization that reduces noise in $k$ and thus suppresses the larger-kernel effect. Experimental results on both synthetic and real blurry images validate the effectiveness of the proposed method and show its robustness against over-estimated kernel sizes. Our contributions are two-fold:

  • We give a thorough analysis of the mechanism by which over-estimated kernel sizes yield inferior results in blind deconvolution, a phenomenon that has received little research attention.

  • We propose a low-rank-based regularization, together with an efficient optimization algorithm, that effectively suppresses the larger-kernel effect and performs favorably on oversized blur kernels compared with the state of the art.

2 Larger-kernel effect

In this section, we describe the larger-kernel effect in detail and provide a mathematical explanation.

2.1 Phenomenon

Figure 1(b-e) shows that a larger kernel size leads to more inferior deblurring results, since an estimated blur kernel with a larger support domain is very likely to contain noise and estimation errors. Figure 1(a) shows that both the error ratio (err) [17] of the restored images and the summed squared difference (SSD) of the estimated kernels reach their lowest values at the ground-truth size and increase afterwards.

2.2 Mechanism

To analyze the source of the larger-kernel effect, we first introduce an interesting fact that we call the inflating effect.

Claim 1.

(Inflating Effect) Let $A \in \mathbb{R}^{m \times n}$ with $\mathrm{rank}(A) = n$, where $m > n$. Let $A' = [A, B]$, where $B \in \mathbb{R}^{m \times l}$ and $\mathrm{rank}(A') = n + l \le m$. Given an $m$-D random vector $y$ whose elements are i.i.d. with a continuous probability density function $p$, we have

$$P\Big(\min_{z' \in \mathbb{R}^{n+l}} \|A'z' - y\|_2^2 = \min_{z \in \mathbb{R}^{n}} \|Az - y\|_2^2\Big) = 0.$$

Proof.

The least-squares residual for a matrix $M$ with full column rank is $\|(I - MM^{+})y\|_2^2$, where $(\cdot)^{+}$ denotes the Moore-Penrose pseudo-inverse. Since the column space of $A$ is contained in that of $A'$, the two residuals coincide if and only if $(A'A'^{+} - AA^{+})y = 0$. The matrix $A'A'^{+} - AA^{+}$ is a nonzero orthogonal projection of rank $l$, so this set is a proper linear subspace of $\mathbb{R}^{m}$. Hence, the Lebesgue measure of this set is zero and, because the elements of $y$ have a continuous joint density, the probability is zero. ∎

Claim 1 shows that padding linearly independent columns onto a thin matrix almost surely leads to a different least-squares solution with a strictly lower residual squared cost.
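A quick numerical check of Claim 1 (a sketch; the matrix sizes and random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, extra = 200, 10, 5
A = rng.standard_normal((m, n))              # tall matrix with full column rank
B = rng.standard_normal((m, extra))          # padded, linearly independent columns
A_big = np.hstack([A, B])
y = rng.standard_normal(m)                   # right-hand side with a continuous density

def residual(M, y):
    z = np.linalg.lstsq(M, y, rcond=None)[0]
    return np.sum((M @ z - y) ** 2)

print(residual(A, y), residual(A_big, y))    # the second value is strictly smaller
```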

The convolution part in (1) is equivalent to linear transforms:

$$k \otimes x = T_x k = T_k x, \qquad (5)$$

where the italic letters $x$, $k$ and $y$ represent the column-wise expanded (vectorized) versions of the 2D $x$, $k$ and $y$, respectively; $T_x$ and $T_k$ are block banded Toeplitz matrices [1, 11]; and the kernel dimensions $h$ and $w$ are required to be odd.

We attribute the larger-kernel effect to one of the substeps (3) and (4). On one hand, $T_k$ remains identical when $h$ and $w$ increase, because the enlargement merely wraps a layer of zeros around $k$, so the result of the x-step stays the same. Hence, the x-step should not be blamed as the source of the larger-kernel effect. On the other hand, when the kernel size is larger, $T_x$ becomes inflated for the same $x$. In the 1D case, where $x \in \mathbb{R}^{m}$ and $k$ has odd length $h$, $T_x$ is the $m$-by-$h$ matrix

$$[T_x]_{i,j} = x_{\,i-j+\frac{h+1}{2}}, \qquad i = 1, \dots, m, \; j = 1, \dots, h, \qquad (6)$$

with $x_t = 0$ for $t \notin \{1, \dots, m\}$, so each column of $T_x$ is a shifted copy of $x$.

During the blind deconvolution iterations, for identical values of $x$, a larger $h$ introduces more columns onto both sides of $T_x$ and results in different solutions. To illustrate this point, we tested a 1D version of blind deconvolution without kernel regularization and took different values of $h$ (the ground-truth size, double, and four times the truth size) for the 50th k-step optimization after 49 truth-size iterations (see Figure 2). Figure 2(a-c) shows that the optimal solutions at different sizes differ only slightly on the main body that lies within the ground-truth size (colored in red), but greatly outside this range (colored in green), where zeros are expected. Figure 2(d-f) compares the ground truth to the estimated kernels of (a-c) after non-negativity and sum-to-one projections. Larger sizes yield more positive noise; hence, they lower the weight of the main body after the projections and change the appearance of the estimated kernel.
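A self-contained sketch of this 1D comparison (the signal and kernel below are synthetic stand-ins, not the data of Figure 2):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.cumsum(rng.laplace(scale=0.05, size=255))       # piecewise-smooth stand-in for an image row
k_true = np.exp(-0.5 * ((np.arange(23) - 11) / 3.0) ** 2); k_true /= k_true.sum()
y = np.convolve(x, k_true, mode="same") + 0.005 * rng.standard_normal(x.size)

for size in (23, 47, 93):                              # truth size, double, roughly four times
    T = np.column_stack([np.convolve(x, e, mode="same") for e in np.eye(size)])
    k_hat = np.linalg.lstsq(T, y, rcond=None)[0]       # bare least-squares k-step
    pad = (size - 23) // 2
    inside = np.abs(k_hat[pad:pad + 23]).sum()         # mass within the true support
    outside = np.abs(k_hat).sum() - inside             # mass where zeros are expected
    print(size, round(outside / max(inside, 1e-12), 3))
```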



[Figure 2 panels (a-c): estimated kernels of different sizes; panels (d-f): the corresponding normalized kernels; horizontal axes: index.]

Figure 2: Estimated kernels in 1D blind deconvolution simulations. The left column shows the optimized kernels of different sizes after the 50th iteration. The right column shows the corresponding kernels after non-negativity and sum-to-one projections. In this experiment, $x$ is a 255×1 vector extracted from a real image and the ground-truth kernel is generated by marginalizing a kernel from Levin's dataset [17]. This figure is best viewed in color.

2.3 Probability of larger-kernel effect

Even if $x$ successfully iterates to the ground truth, Claim 1 implies that the larger-kernel effect remains in the presence of random noise $n$. We show that

$$P\big(\mathrm{rank}(T_x) = hw\big) = 1, \qquad (7)$$

under which the inflating effect holds with probability one in blind deconvolution.

Figure 3: Quantitative simulations show that the error increases with kernel size. (a) The row extracted from a clear image. (b) Singular-value bounds of the kernel error together with sampled noise. (c) Synthetic sparse signals. (d) The smallest, greatest and mean singular values of the simulated matrices.

From the above, we have

(8)

Kaltofen and Lobo [13] proved that an $M$-by-$M$ Toeplitz matrix whose entries are drawn uniformly from a finite field of $q$ elements is nonsingular with probability

$$1 - \frac{1}{q}. \qquad (9)$$

Herein, clear images are statistically sparse in their derivative fields [19, 27], and the elements of the derivatives are modeled as continuous with a hyper-Laplacian distribution [14]:

$$p(v) \propto e^{-\kappa |v|^{\alpha}}, \qquad 0 < \alpha < 1. \qquad (10)$$

Then we get the following claim:

Claim 2.

For $x$ whose elements follow the continuous sparse distribution (10), the Toeplitz matrix $T_x$ has full column rank with probability one.

Proof.

See the supplementary file. ∎

So far, we have shown that for $x$ in a sparse distribution, the inflating effect happens almost surely.

2.4 Quantification of error increment

Assume $x$ iterates to the ground truth during the iterations. Then, for the estimated kernel $\hat{k}$ of the unregularized k-step, we have

$$\hat{k} = T_x^{+} y = T_x^{+}(T_x k + n) = k + T_x^{+} n, \qquad (11)$$

where $(\cdot)^{+}$ represents the Moore-Penrose pseudo-inverse. Then,

$$\|\hat{k} - k\|_2 = \|T_x^{+} n\|_2. \qquad (12)$$

Assume $\|n\|_2 = \epsilon$; then

$$\frac{\epsilon}{\sigma_{\max}(T_x)} \;\le\; \|\hat{k} - k\|_2 \;\le\; \frac{\epsilon}{\sigma_{\min}(T_x)}, \qquad (13)$$

where $\sigma_{\min}$ and $\sigma_{\max}$ represent the smallest and the greatest singular values of $T_x$, respectively (the lower bound holds when $n$ lies in the column space of $T_x$).

The inflating effect implies that a larger kernel size amplifies the error in $\hat{k}$ due to the noise $n$. To quantify this increment, we extracted a row from a clear image in Levin's set [17], as shown in Figure 3(a), and plotted the bounds in (13) against increasing kernel size. We also generated normalized random Gaussian vectors $n$ and compared $\|T_x^{+} n\|_2$ to the simulated bounds of the singular values (see Figure 3(b)). The error in $\hat{k}$ increases faster than linearly with kernel size.
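A hedged sketch of the bound (13): for a fixed synthetic 1D signal, the smallest singular value of $T_x$ shrinks as the kernel size grows, so the worst-case amplification $\epsilon/\sigma_{\min}$ grows; the signal, noise level and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.cumsum(rng.laplace(scale=0.05, size=255))   # synthetic stand-in for an image row
eps = 0.01                                         # assumed noise level ||n||_2
for size in (13, 23, 47, 69, 93):
    T = np.column_stack([np.convolve(x, e, mode="same") for e in np.eye(size)])
    s = np.linalg.svd(T, compute_uv=False)
    # upper and lower bounds of (13): eps / sigma_min and eps / sigma_max
    print(size, round(eps / s[-1], 4), round(eps / s[0], 6))
```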

In practice, differences are expected between the intermediate estimate $\hat{x}$ and the ground truth $x$. Cho and Lee [4] indicated that $\hat{x}$ should be regarded as a sparse approximation to $x$, not the ground truth. Hence,

$$y = T_{\hat{x}} k + n', \qquad n' = T_{x - \hat{x}} k + n, \qquad (14)$$

which yields the implicit noise $n'$ [25]. Assume $\|n'\|_2 = \epsilon'$; then

$$\hat{k} = T_{\hat{x}}^{+} y = k + T_{\hat{x}}^{+} n' \qquad (15)$$

and

$$\|\hat{k} - k\|_2 = \|T_{\hat{x}}^{+} n'\|_2. \qquad (16)$$

Then,

$$\frac{\epsilon'}{\sigma_{\max}(T_{\hat{x}})} \;\le\; \|\hat{k} - k\|_2 \;\le\; \frac{\epsilon'}{\sigma_{\min}(T_{\hat{x}})}. \qquad (17)$$

To quantify how the singular values of $T_{\hat{x}}$ change with kernel size, we ran 100 simulations. In each, we generated a stochastic sparse signal of length 254 under the PDF in (10) and a random Gaussian vector $n'$. Figure 3(c) shows one example of the generated signals. Figure 3(d) shows the means and standard deviations of $\sigma_{\min}$, $\sigma_{\max}$ and $\sigma_{\mathrm{mean}}$ (the average of the singular values) of the simulated $T_{\hat{x}}$ as functions of kernel size. The error of $\hat{k}$ is thus expected to grow with kernel size even without explicit noise $n$.

Figure 4: Singular values of clean kernels and noisy matrices. (a) The support domain (black) of a random positive half-Gaussian noise matrix. (b) The distribution of the singular values of (a). (c) Low-rank regularization costs of a random Gaussian noise matrix (black), a ground-truth kernel from [17] after zero-padding (red), and a Gaussian PSF (blue) as functions of kernel size. (d) Scaled (maximum normalized to 1) singular-value distributions of clean, impure and regularized kernels.

3 Low-rank regularization

Blind deconvolution is an ill-posed problem due to the lack of sufficient information. Without regularization, MAP degrades to Maximum Likelihood (ML), which yields infinitely many solutions [17]. As prior information, kernel regularization should be designed to compensate for the shortcomings of ML and to guide the optimization toward the expected result. A great number of studies focus on image regularization to describe natural images, e.g., Total Variation (TV) [16, 26, 23], hyper-Laplacian priors [14], dictionary sparsity [30, 12], patch-based low-rank priors [24], non-local similarity [5] and deep discriminative priors [18].

Unfortunately, kernel regularization has attracted much less attention in the literature. Previous works adopted various kernel regularizers, e.g., the ℓ2-norm [28, 10, 3, 29, 21], the ℓ1-norm [15, 25, 20] and the ℓα-norm [31]; however, they generally treated kernel regularization as an accessory and lacked a detailed discussion.

The larger-kernel effect is caused by noise in oversized kernels. Figures 1 and 2 show that without kernel regularization, the main bodies of the estimated kernels still emerge clearly, but noise takes up a greater proportion when the kernel size is larger. To keep $k$ clean, the regularization is expected to distinguish noise from ideal kernels efficiently.

To suppress the noise in estimated kernels, we impose a low-rank regularization on $k$ such that the k-step (4) becomes

$$k \leftarrow \arg\min_k \|k \otimes x - y\|_2^2 + \gamma \,\mathrm{rank}(K), \qquad (18)$$

where $K$ denotes the kernel in its 2D matrix form.

Because direct rank minimization is an NP-hard problem, continuous surrogate functions are required. Fazel et al. [8] proposed

$$\log\det(X + \varepsilon I) \qquad (19)$$

as a heuristic proxy for $\mathrm{rank}(X)$ on positive semidefinite matrices, where $I$ is the N-by-N identity matrix and $\varepsilon$ is a small positive number.

To allow this approximation to apply to general matrices, the low-rank objective is evaluated on $(KK^{T})^{1/2}$ [6]. The regularization function then becomes

$$L(K) = \log\det\big((KK^{T})^{1/2} + \varepsilon I\big) = \sum_i \log\big(\sigma_i(K) + \varepsilon\big), \qquad (20)$$

where $\sigma_i(K)$ is the $i$-th singular value of $K$.
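A minimal sketch of evaluating the surrogate (20) from the SVD; the value of $\varepsilon$ and the example matrices are illustrative:

```python
import numpy as np

def lowrank_cost(K, eps=1e-3):
    """Log-det surrogate of rank: sum_i log(sigma_i(K) + eps), as in Eq. (20)."""
    s = np.linalg.svd(K, compute_uv=False)
    return np.sum(np.log(s + eps))

# A smooth, nearly separable Gaussian kernel vs. a non-negative noise matrix.
h = 31
ax = np.arange(h) - h // 2
gauss = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * 4.0 ** 2)); gauss /= gauss.sum()
noise = np.abs(np.random.default_rng(0).standard_normal((h, h))); noise /= noise.sum()
print(lowrank_cost(gauss), lowrank_cost(noise))   # the noise matrix yields a higher cost
```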

Imposing a low-rank regularization on kernels is motivated by a generic property of noise matrices [1]. Figure 4(a-b) shows a non-negative Gaussian noise matrix and its singular values in decreasing order. For a noise matrix, in which light and darkness alternate irregularly, the distribution of singular values decays sharply at the lowest indices, then breaks and drags a relatively long, flat tail to the end. In contrast, ideal kernels respond to the regularization with a much lower cost (see Figure 4(c)). Based on this fact, noise matrices are distinguished from real kernels by their high cost. Figure 4(d) shows that the singular values of a low-rank regularized kernel are distributed similarly to those of the ground truth, in contrast to the impure one.

One intelligible explanation of the low-rank property of ideal kernels is the continuity of blur motions. The rank of a matrix equals the number of linearly independent rows or columns; it inversely reflects how similar these rows or columns are. The speed of a camera motion is deemed continuous [7]. Hence, the local trajectory of a blur kernel appears similar across neighboring pixels, which is measured as a low value by the continuous proxy of rank.

Figure 5: Comparison of responses to noise. The cost ratio is calculated as the cost of the noisy kernel divided by that of the clean kernel. This figure is best viewed in color.

Compared to previous norms, low-rank regularization responds more efficiently to noise. To illustrate this point, we generated noisy kernels by adding a small percentage of non-negative Gaussian noise to the real kernel. Figure 5 shows that the low-rank cost reacts rapidly to the noise, whereas the ℓα-norms barely respond. That is because ℓα only captures statistical information: an extreme example is to disrupt a ground-truth kernel by randomly permuting its elements, which leaves the ℓα cost unchanged. In contrast, the rank (the singular values) captures structural information.
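A hedged illustration of this structural-vs-statistical point (the kernel is a synthetic Gaussian rather than one from [17]; lowrank_cost is the helper from the previous sketch, repeated here as a one-liner):

```python
import numpy as np

lowrank_cost = lambda M, eps=1e-3: np.sum(np.log(np.linalg.svd(M, compute_uv=False) + eps))

rng = np.random.default_rng(0)
h = 31
ax = np.arange(h) - h // 2
K = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * 4.0 ** 2)); K /= K.sum()
K_perm = rng.permutation(K.ravel()).reshape(h, h)   # same entries, structure destroyed

alpha = 0.5
l_alpha = lambda M: np.sum(np.abs(M) ** alpha)      # entrywise l_alpha cost
print(l_alpha(K), l_alpha(K_perm))                  # identical: l_alpha is permutation-invariant
print(lowrank_cost(K), lowrank_cost(K_perm))        # the low-rank cost rises sharply
```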

4 Optimization

The function $L$ is non-convex (it is in fact concave in the singular values on $[0, \infty)$). To solve the low-rank regularized least-squares problem (18), we introduce an auxiliary variable $g$ and reformulate the optimization as

$$\min_{k, g} \|g \otimes x - y\|_2^2 + \gamma L(K) \qquad \text{s.t.} \quad g = k. \qquad (21)$$

Using the Lagrange method, (21) is solved by two alternating sub-optimizations,

$$g^{t+1} = \arg\min_{g} \|g \otimes x - y\|_2^2 + \beta\|g - k^{t}\|_2^2, \qquad
k^{t+1} = \arg\min_{k} \beta\|g^{t+1} - k\|_2^2 + \gamma L(K), \qquad (22)$$

where $t$ is the iteration number while $\beta$ and $\gamma$ are trade-off parameters.

The g-substep is convex and is solved using the Conjugate Gradient (CG) method. For the k-substep, the rank is lowered only to a limited extent; otherwise, the regularization may change the main body of the kernel—in the extreme, the kernel collapses to rank one. Thus, our strategy is to lower the rank locally around the current estimate. Using the first-order Taylor expansion of $L$ at a fixed matrix $K^{(j)}$:

$$L(K) \approx L(K^{(j)}) + \sum_i \frac{\sigma_i(K) - \sigma_i^{(j)}}{\sigma_i^{(j)} + \varepsilon}, \qquad (23)$$

where $\sigma_i^{(j)}$ is the $i$-th singular value of $K^{(j)}$, the k-substep in (22) is transformed into an iterative optimization

$$K^{(j+1)} = \arg\min_{K} \beta\|G^{t+1} - K\|_F^2 + \gamma \sum_i \frac{\sigma_i(K)}{\sigma_i^{(j)} + \varepsilon}, \qquad (24)$$

where $j$ is the inner iteration number and $G^{t+1}$ denotes the 2D matrix form of $g^{t+1}$. For convenience, we set $\gamma$ as a flag (if $\gamma = 0$, the k-substep will be skipped) and tuned only $\beta$ as the trade-off parameter.

Define the proximal mapping of a function $f$ as follows:

$$\mathrm{prox}_f(Y) = \arg\min_{X} \frac{1}{2}\|X - Y\|_F^2 + f(X). \qquad (25)$$

Dong et al. [6] proved that one solution to the proximal mapping of the weighted nuclear norm $\sum_i w_i \sigma_i(X)$ is

$$X = U \,\mathcal{S}_{w}(\Sigma)\, V^{T}, \qquad (26)$$

where $U \Sigma V^{T}$ is the SVD of $Y$ and $\mathcal{S}_{w}(\Sigma)_{ii} = \max(\Sigma_{ii} - w_i, 0)$. The local low-rank optimization is implemented as iterations of (24) and (26) with the given parameters (see Algorithm 1). In our implementation, $\beta$ is designed to grow exponentially with the outer iteration to allow more freedom for $k$ in early iterations.
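A sketch of the weighted singular value thresholding (26) and the local low-rank kernel update (24), following our reading of the reweighting scheme; the weight construction and parameter values are assumptions for illustration:

```python
import numpy as np

def weighted_svt(Y, weights):
    """One solution of the prox of the weighted nuclear norm: shrink singular values (Eq. (26))."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - weights, 0.0)) @ Vt

def lowrank_kernel_step(G, beta, gamma, eps=1e-3, inner_iters=3):
    """Local low-rank kernel update: iterate (24), reweighting from the previous singular values."""
    s_prev = np.ones(min(G.shape))                 # initial weights built from unit singular values
    K = G.copy()
    for _ in range(inner_iters):
        w = gamma / (2.0 * beta * (s_prev + eps))  # weights from the Taylor expansion (23)
        K = weighted_svt(G, w)
        s_prev = np.linalg.svd(K, compute_uv=False)
    K = np.clip(K, 0, None)
    K /= max(K.sum(), 1e-12)                       # non-negativity and sum-to-one projections
    return K
```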

Overall Implementation. We adopted the blind deconvolution scheme of [15] with small modifications and applied the non-blind deconvolution method proposed in [14] for the final restoration.

Input: $x$, $y$, current kernel $k$, parameters $\beta$, $\gamma$, $\varepsilon$, outer iterations $T_k$, inner iterations $J$
Output: updated kernel $k$
for $t = 1$ to $T_k$ do
     if $\gamma = 0$ then
          update $g$ by minimizing $\|g \otimes x - y\|_2^2$ using CG with the maximum number of iterations
     else
          update $g$ by solving the g-substep of (22) using CG
     end if
     initialize the weights with all singular values $\sigma_i^{(0)}$ equal to 1
     for $j = 0$ to $J - 1$ do
          update $K^{(j+1)}$ by the weighted singular value thresholding (26) with weights $\gamma/(2\beta(\sigma_i^{(j)} + \varepsilon))$
     end for
     project $K^{(J)}$ onto non-negative, sum-to-one kernels
     $k \leftarrow \mathrm{vec}(K^{(J)})$
end for
return $k$
Algorithm 1: Updating k with low-rank regularization
Input: blurry image $y$, kernel size $h \times w$, parameters $\beta$, $\gamma$, $\varepsilon$, number of outer iterations $T$
Output: clear image $x$ and degradation kernel $k$
Initialize $x$ from $y$
Initialize $k$ as a zero matrix with [0.5 0.5] added at the center
for $t = 1$ to $T$ do
     update $x$ using Algorithm 3 in [15]
     update $k$ using Algorithm 1
end for
$x \leftarrow$ non-blind deconvolution of $y$ with the estimated $k$ [14]
Algorithm 2: Blind Deconvolution (single-scaling version)
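A structural sketch of the g/k alternation behind Algorithms 1-2 in 2D, reusing lowrank_kernel_step from the previous sketch; the paper solves the g-substep with CG inside a multi-scale pyramid, which is replaced here by a dense least-squares proxy for clarity:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_matrix_2d(x, kshape):
    """(H*W)-by-(h*w) matrix T_x with T_x @ vec(k) = vec(same-size 2D convolution of x and k)."""
    h, w = kshape
    xp = np.pad(x, ((h // 2,) * 2, (w // 2,) * 2))
    windows = sliding_window_view(xp, (h, w))[:, :, ::-1, ::-1]   # flipped patches of x
    return windows.reshape(x.size, h * w)

def update_kernel(x, y, k, beta=1e-2, gamma=1e-4, outer_iters=3):
    """Alternate a data-fidelity step for g and a local low-rank step for k, as in (22)."""
    h, w = k.shape
    T = conv_matrix_2d(x, (h, w))
    for _ in range(outer_iters):
        # g-substep: min_g ||T g - y||^2 + beta ||g - k||^2 (CG in the paper)
        A = np.vstack([T, np.sqrt(beta) * np.eye(h * w)])
        b = np.concatenate([y.reshape(-1), np.sqrt(beta) * k.reshape(-1)])
        g = np.linalg.lstsq(A, b, rcond=None)[0]
        # k-substep: local low-rank update of the 2D kernel (helper from the previous sketch)
        k = lowrank_kernel_step(g.reshape(h, w), beta, gamma)
    return k
```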

5 Experimental Results

In this section, we first discuss the effects of low rank-based regularization, then evaluate the proposed method on benchmark datasets, and finally demonstrate its effectiveness on real-world blurry images. The source code is available at https://github.com/lisiyaoATbnu/low_rank_kernel.

[Figure 6 panels: size = 23, err = 1.55; size = 47, err = 1.56; size = 69, err = 2.14.]

Figure 6: Deblurring results using low-rank regularization.
[Figure 7 panels: Blurry, Low rank, None.]

Figure 7: Comparison of different kernel priors on real-world images with a large kernel size. It is recommended to zoom in.
Figure 8: Success rates at the ground-truth size (a) and double size (b).

5.1 Effects of low rank-based regularization

In response to the high error ratios of large kernels in Figure 1, we repeat that experiment using the same parameters except for the newly introduced regularization weights. Figure 6 shows that the low-rank regularized kernels are much more robust to kernel size: noise in the kernels is efficiently reduced and the quality of the restored images is enhanced. We further verify this on real-world images by imposing different regularization terms. As shown in Figure 7, blur kernels with low-rank regularization contain less noise, while the others suffer from strong noise, yielding artifacts in the deblurred images. We note that in the experiments of Figures 6 and 7, we deliberately omitted the multi-scaling scheme to expose the effectiveness of the low-rank regularization itself.

5.2 Evaluation on synthetic dataset

The proposed method is quantitatively evaluated on the dataset from [17]. Figure 8 shows the success rates of state-of-the-art methods versus our implementations with and without (setting the regularization weights to zero) low-rank regularization. The average PSNRs corresponding to Figure 8 at different kernel sizes are compared in Table 1. Parameters are fixed throughout the whole experiment, and a 7-layer multi-scaling pyramid is used. Kernel elements smaller than 1/20 of the maximum are cut to zero, as is also done in [3, 14]. Low-rank regularization works more effectively than the regularization-free implementation and the state of the art.
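A one-line sketch of this pruning step (the renormalization afterwards is our assumption):

```python
import numpy as np

def prune_kernel(k, frac=1 / 20):
    """Zero out kernel elements below `frac` of the maximum, then renormalize to sum to one."""
    k = np.where(k < frac * k.max(), 0.0, k)
    return k / max(k.sum(), 1e-12)
```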

Method             truth size    double size
[22]               27.34         23.29
[3]                26.85         25.74
[28]               26.91         26.71
[25]               26.54         26.44
[15]               25.34         23.95
[31]               26.58         26.83
Ours (none)        26.68         23.85
Ours (low rank)    27.36         27.47

Table 1: Average PSNRs (dB) at the ground-truth and double kernel sizes in the experiments of Figure 8.
[Figure 9 panels: Blurry; [3], [28], [15] and [31] at two kernel sizes each; None (ours); Low rank (ours) at two kernel sizes.]

Figure 9: Test on the real-world image roma. The support domain (positive parts) of each estimated kernel is displayed at the bottom-right corner of the corresponding restored image.
[Figure 10 panels: Blurry; [3] with 1/20-max threshold; [28] with heuristic domain detector; [15] with 1/20-max threshold; [31] with no refinement; Low rank (ours) with 1/20-max threshold.]

Figure 10: Test on the real-world image postcard. Kernel regularizations and refinement methods are listed under the restored images.

5.3 Evaluation on real-world blurry images

We compared our implementation to state-of-the-art methods on real-world images to reveal the robustness of low-rank regularization to large kernel sizes. Specifically, [28] adopts a heuristic iterative support-domain detector based on the differences of kernel elements, which is regarded as more effective than the 1/20-max threshold. Figure 9 shows that a large kernel size yields strong noise in the estimated kernels of previous works [3, 28], and even changes the main bodies of the kernels [15, 31]. In contrast, low-rank regularization keeps the kernel relatively stable at the larger size. One more comparison of different regularizations and refinement methods at a large kernel size is shown in Figure 10. As for computational efficiency, our method takes about 85 s on a Lenovo ThinkCentre computer with a Core i7 processor to process the test images.

6 Conclusion

In this paper, we demonstrate that over-estimated kernel sizes produce increased noise in the estimated kernel. We attribute this larger-kernel effect to the inflating effect. To reduce it, we propose a low-rank-based regularization on the kernel, which suppresses noise while preserving the main body of the optimized kernel.

The success of blind deconvolution depends on many factors. In practical implementations, even for a noise-free $y$, the intermediate $\hat{x}$ is unlikely to reach the ground truth; hence, part of the residual will act as implicit noise, which may intensify the effect even beyond our analysis and requires future research.

Acknowledgement

This work is supported by grants from the National Natural Science Foundation of China (61472043) and the National Key R&D Program of China (2017YFC1502505). We thank Ping Guo for constructive conversations. Qian Yin is the corresponding author.

References

  • [1] H. C. Andrews and B. R. Hunt. Digital image restoration, chapter 5.2, pages 102–103. Prentice-Hall, Englewood Cliffs, NJ, 1977.
  • [2] T. F. Chan and C.-K. Wong. Total variation blind deconvolution. IEEE Trans. Image Process., 7(3):370–375, 1998.
  • [3] S. Cho and S. Lee. Fast motion deblurring. ACM Trans. Graph., 28(5):145, 2009.
  • [4] S. Cho and S. Lee. Convergence analysis of map based blur kernel estimation. arXiv preprint arXiv:1611.07752, 2016.
  • [5] W. Dong, G. Shi, and X. Li. Nonlocal image restoration with bilateral variance estimation: a low-rank approach. IEEE Trans. Image Process., 22(2):700–711, 2013.
  • [6] W. Dong, G. Shi, X. Li, Y. Ma, and F. Huang. Compressive sensing via nonlocal low-rank regularization. IEEE Trans. Image Process., 23(8):3618–3632, 2014.
  • [7] L. Fang, H. Liu, F. Wu, X. Sun, and H. Li. Separable kernel for image deblurring. In CVPR, pages 2885–2892. IEEE, 2014.
  • [8] M. Fazel, H. Hindi, and S. P. Boyd. Log-det heuristic for matrix rank minimization with applications to hankel and euclidean distance matrices. In American Control Conf. (ACC), volume 3, pages 2156–2162, 2003.
  • [9] R. Fergus, B. Singh, A. Hertzmann, S. T. Roweis, and W. T. Freeman. Removing camera shake from a single photograph. In ACM Trans. Graph., volume 25, pages 787–794, 2006.
  • [10] D. Gong, M. Tan, Y. Zhang, A. Van den Hengel, and Q. Shi. Blind image deconvolution by automatic gradient activation. In CVPR, pages 1827–1836, 2016.
  • [11] R. M. Gray. Toeplitz and circulant matrices: A review. Foundations and Trends in Communication and Information Theory, 2(3):155–239, 2006.
  • [12] Z. Hu, J.-B. Huang, and M.-H. Yang. Single image deblurring with adaptive dictionary learning. In ICIP, pages 1169–1172. IEEE, 2010.
  • [13] E. Kaltofen and A. Lobo. On rank properties of Toeplitz matrices over finite fields. In Int. Symp. Symbolic and Algebraic Computation (ISSAC), pages 241–249, 1996.
  • [14] D. Krishnan and R. Fergus. Fast image deconvolution using hyper-laplacian priors. In NIPS, pages 1033–1041, 2009.
  • [15] D. Krishnan, T. Tay, and R. Fergus. Blind deconvolution using a normalized sparsity measure. In CVPR, pages 233–240, 2011.
  • [16] A. Levin, R. Fergus, F. Durand, and W. T. Freeman. Image and depth from a conventional camera with a coded aperture. ACM Trans. Graph., 26(3):70, 2007.
  • [17] A. Levin, Y. Weiss, F. Durand, and W. T. Freeman. Understanding and evaluating blind deconvolution algorithms. In CVPR, pages 1964–1971, 2009.
  • [18] L. Li, J. Pan, W.-S. Lai, C. Gao, N. Sang, and M.-H. Yang. Learning a discriminative prior for blind image deblurring. In CVPR, pages 6616–6625. IEEE, 2018.
  • [19] B. A. Olshausen and D. J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583):607, 1996.
  • [20] J. Pan, Z. Lin, Z. Su, and M.-H. Yang. Robust kernel estimation with outliers handling for image deblurring. In CVPR, pages 2800–2808, 2016.
  • [21] J. Pan, D. Sun, H. Pfister, and M.-H. Yang. Blind image deblurring using dark channel prior. In CVPR, pages 1628–1636, 2016.
  • [22] D. Perrone and P. Favaro. Total variation blind deconvolution: The devil is in the details. In CVPR, pages 2909–2916, 2014.
  • [23] D. Ren, H. Zhang, D. Zhang, and W. Zuo. Fast total-variation based image restoration based on derivative augmented lagrangian method. Neurocomputing, 2015.
  • [24] W. Ren, X. Cao, J. Pan, X. Guo, W. Zuo, and M.-H. Yang. Image deblurring via enhanced low-rank prior. IEEE Transactions on Image Processing, 25(7):3426–3437, 2016.
  • [25] Q. Shan, J. Jia, and A. Agarwala. High-quality motion deblurring from a single image. ACM Trans. Graph., 27(3):73, 2008.
  • [26] Y. Wang, J. Yang, W. Yin, and Y. Zhang. A new alternating minimization algorithm for total variation image reconstruction. SIAM Journal on Imaging Sciences, 1(3):248–272, 2008.
  • [27] Y. Weiss and W. T. Freeman. What makes a good model of natural images? In CVPR, pages 1–8, 2007.
  • [28] L. Xu and J. Jia. Two-phase kernel estimation for robust motion deblurring. In ECCV, pages 157–170, 2010.
  • [29] L. Xu, S. Zheng, and J. Jia. Unnatural l0 sparse representation for natural image deblurring. In CVPR, pages 1107–1114, 2013.
  • [30] X. Zhang, M. Burger, X. Bresson, and S. Osher. Bregmanized nonlocal regularization for deconvolution and sparse reconstruction. SIAM Journal on Imaging Sciences, 3(3):253–276, 2010.
  • [31] W. Zuo, D. Ren, S. Gu, L. Lin, and L. Zhang. Discriminative learning of iteration-wise priors for blind deconvolution. In CVPR, pages 3232–3240, 2015.

Supplementary File: Proof of Theorem 2

Assume the kernel size to be odd. Then,

(27)
(28)

For any $M$-by-$M$ matrix $A$, $\det(A) \neq 0$ if and only if $\mathrm{rank}(A) = M$. Thus,

(29)

As far as we know, an explicit formula for the determinant of a Toeplitz matrix in terms of its elements is not available in the current literature. Li [1] gives a concrete expression of the determinant by using LU factorization, but it fails to fit all situations. However, it can be shown that the determinant equals a multivariate polynomial function of the entries without manipulating the whole expression. By using the Laplace expansion of the determinant, the term of largest degree in the main-diagonal entry is its $M$-th power with coefficient 1.

Lemma.

Let $X$ be a continuous r.v. with finite support domain $[a, b]$. Let $f(X) = X^{d} + q(X)$ be a polynomial function, where $q$ is a finite polynomial with degree less than $d \ge 1$. Generate a new r.v. $Y = f(X)$. Then, for any $y$, the Cumulative Distribution Function (CDF) of $Y$ is continuous at $y$.

Proof.

The CDF of $Y$ is continuous at $y$ if and only if $P(Y = y) = P(f(X) = y) = 0$. For any fixed $y$, $f(X) - y$ is a non-constant polynomial of degree $d$, so its zero set $Z_y = \{x : f(x) = y\}$ contains at most $d$ points; hence, the Lebesgue measure of $Z_y$ is zero. Because $X$ has a continuous density $p$ on $[a, b]$, we have

$$P(f(X) = y) = \int_{Z_y \cap [a, b]} p(x)\, dx = 0,$$

where the limit exchange needed to pass from $P(y - \delta < Y \le y)$ to $P(Y = y)$ as $\delta \to 0$ is justified by Beppo Levi's theorem.

Thus the CDF of $Y$ is continuous at $y$. ∎

Theorem 2.

Let $X$ be a continuous r.v. with the PDF in (10). For a sample of independent observations, generate the new r.v. $Y = \det(T)$, where $T$ is the square Toeplitz matrix built from the observations. Then, $P(Y = 0) = 0$.

Proof.

Condition on all observations except the main-diagonal entry. Based on the Law of Total Probability and the Dominated Convergence Theorem,

$$P(Y = 0) = \mathbb{E}\big[\, P\big(\det(T) = 0 \mid \text{all entries but the main-diagonal one}\big) \,\big],$$

where, conditionally, $\det(T)$ equals the $M$-th power of the main-diagonal entry plus $q(\cdot)$, and $q$ is a polynomial function with degree less than $M$. Based on the Lemma, each conditional probability is zero.

Hence, $P(Y = 0) = 0$. ∎

References

  • [1] H. Li. On calculating the determinants of Toeplitz matrices. J. Appl. Math. Bioinformatics, 1(1):55, 2011.