## 1 Introduction

Computer-aided automatic defect detection has been successfully used in industry to ensure product quality and enable timely maintenance. With the development of unmanned vehicles such as drones, more images are taken under imperfect settings. The resulting images may show rough surfaces or be blurred by motion during image capture, which leads to serious over-identification of abnormalities. This problem directly impedes the development of outdoor defect detection, such as hail damage detection, especially in the roofing industry, where the defects occur on rough roof shingles and over-identification can significantly increase the cost of maintenance.

To label and extract defects automatically, one useful tool is edge detection, which identifies the points at locations of significant local intensity changes [1]. Commonly used edge detectors include the gradient-based Sobel [2], Roberts [3], and Prewitt [4] detectors, the second-derivative Laplacian of Gaussian detector [5], and the Canny detector [6]. Because of its multi-stage refinements, the Canny detector often performs better than the others [7]. As an application, we use the Canny procedure [6] to search for the edges surrounding defects on roof shingles. Note that images taken in motion are inevitably blurred due to unstable movements in the image capture process. These motion blurs degrade the appearance of the surfaces, and in turn increase the difficulty of suppressing the non-edge pixels. To see this, we implement the Canny detector on three blurred images of non-defect shingles. The right panel in Figure 2 shows that the Canny detector is substantially misled by the motion blurs, while the proposed Bayesian detector (to be discussed later) is robust in removing the non-edge pixels from the blurred images.
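For concreteness, the gradient step shared by the detectors above can be sketched with the standard Sobel weights; the toy image and parameter choices below are ours, not from the paper:

```python
import numpy as np

def sobel_gradients(img):
    """Horizontal/vertical Sobel gradients via direct convolution (NumPy only)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    H, W = img.shape
    gx = np.zeros((H - 2, W - 2))
    gy = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(kx * patch)   # horizontal intensity change
            gy[i, j] = np.sum(ky * patch)   # vertical intensity change
    return gx, gy

# A vertical step edge: left half dark, right half bright.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
gx, gy = sobel_gradients(img)
mag = np.hypot(gx, gy)
# The gradient magnitude peaks along the step; gy vanishes because the
# image is constant in the vertical direction.
print(mag.max(), np.abs(gy).max())
```

Thresholding this magnitude map is essentially what the gradient-based detectors do; the Canny procedure adds Gaussian smoothing, non-maximum suppression, and hysteresis tracking on top of it.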

*(Figure 1: example images with the corresponding Bayesian and Canny detector outputs; the caption reports the Gaussian smoothing standard deviation.)*

Controlling the false discovery rate is particularly important for drone image analysis. Because drones take pictures in motion, some images are inevitably blurred, as illustrated in Figure 2. It is worth mentioning that for insurance companies, a false discovery is a more serious problem than a false non-discovery of damage, because no insurance company is willing to cover the cost of “fixing” undamaged areas.

*(Figure 2: blurred non-defect shingle images with the corresponding Bayesian and Canny detector outputs.)*

To explore the reason for these failures, we rephrase edge detection as a hypothesis testing problem, where at the $(i,j)$th pixel the null hypothesis is

$$H_0: \mu_x = \mu_y = 0,$$

and the alternative hypothesis is $H_1: \mu_x \neq 0$ or $\mu_y \neq 0$. Here $\mu_x$ and $\mu_y$ are the means of the partial differences in the horizontal and vertical directions, respectively. The Canny method uses the sample mean difference as the test statistic for evaluating the hypothesis. However, this standard frequentist test can never lead to the conclusion of accepting the null hypothesis, and hence it is expected that the Canny method is ineffective in eliminating the non-edge pixels. In contrast, the Bayesian hypothesis test results in the calculation of the posterior probability that the null hypothesis is true. Therefore, we resort to Bayesian methods to determine whether a specific pixel is a non-edge point.

Under the Bayesian paradigm, the Bayes factor is commonly used to test whether the null hypothesis is true. The performance of Bayes factors heavily relies on proper specifications of the prior distributions; in particular, Bayes factors require the priors to be proper (the prior integrates to one). To this end, various prior distributions have been proposed to facilitate computation of the Bayes factors, including local priors [8, 9, 10, 11, 12], fractional Bayes priors [13, 14, 15, 8], and intrinsic priors [16, 17, 18, 19]. These priors assign non-negligible probabilities to regions consistent with the null hypothesis, resulting in an asymmetric accumulation of evidence in favor of the true alternative and the true null hypothesis. To address this issue, [20] and [21] proposed non-local priors, which assign zero density to the regions corresponding to the null hypothesis. [22] further introduced the non-local alternative priors, which are flexible in specifying the rate at which the prior approaches 0. These non-local priors balance the rates of convergence of the Bayes factors in favor of the true null and alternative hypotheses. [23] later applied these non-local priors to model selection in high-dimensional settings. In addition, [24] recently considered parameter estimation with non-local priors in high-dimensional settings.

The non-local moment and inverse moment priors [22] have been successfully implemented in hypothesis testing, model selection, and variable selection contexts [23, 25]. These two seemingly unrelated priors share the same property that the densities approach zero at the origin. Realizing this connection, we propose a general framework for constructing non-local priors, namely the reflected non-local priors. The unified framework is more interpretable and accommodates both the moment and inverse moment priors. Further, the convergence of the Bayes factor is fully characterized by a single parameter, which yields a systematic routine for adjusting the Bayes factors. The reflected non-local prior has wide applications in hypothesis testing and variable selection. As discussed earlier, when the null hypothesis is true, the rate at which the Bayes factor converges to 0 largely depends on the choice of the prior, and the non-local priors have faster convergence rates than the local priors [22]. Furthermore, with proper choices of parameters, the reflected non-local priors achieve the same convergence rates as the moment and the inverse moment priors. In addition to the merits in tuning and interpretation, the reflected non-local prior is a superior choice in the hail edge detection problem.

The rest of the article is organized as follows. In Section 2, we introduce the reflected non-local priors in a general setting, derive their asymptotic properties, and investigate their finite sample properties. In Section 3, we develop the Bayesian defect detector and investigate its application in edge detection through simulation. Section 4 presents the hail damage detection data analysis. We conclude with some discussion in Section 5. The theoretical derivations are presented in the Appendix.

## 2 Reflected non-local priors

### 2.1 A truncated reflected non-local prior

Let $\mathbf{x}_n = (x_1, \ldots, x_n)$ denote a sample from a $p$-dimensional random variable with likelihood

$$L(\boldsymbol{\theta} \mid \mathbf{x}_n) = \prod_{i=1}^{n} f(x_i \mid \boldsymbol{\theta}),$$

where $f$ is the density function of $x_i$ and $\boldsymbol{\theta}$ is the parameter of interest, $\boldsymbol{\theta} \in \Theta$. Under the Bayesian paradigm, we define the null and alternative hypotheses as

$$H_0: \boldsymbol{\theta} \sim \pi_0(\boldsymbol{\theta}) \quad \text{versus} \quad H_1: \boldsymbol{\theta} \sim \pi_1(\boldsymbol{\theta}),$$

where $\pi_0$ and $\pi_1$ are the priors of the parameter under the null and alternative hypotheses, respectively. The marginal density with the prior $\pi_k$ is given by

$$m_k(\mathbf{x}_n) = \int_{\Theta} L(\boldsymbol{\theta} \mid \mathbf{x}_n)\, \pi_k(\boldsymbol{\theta})\, d\boldsymbol{\theta}$$

for $k = 0, 1$. The Bayes factor based on a sample of size $n$ is defined as

$$\mathrm{BF}_{10}(\mathbf{x}_n) = \frac{m_1(\mathbf{x}_n)}{m_0(\mathbf{x}_n)}. \tag{1}$$

If for every $\epsilon > 0$ there is $\zeta > 0$ such that $\pi_1(\boldsymbol{\theta}) < \epsilon$ for all $\boldsymbol{\theta} \in \Theta$ with $\|\boldsymbol{\theta} - \boldsymbol{\theta}_0\| < \zeta$ [22], then $\pi_1$ is named a non-local prior. For ease of exposition, we discuss the setting when $p = 1$; that is, $\boldsymbol{\theta}$ is a scalar, denoted by $\theta$. We focus on testing a point null hypothesis $H_0: \theta = \theta_0$ with the null prior $\pi_0(\theta) = \delta_{\theta_0}(\theta)$, where $\delta_{\theta_0}$ is a Dirac measure. Further, we specify the prior under the alternative hypothesis to be a non-local prior.

To motivate our non-local prior, first note that if we flip a bounded density upside down, e.g., a zero-mean normal distribution, after proper normalization we can obtain a density with zero mass at the maximum point of the original density. For example, if we consider a standard normal density $\phi(\theta)$, then $\pi(\theta) = C\{\phi(0) - \phi(\theta)\}\,1(|\theta| \le t)$ is a non-local prior for $\theta$ as it places zero mass at the origin, where $C$ is the normalizing constant. Considering a generalized normal density, we define

$$h(\theta) = \exp\{-(\theta^2/\tau)^{\nu}\},$$

which is the kernel of a generalized normal density. When $\nu = 1$, it reduces to the kernel of the normal density with mean $0$ and variance $\tau/2$. As a result, we define the reflected non-local prior as

$$\pi(\theta) = C\,\{1 - \exp(-(\theta^2/\tau)^{\nu})\}\,1(|\theta| \le t), \tag{2}$$

where $C$ is the normalizing constant.

It is worth mentioning that by the Taylor expansion, we can write

$$1 - \exp\{-(\theta^2/\tau)^{\nu}\} = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k!} \left(\frac{\theta^2}{\tau}\right)^{\nu k}.$$

The $k = 1$ summand, $(\theta^2/\tau)^{\nu}$, is the main part of [22]'s moment prior, which forces the prior to be 0 at the hypothetical true value. This suggests that the reflected non-local prior would have similar performances as [22]'s moment priors in the neighborhood of the origin.

One major difference between the reflected non-local prior and [22]'s priors lies in the treatment of the tails of the density. [22] allow the prior density to go to 0 gradually, while we force the density to be 0 beyond the truncation endpoints. Note that the truncation is induced by the indicator function, which does not affect the behavior of the density around the origin. Therefore, if the parameters are unbounded, we can use a smooth function in place of the indicator function. This yields a generalized reflected prior, as discussed in the next section.
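To make the construction concrete, the following sketch numerically normalizes a truncated reflected prior, assuming the generalized-normal kernel takes the form $\exp\{-(\theta^2/\tau)^{\nu}\}$; the values of $\tau$, $\nu$, and the truncation point are illustrative only:

```python
import numpy as np

def reflected_prior(theta, tau=1.0, nu=1.0, t=3.0):
    """Unnormalized truncated reflected non-local prior: the 'flipped'
    generalized-normal kernel, zero at the origin, truncated at |theta| = t."""
    kern = 1.0 - np.exp(-(theta**2 / tau) ** nu)
    return np.where(np.abs(theta) <= t, kern, 0.0)

theta = np.linspace(-3.0, 3.0, 20001)
dtheta = theta[1] - theta[0]
dens = reflected_prior(theta)
C = 1.0 / (dens.sum() * dtheta)   # numerical normalizing constant
pi = C * dens

print(pi[theta.size // 2])        # density at the origin: 0 (non-local property)
print(pi.sum() * dtheta)          # integrates to ~1 after normalization
```

The hard truncation guarantees integrability of the flipped kernel; Section 2.2 replaces the indicator with a smooth decay.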

### 2.2 A generalized reflected prior

We generalize the definition of the reflected non-local prior so that it mimics the performance of both the moment prior and the inverse moment prior in [22]. To cover broader cases, we allow $\nu$ to be negative in the definition of the kernel; that is, $\nu \in \mathbb{R} \setminus \{0\}$ in $h(\theta) = \exp\{-(\theta^2/\tau)^{\nu}\}$. Hence, our generalized reflected non-local prior is

$$\pi(\theta) = C\,\{c - \exp(-(\theta^2/\tau)^{\nu})\}\, g(\theta), \qquad c = \lim_{\theta \to 0} \exp\{-(\theta^2/\tau)^{\nu}\}, \tag{3}$$

where $C$ is the normalizing constant. Note that $\nu < 0$ leads to $c = 0$. Hence, although $c - \exp\{-(\theta^2/\tau)^{\nu}\} < 0$ when $\nu < 0$, the normalizing constant helps to retain the positive sign of $\pi(\theta)$. Further, to allow various tail behaviors, instead of truncating the distribution, we use a generic function $g(\theta)$ to force the density to be $0$ as $|\theta| \to \infty$. On the other hand, to preserve the property of the non-local prior, we require $g(\theta) > 0$ in the neighborhood of the origin.

When $\boldsymbol{\theta}$ is a $p$-dimensional vector, we define the multi-dimensional reflected non-local prior as

$$\pi(\boldsymbol{\theta}) = C\,\{c - h(\boldsymbol{\theta})\}\, g(\boldsymbol{\theta}), \qquad h(\boldsymbol{\theta}) = \exp\{-(\boldsymbol{\theta}^{\top}\boldsymbol{\theta}/\tau)^{\nu}\}, \tag{4}$$

where $C$ is a normalizing constant, and $h$ is the kernel of the generalized multivariate Gaussian density.

Following the same arguments as in (8) of [22], the Bayes factor in (1) converges to infinity exponentially fast under the true alternative hypothesis. On the other hand, under the true null hypothesis, we establish its property as follows.

###### Theorem 1.

Consider testing the null hypothesis $H_0: \boldsymbol{\theta} = \boldsymbol{\theta}_0$ versus the alternative hypothesis $H_1: \boldsymbol{\theta} \sim \pi_1(\boldsymbol{\theta})$, where $\pi_1$ is the $p$-dimensional generalized reflected non-local prior in (4). Under the regularity conditions stated in the Appendix, when the null hypothesis is true, if $\nu > 0$, then the Bayes factor vanishes at the polynomial rate attained by the moment priors; if $\nu < 0$, then it vanishes at the faster, near-exponential rate attained by the inverse moment priors.

From the theorem, we conclude that when $\nu > 0$ and $\nu < 0$, the generalized reflected priors have similar asymptotic behaviors as the moment and inverse moment priors, respectively [22]. Hence, the generalized reflected construction provides a unified way to define non-local priors that accommodates both the moment and inverse moment priors. Further, the simple structure facilitates a systematic investigation of their finite sample performances.

### 2.3 Finite sample properties

In Figure 3, we show the shapes of the univariate truncated and generalized reflected priors. As $\theta$ approaches the origin, the densities with $\nu < 0$ start declining earlier than those with $\nu > 0$. This explains the phenomenon shown in Theorem 1 that when $\nu < 0$ the reflected prior has a better convergence rate under the null hypothesis. Further, the decreasing rate increases with $|\nu|$, which verifies the convergence order in Theorem 1. In addition, by introducing the smooth function $g$, the tails of the density diminish to 0 as $\theta$ deviates from the origin.

We further explore the weight of evidence, $\log \mathrm{BF}_{10}$, by simulating samples from the normal distribution with mean $\mu$ and variance 1. We use the priors in (2) and (3) with several choices of $\nu$ and $\tau$ to construct Bayes factors. The left panel of Figure 4 shows that under the true null hypothesis $\mu = 0$, $\log \mathrm{BF}_{10}$ decreases as the sample size increases, and the decrease is faster for the priors that decline faster at the origin. In the right panel of Figure 4, we study $\log \mathrm{BF}_{10}$ for different values of $\mu$ under the true alternative. When $\mu$ is close to 0, the evidence accumulates more slowly for the priors with faster declination at the origin, but the corresponding $\log \mathrm{BF}_{10}$ grows quickly once $\mu$ is sufficiently far from 0. Overall, by choosing a non-local prior with faster declination at the origin, we can substantially improve the convergence rate of the Bayes factor under the true null hypothesis, while limiting the loss of convergence to an acceptable range under the true alternative hypothesis.
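This simulation can be reproduced in miniature. The sketch below computes the log Bayes factor for a point-null normal-mean test by numerical integration, assuming the reflected kernel $1 - \exp\{-(\mu^2/\tau)^{\nu}\}$ on a truncated support; the sample sizes, grid, and parameter values are illustrative, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_bf(x, tau=1.0, nu=1.0, t=3.0):
    """log BF_10 for H0: mu = 0 vs H1: mu ~ truncated reflected prior,
    with N(mu, 1) data; the marginal under H1 is a grid integral."""
    mu = np.linspace(-t, t, 4001)
    dmu = mu[1] - mu[0]
    kern = 1.0 - np.exp(-(mu**2 / tau) ** nu)
    prior = kern / (kern.sum() * dmu)              # normalized on [-t, t]
    n, xbar, ss = len(x), x.mean(), np.sum(x**2)
    # log-likelihood at each grid value of mu (shared constants dropped)
    loglik = -0.5 * (ss - 2 * n * xbar * mu + n * mu**2)
    m1 = np.log(np.sum(np.exp(loglik - loglik.max()) * prior) * dmu) + loglik.max()
    m0 = -0.5 * ss                                 # log-likelihood at mu = 0
    return m1 - m0

# Under the true null (mu = 0), the average log Bayes factor drifts
# further below zero as the sample size grows.
small = np.mean([log_bf(rng.normal(0.0, 1.0, 20)) for _ in range(30)])
large = np.mean([log_bf(rng.normal(0.0, 1.0, 500)) for _ in range(30)])
print(small, large)
```

The downward drift of the average with $n$ is the finite-sample counterpart of the null-hypothesis convergence in Theorem 1.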

## 3 Bayesian defect detector

Let $Z$ denote the image smoothed by the Gaussian filter. For testing whether the $(i,j)$th pixel is on the defect edge, we collect sample differences along the $x$-axis and along the $y$-axis from the pixels surrounding $(i,j)$. Further, let $\mathbf{d} = (d_x, d_y)^{\top}$ collect the horizontal and vertical differences, and we assume that $\mathbf{d}$ follows a bivariate normal distribution with mean $\boldsymbol{\mu} = (\mu_x, \mu_y)^{\top}$ and a common variance. Under this setting, we denote the logarithm of the likelihood for $\boldsymbol{\mu}$ as $\ell(\boldsymbol{\mu})$. Note that the case of $(i,j)$ not on the edge corresponds to the null hypothesis $H_0: \boldsymbol{\mu} = \mathbf{0}$, and on the edge corresponds to the alternative hypothesis $H_1: \boldsymbol{\mu} \sim \pi_1(\boldsymbol{\mu})$, with dimension $p = 2$. Here $\pi_1$ is the prior distribution at the edge pixels. Then we can write the Bayes factor for the two hypotheses as in (1).

In the implementation, we select a total of 8 surrounding pixels for each location $(i,j)$, i.e., the $3 \times 3$ neighborhood excluding the center. Further, we estimate the variance of the differences by 2 times the sample variance of the smoothed intensities over all pixels. After obtaining the Bayes factor for each pixel, we perform non-maximum suppression and thresholding to thin the edges. In the non-maximum suppression procedure, we keep the pixels whose Bayes factors are maximal in their neighborhoods as potential edge points, and set the rest of the pixels to the smallest Bayes factor value among all locations. Then we use the k-means procedure to split the resulting values into two clusters, and select the maximum value in the cluster with the lower average as the threshold. The edge pixels are those whose Bayes factors exceed the threshold. For comparison, we implement the standard non-maximum suppression, thresholding, and edge tracking in the Canny procedure [6]. Note that the Canny detector uses the gradient magnitude, computed with fixed weight vectors, as the statistic to detect edges.

### 3.1 Simulation studies

We perform simulation studies to evaluate the proposed method in detecting edges under settings with various signal-to-noise ratios. The simulation samples are generated by adding white noise with standard deviations 0.2, 0.5, 1, 1.2, and 1.5 to a grayscale image. In all the simulations, we fix the prior parameters $\nu$ and $\tau$. We present the edge detection results along with the noised images in Figure 5 from a single simulation run. Figure 5 shows that the Bayesian detector and the Canny detector have similar performances when the error standard deviations are below 0.5. However, with larger error variations, the Canny detector picks up a substantial number of non-edge pixels. In contrast, the Bayesian detector shows stable performance in eliminating the non-edge pixels.

*(Figure 5: contaminated images with the corresponding Bayesian and Canny detector outputs.)*
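The non-maximum suppression and two-cluster thresholding steps used by the Bayesian detector can be sketched as follows; this is a simplified stand-in on toy scores, not the authors' implementation:

```python
import numpy as np

def two_means_threshold(scores, iters=50):
    """Split scores into two clusters by 1-D 2-means; return the largest
    value in the lower-mean cluster as the threshold."""
    c = np.array([scores.min(), scores.max()], dtype=float)  # extreme initial centers
    for _ in range(iters):
        assign = np.abs(scores[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(assign == k):
                c[k] = scores[assign == k].mean()
    return scores[assign == c.argmin()].max()

def nonmax_suppress(score):
    """Keep pixels maximal in their 3x3 neighborhood; set the rest to the
    global minimum score (a simplification of the procedure in Section 3)."""
    H, W = score.shape
    out = np.full_like(score, score.min())
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            if score[i, j] == score[i-1:i+2, j-1:j+2].max():
                out[i, j] = score[i, j]
    return out

# Toy score map: a bright ridge down one column plus background noise.
rng = np.random.default_rng(2)
score = rng.normal(0.0, 0.1, (12, 12))
score[:, 6] += 5.0
kept = nonmax_suppress(score)
thr = two_means_threshold(kept.ravel())
edges = kept > thr
print(edges[:, 6].sum(), edges[:, :5].sum())  # ridge survives, background suppressed
```

On this toy map, the ridge pixels end up in the high-mean cluster and survive the threshold, while the noise-only columns are fully suppressed.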

We further compare the Bayesian and Canny detectors over 100 simulation runs. Let $\widehat{E}$ be the set containing the selected edge pixels, and $E$ be the one containing the true edge pixels. A point $s \in \widehat{E}$ is considered an edge point if $\|s - s'\|_2 \le \delta$ for at least one $s' \in E$, where $\delta$ is the cutoff value to declare that two points are from different regions and $\|\cdot\|_2$ is the Euclidean norm. We define

$$P = \frac{\bigl|\{s \in \widehat{E}: \min_{s' \in E} \|s - s'\|_2 \le \delta\}\bigr|}{|\widehat{E}|}
\quad \text{and} \quad
R = \frac{\bigl|\{s' \in E: \min_{s \in \widehat{E}} \|s - s'\|_2 \le \delta\}\bigr|}{|E|}.$$

Clearly, $P$ represents the proportion of the selected pixels that are true edge points, while $R$ represents the proportion of the true pixels that are selected. The ideal case is $P = R = 1$. However, when the noise variation increases, there are tradeoffs between $P$ and $R$. In Figures 6 and 7, we present the box-plots of $P$ and $R$ based on the 100 simulations. It is clear that the Bayesian detector outperforms the Canny detector in terms of significantly larger $P$ values. On the other side, to achieve a smaller false discovery rate, the Bayesian detector inevitably sacrifices the recovery of some true edge pixels, leading to smaller $R$ values compared to the Canny detector. Nevertheless, it still yields a high $R$ on average, which is sufficient to describe the shape of the damage in practice.
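The tolerance-based matching behind these two proportions (selected pixels near a true edge point, and true edge points near a selected pixel) can be sketched directly; the point sets and the cutoff value of 1 below are illustrative:

```python
import numpy as np

def match_rate(A, B, delta=1.0):
    """Fraction of points in A within Euclidean distance delta of some point in B."""
    if len(A) == 0:
        return 0.0
    A, B = np.asarray(A, float), np.asarray(B, float)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise distances
    return float(np.mean(d.min(axis=1) <= delta))

true_edge = [(i, 5) for i in range(10)]            # a vertical true edge
detected = [(i, 5) for i in range(8)] + [(0, 0)]   # misses two points, one false alarm
P = match_rate(detected, true_edge)  # precision-like: selected pixels that are true
R = match_rate(true_edge, detected)  # recall-like: true pixels that are recovered
print(P, R)
```

Here the false alarm at (0, 0) lowers the precision-like rate, while the missed endpoint lowers the recall-like rate; the tolerance lets near-misses still count as matches.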

## 4 Roof damage detection

### 4.1 Damage detection on the interiors of the shingles

Nowadays, standard three-tab asphalt shingles are the most commonly used roofing material in the United States, because they are economical and easy to install. However, such roofing materials are not as durable and long lasting as some others, such as metal, slate, or clay tile. Hence, they are the most vulnerable materials when facing hailstorms. In addition, the surfaces of asphalt shingle roofs are typically rough, which makes edge detection of hail damage more challenging. In all the analyses, we fix the prior parameters and standardize the smoothed images by their sample standard deviations.

To avoid the possible intensity distortion around the shingle joints, we first apply the proposed Bayesian detector to the edge detection on the interiors of asphalt shingles, and the comparison with the Canny detector is shown in Figure 8. It can be seen that the Bayesian detector is able to reduce the number of the falsely discovered edge pixels. We also apply the Bayesian detector to the slate roofs, as shown in Figure 9. Since the slate roofs are generally smoother than the asphalt shingle roofs, both the Canny detector and the Bayesian detector produce satisfactory results. With a closer look at the resulting edges, however, we see that the Bayesian detector still selects a smaller number of non-edge points.

*(Figure 8: asphalt shingle images with the corresponding Bayesian and Canny detector outputs.)*

*(Figure 9: slate roof images with the corresponding Bayesian and Canny detector outputs.)*

In conclusion, for the asphalt shingle roofs, the Bayesian detector performs significantly better in controlling the false selection of non-edge points. In addition, the noisier the images, the larger the prior parameters required to reduce the false discovery rate. Furthermore, for smoother materials, such as the slate roofs, both the Bayesian detector and the Canny detector perform satisfactorily on the edge detection.

### 4.2 An integrated pipeline

Having demonstrated the superior performance of the Bayesian detector, we apply it to more complicated settings, where the images contain complete shingles, with and without damage, as well as the shingle joints. We process the roof images through a complete pipeline: edge detection, edge closing (removing small holes in the edges), connected components labeling, convex hull fitting, and damage extraction. Edge detection is the very first step in the process, and its quality affects the outputs of all subsequent procedures.

In Figure 10, we show the edge detection and edge closing results from both the Bayesian and Canny detectors on a sampled image. The second row in Figure 10 shows that the Bayesian detector suppresses most non-edge points and provides precise edge regions. This leads to the better edge closing results shown in the third row, which in turn provide clear separations between the shingles.
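One step of the pipeline, connected components labeling, can be sketched with a plain breadth-first search; real pipelines might instead use a routine such as `scipy.ndimage.label`, and the toy mask below is ours:

```python
import numpy as np
from collections import deque

def label_components(mask):
    """4-connected component labeling by BFS; returns a label image and
    the number of components found."""
    H, W = mask.shape
    labels = np.zeros((H, W), dtype=int)
    cur = 0
    for si in range(H):
        for sj in range(W):
            if mask[si, sj] and labels[si, sj] == 0:
                cur += 1                        # start a new component
                labels[si, sj] = cur
                q = deque([(si, sj)])
                while q:
                    i, j = q.popleft()
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W and mask[ni, nj] and labels[ni, nj] == 0:
                            labels[ni, nj] = cur
                            q.append((ni, nj))
    return labels, cur

# Two separated "shingle" regions after edge closing.
mask = np.zeros((8, 8), dtype=bool)
mask[1:3, 1:3] = True
mask[5:7, 4:7] = True
labels, n = label_components(mask)
print(n)  # -> 2
```

A clean edge closing step matters precisely because any leak between regions would merge their labels into a single component here.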

*(Figure 10: a sample image with edge detection and edge closing results from the Bayesian and Canny detectors.)*

A better separation yields more precise labeling of the connected components, and finally leads to a more accurate damage detection. This effect can be observed in Figure 11, where we show the extracted damage areas overlaid on the original images for two examples. Following these results, the insurance companies decide to fix or replace the roof shingles based on the severity of the damage on their interiors. The judgement of severity varies across insurance companies. In consultation with our collaborating roofing company, we provide some stylized examples based on our damage detection results. As suggested, a shingle must be replaced if more than 1/4 of the shingle is damaged. In the upper panel of Figure 11, both the Bayesian and Canny detectors suggest fixing the middle and bottom-middle shingles, while for the bottom-left shingle the Bayesian detector suggests fixing but the Canny detector suggests replacement. Furthermore, in the bottom panel, the Bayesian detector suggests fixing the upper-left and bottom-right shingles, keeping the right shingle, and replacing the others, while the Canny detector suggests replacing all the shingles besides the upper-left one. Compared with our observations on the real images, the conclusions from the Bayesian detector are much closer to the human decisions.

*(Figure 11: extracted damage areas from the Bayesian and Canny detectors overlaid on the original images.)*

## 5 Discussion

Edge detection is a classical problem in the computer vision area, which is dominated by convolution-based methods. We view the problem from a statistical perspective and introduce Bayesian tools to tackle the challenging problems in the field. The proposed Bayesian detector successfully identifies the edge points while inducing smaller probabilities of false discovery compared to the standard Canny detector. In addition, we introduce a general framework for constructing non-local priors which generalizes the moment and inverse moment priors in [22]. The reflected priors achieve the same asymptotic properties as the moment prior when $\nu > 0$ and as the inverse moment prior when $\nu < 0$.

The goal of the paper is to develop a robust method for extracting damages from drone images. As a by-product, we introduce a new class of priors and discuss their applications in the image edge detection problem. Therefore, we do not discuss the optimal selection of the tuning parameters, including $\nu$ and $\tau$ in (4) and the number of the edge pixels. In fact, the parameter tuning process is a difficult task in the current situation, because there are not enough labeled training samples, and hence we are not able to verify the selection. However, in our simulation and real data analysis, the performance of the Bayesian detector is not sensitive to changes in the tuning parameters. Although tuning is critical in statistical learning procedures, we leave the selection of the tuning parameters to future work when sufficient samples are available for training.

## References

- [1] Ramesh Jain, Rangachar Kasturi, and Brian G Schunck, Machine vision, vol. 5, McGraw-Hill New York, 1995.
- [2] James Matthews, “An introduction to edge detection: The sobel edge detector,” 2002.
- [3] Lawrence G Roberts, Machine perception of three-dimensional solids, Ph.D. thesis, Massachusetts Institute of Technology, 1963.
- [4] J.M.S. Prewitt, “Object enhancement and extraction in picture processing psychopictorics,” in Picture Processing and Psychopictorics, B.S. Lipkin et al., Eds. Academic Press, New York, 1970.
- [5] Vincent Torre and Tomaso A Poggio, “On edge detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 2, pp. 147–163, 1986.
- [6] John Canny, “A computational approach to edge detection,” IEEE Transactions on pattern analysis and machine intelligence, vol. 8, no. 6, pp. 679–698, 1986.
- [7] Saket Bhardwaj and Ajay Mittal, “A survey on various edge detector techniques,” Procedia Technology, vol. 4, pp. 220–226, 2012.
- [8] Fulvio De Santis and Fulvio Spezzaferri, “Consistent fractional bayes factor for nested normal linear models,” Journal of statistical planning and inference, vol. 97, no. 2, pp. 305–321, 2001.
- [9] Juan Antonio Cano, Mathieu Kessler, and Elías Moreno, “On intrinsic priors for nonnested models,” Test, vol. 13, no. 2, pp. 445–463, 2004.
- [10] Stephen G Walker, “Modern bayesian asymptotics,” Statistical Science, vol. 19, no. 1, pp. 111–117, 2004.
- [11] Elías Moreno, “Objective bayesian methods for one-sided testing,” Test, vol. 14, no. 1, pp. 181–198, 2005.
- [12] George Casella and Elías Moreno, “Objective bayesian variable selection,” Journal of the American Statistical Association, vol. 101, no. 473, pp. 157–167, 2012.
- [13] Anthony O’Hagan, “Fractional bayes factors for model comparison,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 57, pp. 99–138, 1995.
- [14] Anthony O’Hagan, “Properties of intrinsic and fractional bayes factors,” Test, vol. 6, no. 1, pp. 101–118, 1997.
- [15] Caterina Conigliani and Anthony O’Hagan, “Sensitivity of the fractional bayes factor to prior distributions,” Canadian Journal of Statistics, vol. 28, no. 2, pp. 343–352, 2000.
- [16] James O Berger and Luis R Pericchi, “The intrinsic bayes factor for model selection and prediction,” Journal of the American Statistical Association, vol. 91, no. 433, pp. 109–122, 1996.
- [17] James O Berger and Luis Raúl Pericchi, “Accurate and stable bayesian model selection: the median intrinsic bayes factor,” Sankhyā: The Indian Journal of Statistics, Series B, pp. 1–18, 1998.
- [18] James O Berger and Julia Mortera, “Default bayes factors for nonnested hypothesis testing,” Journal of the American Statistical Association, vol. 94, no. 446, pp. 542–554, 1999.
- [19] José M Pérez and James O Berger, “Expected-posterior prior distributions for model selection,” Biometrika, vol. 89, no. 3, pp. 491–512, 2002.
- [20] I Verdinelli and L Wasserman, “Bayes factors, nuisance parameters and imprecise tests,” Bayesian Statistics, vol. 5, pp. 765–771, 1996.
- [21] Judith Rousseau, “Approximating interval hypothesis: p-values and bayes factors,” Bayesian statistics, vol. 8, pp. 417–452, 2007.
- [22] Valen E Johnson and David Rossell, “On the use of non-local prior densities in bayesian hypothesis tests,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 72, no. 2, pp. 143–170, 2010.
- [23] Valen E Johnson and David Rossell, “Bayesian model selection in high-dimensional settings,” Journal of the American Statistical Association, vol. 107, no. 498, pp. 649–660, 2012.
- [24] David Rossell and Donatello Telesca, “Nonlocal priors for high-dimensional estimation,” Journal of the American Statistical Association, vol. 112, no. 517, pp. 254–265, 2017.
- [25] Minsuk Shin, Anirban Bhattacharya, and Valen E Johnson, “Scalable bayesian variable selection using nonlocal prior densities in ultrahigh-dimensional settings,” arXiv:1507.07106, 2015.
- [26] AM Walker, “On the asymptotic behaviour of posterior distributions,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 31, no. 1, pp. 80–88, 1969.