Robust Regression For Image Binarization Under Heavy Noises and Nonuniform Background

09/26/2016 ∙ by Garret Vo, et al. ∙ Florida State University

This paper presents a robust regression approach for image binarization under significant background variations and observation noises. The work is motivated by the need to identify foreground regions in noisy microscopic images or degraded document images, where significant background variation and severe noise make image binarization challenging. The proposed method first estimates the background of an input image, subtracts the estimated background from the input image, and applies a global thresholding to the subtracted outcome to obtain a binary image of foregrounds. A robust regression approach is proposed to estimate the background intensity surface with minimal effects of foreground intensities and noises, and a global threshold selector is proposed on the basis of a model selection criterion in a sparse regression. The proposed approach was validated using 26 test images and the corresponding ground truths, and the outcomes of the proposed work were compared with those from nine existing image binarization methods. The approach was also combined with three state-of-the-art morphological segmentation methods to show how the proposed approach can improve their image segmentation outcomes.




1 Introduction

Image binarization is a problem of estimating the binary silhouette of foreground objects in a noisy image. A good image binarization solution has been useful in many different contexts such as optical character recognition lu2010document , image segmentation using binarization otsu1979threshold , and preliminary image segmentation prior to separating overlaps of foregrounds park2013segmentation . In particular, many existing morphological image segmentation algorithms park2013segmentation ; schmitt2009morphological ; zafari2015segmentation require the binary silhouette as inputs to find markers to individual foreground objects. The quality of their image segmentation is significantly affected by the accuracy of the binary silhouette input, so achieving a good binary silhouette estimate is essential. However, image binarization can be challenging under real-life situations such as uneven background and noises. This paper is concerned with an image binarization problem under uneven background and high image noises.

Our first motivating example is to detect nanoparticles in heavily noisy microscope images. A typical electron microscopic image has nonuniform background intensities due to the spatial variation of electron beam radiations (e.g. Fig. 1-(a)) or due to the transitions between different background materials (e.g. Fig. 1-(b)). In addition, microscope images typically contain significant noises. A simple and popular approach for detecting nanoparticles is morphological segmentation park2013segmentation ; schmitt2009morphological ; zafari2015segmentation ; malpica1997applying ; tek2005blood ; adiga2001efficient . The approach applies a series of morphological image operations to roughly locate or mark overlapping foreground objects in an input image, which guides the subsequent image segmentation based on watershed or other sophisticated methods. The first marking step either takes the binary silhouette of foreground regions as an input or generates the binary silhouette internally with a built-in image thresholding unit, which does not work well under a varying image background and a low signal-to-noise ratio. With a good method for estimating the binary silhouette, the accuracy of many morphological image segmentation methods can be greatly improved.

The second motivating example is to detect typed texts or handwritten texts in document images. The image binarization is an important preliminary stage for document image analysis. Typical text document images have nonuniform background due to degradation of text documents as shown in Fig. 1-(c) and -(d). An effective solution of an image binarization problem under nonuniform background can be very useful to extract texts from document images.

(a) Microscope image with uneven illumination and medium noises
(b) Microscope image with background transition and mild noises
(c) OCR document image with uneven illumination and mild noises
(d) Handwritten text document image with document degradation
Figure 1: Example Images with Nonuniform Background

For binarizing an image with varying background and significant noise, we propose a two-stage approach. The first stage is to estimate the varying background of an input image, which is then subtracted from the input image. With successful background estimation, the subtraction result would have a flat background. The second stage is to select a global threshold that converts the subtraction result into a binary image. We propose robust methods to estimate the background and the global threshold.

The remainder of this paper is organized as follows. Section 2 reviews the related research and discusses the contribution of the proposed approach. Section 3 presents our approach. Section 4 presents the numerical performance of the proposed approach using 26 test images, with comparison to nine state-of-the-art methods. Section 5 concludes this paper with discussion.

2 Related Research

The image thresholding or image binarization problem has been extensively studied in computer vision. The simplest approach selects and applies one otsu1979threshold or two thresholds lai2014efficient for an entire input image to classify individual pixels into foreground and background pixels. This is called the global thresholding approach. Depending on how threshold(s) are selected, the approach can be categorized into histogram shape-based, clustering-based, entropy-based, object attribute-based and spatial methods; several comprehensive reviews can be found in the literature sezgin2004survey ; stathis2008evaluation . The approach does not work well when an input image has background variations due to illumination effects or image degradation.

For images with nonuniform background, a local adaptive thresholding is more suitable. A popular approach for the local thresholding selects a threshold for each pixel, based on the local statistics within the neighborhood window centered at the pixel. Niblack niblack1985introduction used the local mean $m$ and the local standard deviation $s$ to choose the local threshold $m + k s$, where $k$ is a user-defined constant. A weakness of this local approach is that if foreground objects are sparse in an input image, a lot of background noise remains in the resulting binary image gatos2006adaptive . To overcome this weakness, Sauvola sauvola2000adaptive modified the threshold to $m\,[1 + k(s/R - 1)]$, where $R$ is the dynamic range of the standard deviation. Phansalkar phansalkar2011adaptive further modified it to deal with low contrast images.
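As a rough illustration of the two local threshold rules above, the following Python sketch evaluates the Niblack rule $m + ks$ and the Sauvola rule $m[1 + k(s/R - 1)]$ on the intensities of a single neighborhood window. The function names, the window contents and the default values of `k` and `R` are illustrative assumptions, not settings taken from the cited works.

```python
from statistics import mean, pstdev


def niblack_threshold(window, k=-0.2):
    """Niblack-style local threshold m + k*s for one window (k is an assumed default)."""
    m = mean(window)
    s = pstdev(window)  # population standard deviation of the window
    return m + k * s


def sauvola_threshold(window, k=0.5, R=128.0):
    """Sauvola-style local threshold m*(1 + k*(s/R - 1)) (k and R are assumed defaults)."""
    m = mean(window)
    s = pstdev(window)
    return m * (1.0 + k * (s / R - 1.0))


# A flat, foreground-free window (s = 0) and a window straddling dark text.
flat = [200, 200, 200, 200]
mixed = [40, 40, 200, 200]

flat_niblack = niblack_threshold(flat)    # equals the local mean when s = 0
flat_sauvola = sauvola_threshold(flat)    # drops to m*(1 - k) when s = 0
mixed_niblack = niblack_threshold(mixed)
mixed_sauvola = sauvola_threshold(mixed)
```

With dark foregrounds, a pixel is labeled foreground when its intensity falls at or below the threshold. In the flat window the Niblack threshold sits at the local mean, so background noise around the mean is easily labeled as foreground, whereas the Sauvola threshold falls well below the mean and leaves the window as background.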

Contrast threshold is also a popular local thresholding approach, which estimates and uses local image contrasts. The local image contrast of a pixel is often defined as the range of the intensities of the pixel’s neighborhood. If the contrast is high, the pixel is likely to belong to a transition area in between foregrounds and backgrounds, and it belongs to foregrounds or backgrounds otherwise. Therefore, local image contrasts are often thresholded, and low contrast regions are classified into foregrounds or backgrounds depending on some local statistics bernse1986dynamic . Su su2010binarization first computed a local contrast image using the local minimum and maximum within a local neighborhood window and then thresholded the contrast image to identify high contrast pixels. High contrast pixels were regarded as boundary pixels of foregrounds, and their intensities were used to determine local thresholds. Su su2013robust combined an image contrast map and an edge detector to detect high contrast edge pixels, and the mean and standard deviation of the edge pixel intensities were used to define local thresholds.

Background subtraction is another approach, which first estimates an image background intensity surface and then applies a global thresholding on the outcome of subtracting the background intensity surface from an input image. Gatos gatos2006adaptive roughly split an input image into background and foreground pixels, and the background pixel intensities were interpolated to define a background intensity surface and the corresponding local threshold policy. The major weakness of this approach is that it is difficult to achieve a good initial separation of the background region when an input image has uneven background and heavy noises. Polynomial smoothing mieloch2005dynamic is a background subtraction approach that does not require any rough estimates of foreground and background regions. Lu lu2010document applied the polynomial smoothing to directly estimate the background surface without any rough estimates of foreground and background regions, which was combined with an edge detection algorithm to determine local thresholds.

Edge-level thresholding has also been widely used for local adaptive thresholding. Parker parker1991 ; parker1993thresholding first located the edge pixels of foregrounds using an existing edge detector and smoothly interpolated the edge pixels to build spatially varying thresholds. As a similar approach, Chen chen2008double used the Canny edge detector to extract edge pixels, and a region growing was applied with the edge pixels as seeds. Ramirez-Ortegon ramirez2010transition generalized the concept of an edge pixel to that of a transition pixel. Transition pixels are first identified, and the dark regions surrounded by the transition pixels are identified as foregrounds.

Among other notable approaches, there is an approach based on the Markov random field modeling of an input image, which regards the target binary image as a binary Markov random field that minimizes a cost function. Howe Howe2013 combined the Laplacian of an input image and edge detection results to define the cost function. This approach became the basis for the first-place winner of the 2016 handwritten document binarization contest pratikakis2016icfhr2016 .

The aforementioned local adaptive thresholding approaches have been successfully applied for many applications and validated using many benchmark datasets such as multiple DIBCO datasets pratikakis2016icfhr2016 . Among the existing approaches, we follow and advance the background subtraction approach that first estimates the background of an input image and then applies a global thresholding on the outcome of subtracting the background estimate from the input image. The main contribution of this paper is to advance the background subtraction approach with a novel robust background estimation method and a global threshold selection approach. More details are summarized as follows:

  • The major contribution is to develop a robust background intensity surface estimation method. The existing approaches estimate the background intensity surface by interpolating the intensities of background pixels gatos2006adaptive or edge pixels parker1993thresholding , so a prior identification of background pixels or edge pixels is required. However, the prior identification is challenging under uneven background and heavy noises. The global polynomial smoothing lu2010document is a method that does not require any prior identification of edges or backgrounds, but it still requires edge detections for the subsequent foreground detection. In addition, we found that the method generates some image artifacts on its background estimate, mainly because the smoothing was individually applied to each row and column of an input image. Our new approach provides a more robust option for background estimation. It borrows a robust regression concept to formulate and solve the background estimation, which is less affected by other outlying image features such as foreground pixels and noise pixels. In addition, the proposed approach does not require prior identification of edges or backgrounds.

  • For the second thresholding step of our approach, we propose a global threshold selector distinct from many existing binarization approaches. Following the signal processing literature donoho1994threshold , we formulate an image thresholding problem as a sparse regression problem to recover the true signal from a noisy signal, where the choice of a threshold is related to the determination of the signal sparsity parameter. We use a model selection criterion for selecting the sparsity parameter and thus the corresponding threshold. This approach is distinct from the existing binarization approaches that use the histogram of image intensities or statistics of edge pixels.

  • The practical value of the proposed approach is (a) to provide a robust option for image binarization for multiple contexts including document image binarization and microscopy analysis and (b) to improve the accuracy of the existing morphological image segmentation methods that require an accurate binary silhouette of foregrounds as inputs. We validated these points with 26 benchmark images, comparing to nine existing methods.

3 Method

Let $Y_{ij}$ denote the $(i,j)$th pixel intensity of an input image of size $m \times n$, where foregrounds look darker than backgrounds. A local adaptive thresholding finds a local threshold $T_{ij}$ for the $(i,j)$th pixel to threshold the input image into a binary image,

$$B_{ij} = \begin{cases} 1 & \text{if } Y_{ij} \le T_{ij}, \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$

All image pixels $(i,j)$ where $B_{ij} = 1$ are foreground pixels, and the other pixels are background pixels. The background subtraction approach for a local thresholding applies the following threshold gatos2006adaptive ,

$$T_{ij} = L_{ij} - \tau,$$

where $L_{ij}$ represents the background intensity at the $(i,j)$th pixel, and $\tau$ is a global threshold. When $L_{ij}$ is known, the original thresholding (1) is equivalent to applying the following global thresholding to the difference between the background $L_{ij}$ and the input $Y_{ij}$,

$$B_{ij} = \begin{cases} 1 & \text{if } L_{ij} - Y_{ij} \ge \tau, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$

The methods of estimating $L_{ij}$ and $\tau$ determine the performance of the background subtraction approach. We propose novel approaches to estimate them. Section 3.1 describes the estimation of $L_{ij}$, and Section 3.2 describes the estimation of $\tau$.
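The equivalence between a spatially varying threshold and a global threshold on the background-subtracted image can be checked on a toy example. The sketch below assumes the local threshold is the background intensity minus a global constant and that foregrounds are darker than the background; the function names and the numbers are made up for illustration.

```python
def binarize_local(Y, T):
    """Label a pixel foreground (1) when its intensity is at or below its local threshold."""
    return [[1 if y <= t else 0 for y, t in zip(y_row, t_row)]
            for y_row, t_row in zip(Y, T)]


def binarize_subtracted(Y, L, tau):
    """Label a pixel foreground (1) when background minus input is at least tau."""
    return [[1 if l - y >= tau else 0 for y, l in zip(y_row, l_row)]
            for y_row, l_row in zip(Y, L)]


# Toy image: the background ramps from 100 to 130, with two dark foreground pixels.
L = [[100, 110, 120, 130],
     [100, 110, 120, 130]]
Y = [[100, 60, 121, 129],
     [99, 111, 70, 131]]
tau = 20

T = [[l - tau for l in row] for row in L]   # local threshold = background - tau
by_local = binarize_local(Y, T)
by_subtraction = binarize_subtracted(Y, L, tau)
```

Both calls mark the same two pixels as foreground, which is the reason the rest of the method can focus on estimating the background surface and a single global threshold.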

3.1 Robust regression for estimating $L_{ij}$

The input image $Y_{ij}$ is a mixture of background $L_{ij}$, foreground $F_{ij}$ and noise $E_{ij}$,

$$Y_{ij} = L_{ij} + F_{ij} + E_{ij}, \tag{3}$$

so estimating $L_{ij}$ hidden under foregrounds and noises is not straightforward. One possible approach is to roughly estimate the pixel locations where $F_{ij} = 0$ and interpolate the intensities of those pixels to estimate $L_{ij}$, like the existing approaches lu2010document ; gatos2006adaptive . However, finding the $(i,j)$'s where $F_{ij} = 0$ is as difficult as solving the original binarization problem (1). Another possibility is to apply a smooth regression that interpolates the $Y_{ij}$'s. In many applications, the background intensity surface changes smoothly over $(i,j)$, while $F_{ij}$ or $E_{ij}$ adds intensity jumps on the smooth background. Under the circumstances, the smooth background can possibly be estimated by fitting a regression model to the $Y_{ij}$'s with a square loss and a smoothness penalty,

$$\min_{L} \; \sum_{i=1}^{m} \sum_{j=1}^{n} (Y_{ij} - L_{ij})^2 + \lambda \sum_{i=1}^{m} \sum_{j=1}^{n} \| \nabla^2 L_{ij} \|_F^2, \tag{4}$$

where $\nabla^2 L_{ij}$ is the 2nd order derivative of $L$ at $(i,j)$, and $\lambda$ is a tuning parameter that determines the degree of smoothness penalty. The smoothing turned out to be insufficient in our numerical experiment, where the estimated $L_{ij}$ was still significantly affected by $F_{ij}$. Simply increasing the smoothness penalty did not solve this issue. To be more specific, we looked at the optimal solution of (4) for an example electron microscope image, while varying $\lambda$ from 1 to 1000000. Figure 2 shows one row of the ground truth and the same row of the solution of (4) with different $\lambda$'s. As shown in Fig. 2-(a), the row contains a mild slope in the background (red line), while foreground objects make significant intensity ditches on the slope. Applying the smoothing spline (4) with a small $\lambda$ leads to an overfit to deep foreground ditches as shown in Fig. 2-(b) and -(c). Increasing $\lambda$ incurs a huge bias from the true background slope; see Fig. 2-(d). To reduce the effect of the foreground-caused intensity ditches on a background estimate, we borrow the concept of a robust regression in statistics rousseeuw2005robust . In Section 3.1.1, we describe how we formulate the background estimation problem, and the solution approach is described in Section 3.1.2.

Figure 2: Smoothing spline regression with square loss and different degrees of smoothness penalty $\lambda$.

3.1.1 Formulation

The basic idea of our formulation is as follows. We regard foreground-caused intensity ditches as outliers deviating from a smooth background intensity surface. Estimating a smooth background intensity surface is formulated as a robust regression problem that fits a regression model to an input image with minimal effect of the outliers. In the statistical literature rousseeuw2005robust , the square loss criterion for fitting is known to be sensitive to outliers. This is because the square loss increases quickly as the absolute error increases, so a regression fitting procedure that minimizes the square loss is prone to overfitting to outliers to avoid a huge surge of the square loss. In the robust statistics literature rousseeuw2005robust , more robust loss functions were proposed in the form of the weighted square loss,

$$\sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} (Y_{ij} - L_{ij})^2,$$

where the weighting factor $w_{ij}$ is defined to place lower weights on the $(i,j)$'s where outliers are located; smaller weights on outlier regions make the outcome of the regression less affected by outlying features. A popular choice for the weight factor is the Huber loss weight rousseeuw2005robust ,

$$w_{ij} = \begin{cases} 1 & \text{if } |Y_{ij} - L_{ij}| \le \delta, \\ \delta / |Y_{ij} - L_{ij}| & \text{otherwise,} \end{cases}$$

where $\delta > 0$ is a cutoff constant; $\delta = 1.345\sigma$, with $\sigma$ the noise standard deviation, is a popular choice. The Huber loss places lower weights for higher absolute difference $|Y_{ij} - L_{ij}|$, so the effect of extreme outliers on the regression estimate can be mitigated. We use the Huber loss to formulate a robust regression for a background surface intensity,

$$\min_{L} \; \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} (Y_{ij} - L_{ij})^2 + \lambda \sum_{i=1}^{m} \sum_{j=1}^{n} \| \nabla^2 L_{ij} \|_F^2. \tag{5}$$

In the next section, we describe our modeling choice of $L$ and our solution approach for problem (5).
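To make the effect of the Huber weights concrete, the following sketch estimates a constant background level from one image row by iteratively reweighted least squares. It is a deliberately reduced setting: the background is assumed flat, there is no smoothness penalty, and the function names, the cutoff `delta` and the data are illustrative assumptions.

```python
def huber_weight(residual, delta):
    """Huber weight: 1 inside [-delta, delta], delta/|residual| outside."""
    a = abs(residual)
    return 1.0 if a <= delta else delta / a


def robust_level(values, delta, n_iter=50):
    """Estimate a constant level by iteratively reweighted least squares."""
    level = sum(values) / len(values)  # start from the ordinary mean
    for _ in range(n_iter):
        weights = [huber_weight(v - level, delta) for v in values]
        # Weighted least-squares update of the level with the weights held fixed.
        level = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return level


# Background near 100 with two deep foreground "ditches" (20 and 25).
row = [100, 101, 99, 100, 102, 98, 20, 25, 100, 101]
plain_mean = sum(row) / len(row)
robust = robust_level(row, delta=2.0)
```

The ordinary mean is dragged far below the background by the two dark pixels, while the reweighted estimate stays close to 100 because the ditches receive weights near zero.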

3.1.2 Solution Approach: Boosting Regression

We model the background intensity surface as an additive model of products of two one-dimensional functions,

$$L_{ij} = \sum_{k=1}^{K} f_k(i) \, g_k(j), \tag{6}$$

where $f_k$ is a one-dimensional function of row index $i$, and $g_k$ is a one-dimensional function of column index $j$. This model has far fewer degrees of freedom than the full matrix of $L_{ij}$: the degrees of freedom of the additive model are at most $K(m+n)$ when a degree of freedom is placed at every row for $f_k$ and at every column for $g_k$, while the size of $L$ is $mn$. Depending on the choice of $K$, it can model a simple background or a very complex background. For the time being, we assume $K$ is fixed, and we will later explain how $K$ can be chosen.

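We solve problem (5) with $L$ in the additive form of (6). An additive form of a regression function can be naturally fit by the boosting regression (friedman2001elements, , Chapter 10). Based on the boosting regression procedure, we devise Algorithm 1. It starts with $L^{(0)} = 0$ and sequentially expands $L$ with new product functions through $K$ stages. Let $L^{(k)}$ denote the result of the boosting regression after the $k$th stage. At the $k$th stage, a new product function $f_k g_k$ is added to $L^{(k-1)}$ such that $L^{(k-1)}$ plus the added term minimizes the objective function of (5) without changing the other terms added in the previous stages,

$$\min_{f_k, g_k} \; \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} \left( Y_{ij} - L^{(k-1)}_{ij} - f_k(i) g_k(j) \right)^2 + \lambda \sum_{i=1}^{m} \sum_{j=1}^{n} \| \nabla^2 (f_k g_k)_{ij} \|_F^2,$$

where $\nabla^2 (f_k g_k)_{ij}$ is the 2nd order derivative of $f_k g_k$ at $(i,j)$. Define $R^{(k-1)}_{ij} = Y_{ij} - L^{(k-1)}_{ij}$ as the residual of fit after the $(k-1)$th stage. The $k$th stage is basically equivalent to fitting $f_k g_k$ to the residual,

$$\min_{f_k, g_k} \; \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} \left( R^{(k-1)}_{ij} - f_k(i) g_k(j) \right)^2 + \lambda \sum_{i=1}^{m} \sum_{j=1}^{n} \| \nabla^2 (f_k g_k)_{ij} \|_F^2. \tag{7}$$

Once $f_k$ and $g_k$ are determined, the update $L^{(k)} = L^{(k-1)} + f_k g_k$ is performed as the last step of the $k$th stage. The number of stages performed determines the number of product function terms in the additive model. We use the following stopping criterion to determine when we stop the sequential addition,

$$\| f_k \|_2 \, \| g_k \|_2 \le \epsilon,$$

where $\| f_k \|_2$ is the L2-norm of the function $f_k$, approximated by the norm of its discrete version $(f_k(1), \ldots, f_k(m))$, and similarly for $\| g_k \|_2$; $\epsilon$ is a small tolerance. The stopping criterion practically implies that adding additional terms to the additive model is unnecessary when the last term added is negligible. The overall algorithm is described in Algorithm 1.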

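Require:  input image $Y$
Ensure:  background intensity surface $L$
1:  Initialization: residual $R^{(0)} = Y$ and initial background intensity surface $L^{(0)} = 0$
2:  for $k = 1$ to $K$ do
3:     Fit $f_k g_k$ to $R^{(k-1)}$, based on optimization (5).      [This optimization is solved by Algorithm 2.]
4:     Update $L^{(k)} = L^{(k-1)} + f_k g_k$.
5:     Update $R^{(k)} = Y - L^{(k)}$.
6:     If $\| f_k \|_2 \| g_k \|_2 \le \epsilon$, stop and otherwise continue.
7:  end for
Algorithm 1 Boosting Regression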
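The stagewise structure of Algorithm 1 can be sketched in a few lines of Python. To keep the example short, the inner fit here is a plain alternating least-squares fit of one product term, with no Huber weights and no smoothness penalty, so it is a stand-in for the robust fit and not the method itself. A negligible term is also discarded before being added, a small departure from the order of steps in Algorithm 1. All names, `max_terms` and `eps` are illustrative assumptions.

```python
def rank1_fit(R, n_iter=100):
    """Least-squares fit of f[i]*g[j] to R by alternating updates (simplified inner step)."""
    m, n = len(R), len(R[0])
    f = [0.0] * m
    g = [1.0] * n
    for _ in range(n_iter):
        gg = sum(v * v for v in g)
        if gg == 0.0:
            break
        f = [sum(R[i][j] * g[j] for j in range(n)) / gg for i in range(m)]
        ff = sum(v * v for v in f)
        if ff == 0.0:
            break
        g = [sum(R[i][j] * f[i] for i in range(m)) / ff for j in range(n)]
    return f, g


def boost_background(Y, max_terms=5, eps=1e-6):
    """Add product terms f_k g_k one stage at a time until the new term is negligible."""
    m, n = len(Y), len(Y[0])
    L = [[0.0] * n for _ in range(m)]
    R = [list(row) for row in Y]  # the residual starts at the input image
    terms = 0
    for _ in range(max_terms):
        f, g = rank1_fit(R)
        size = sum(v * v for v in f) ** 0.5 * sum(v * v for v in g) ** 0.5
        if size <= eps:  # the last term is negligible: stop adding terms
            break
        for i in range(m):
            for j in range(n):
                L[i][j] += f[i] * g[j]
                R[i][j] -= f[i] * g[j]
        terms += 1
    return L, terms


# A surface that is exactly one product term should need a single stage.
f0 = [1, 2, 3]
g0 = [10, 20, 30, 40]
Y = [[a * b for b in g0] for a in f0]
L, terms = boost_background(Y)
```

The number of stages actually used plays the role of the number of product terms in the additive model, which is how the stopping rule selects the model complexity.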

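The remainder of this section is focused on detailing how to solve each stage formulated as problem (7), i.e., line 3 of Algorithm 1. For brevity, we drop the stage index and write $f$, $g$ and $R_{ij}$ for $f_k$, $g_k$ and $R^{(k-1)}_{ij}$. We first model $f$ as a smooth function that interpolates discrete points $\{(i, f_i); i = 1, \ldots, m\}$, where the term ‘smooth function’ implies that the second order derivative of a function has a bounded magnitude. Therefore, $f(i) = f_i$, and the first and second order derivatives of $f$ are approximated by the central difference approximation of the first and second derivatives of $f$,

$$f'_i = \frac{f_{i+1} - f_{i-1}}{2}, \qquad f''_i = f_{i+1} - 2 f_i + f_{i-1}.$$

Similarly, we model $g$ as a smooth function that interpolates discrete points $\{(j, g_j); j = 1, \ldots, n\}$, so

$$g(j) = g_j, \qquad g'_j = \frac{g_{j+1} - g_{j-1}}{2}, \qquad g''_j = g_{j+1} - 2 g_j + g_{j-1}.$$

With the modeling of $f$ and $g$, we can restate the robust loss in problem (7) as follows:

$$\sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} (R_{ij} - f_i g_j)^2, \tag{8}$$

and the smoothness term in the objective function of (7) is the squared Frobenius norm of the Hessian matrix of $f g$,

$$\sum_{i=1}^{m} \sum_{j=1}^{n} \left\{ (f''_i g_j)^2 + 2 (f'_i g'_j)^2 + (f_i g''_j)^2 \right\}. \tag{9}$$

Combining (8) and (9), we can rewrite the objective function of problem (7) as

$$J(f, g) = \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} (R_{ij} - f_i g_j)^2 + \lambda \sum_{i=1}^{m} \sum_{j=1}^{n} \left\{ (f''_i g_j)^2 + 2 (f'_i g'_j)^2 + (f_i g''_j)^2 \right\}. \tag{10}$$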
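Let us simplify the expression using some vector and matrix notations. Let $f = (f_1, \ldots, f_m)^T$ and $g = (g_1, \ldots, g_n)^T$. Let $R$ denote the $m \times n$ matrix with $R_{ij}$ as its $(i,j)$th element, and let $\tilde{W}$ denote the $m \times n$ matrix with $\sqrt{w_{ij}}$ as its $(i,j)$th element. We also introduce a symmetric $m \times m$ matrix $P_1$ to represent the quadratic term $\sum_i (f'_i)^2$ as $f^T P_1 f$, and introduce another symmetric matrix $P_2$ to represent the quadratic term $\sum_i (f''_i)^2$ as $f^T P_2 f$. Similarly, we introduce two symmetric $n \times n$ matrices, $Q_1$ and $Q_2$, for representing the quadratic terms $\sum_j (g'_j)^2 = g^T Q_1 g$ and $\sum_j (g''_j)^2 = g^T Q_2 g$ respectively. The vectorial form of (10) is

$$J(f, g) = \| \tilde{W} \circ (R - f g^T) \|_F^2 + \lambda \left\{ (f^T P_2 f)(g^T g) + 2 (f^T P_1 f)(g^T Q_1 g) + (f^T f)(g^T Q_2 g) \right\},$$

where $\circ$ is the Hadamard product operator and $\| \cdot \|_F$ is the Frobenius norm.

Problem (7) that minimizes $J$ for $f$ and $g$ can be solved by the coordinate descent algorithm, which iterates two steps: (a) optimizing $f$ while fixing $g$ and (b) optimizing $g$ while fixing $f$. We first derive the optimization procedure for $f$ with fixed $g$. Please note that the Frobenius norm of an arbitrary matrix $A$ satisfies

$$\| A \|_F^2 = \mathrm{vec}(A)^T \mathrm{vec}(A),$$

where $\mathrm{vec}(A)$ is the vectorization of the matrix $A$. The vectorization of $\tilde{W} \circ (R - f g^T)$ is

$$\mathrm{vec}\left( \tilde{W} \circ (R - f g^T) \right) = \mathrm{diag}(\tilde{w}) \left\{ \mathrm{vec}(R) - (g \otimes I_m) f \right\},$$

where $\otimes$ is the Kronecker product, $I_m$ is an identity matrix of size $m$, $\tilde{w} = \mathrm{vec}(\tilde{W})$, and $\mathrm{diag}(\tilde{w})$ is a diagonal matrix with the elements of $\tilde{w}$ as its diagonal elements. Let $r = \mathrm{diag}(\tilde{w}) \mathrm{vec}(R)$, and $G = \mathrm{diag}(\tilde{w}) (g \otimes I_m)$. The previous expression is restated as

$$r - G f,$$

and the following Frobenius norm is

$$\| \tilde{W} \circ (R - f g^T) \|_F^2 = (r - G f)^T (r - G f).$$

The partial derivative of $J$ with respect to $f$ is

$$\frac{\partial J}{\partial f} = -2 G^T (r - G f) + 2 \lambda C_g f,$$

where $C_g = (g^T g) P_2 + 2 (g^T Q_1 g) P_1 + (g^T Q_2 g) I_m$. Solving $\partial J / \partial f = 0$ gives the optimal solution for $f$ when $g$ is fixed,

$$f = (G^T G + \lambda C_g)^{-1} G^T r.$$

Similarly, the update procedure for $g$ can be derived. Let $\tilde{w}' = \mathrm{vec}(\tilde{W}^T)$, $r' = \mathrm{diag}(\tilde{w}') \mathrm{vec}(R^T)$ and $H = \mathrm{diag}(\tilde{w}') (f \otimes I_n)$. Using the properties of the vectorization, $J$ can be restated for $g$ as

$$J = (r' - H g)^T (r' - H g) + \lambda g^T C_f g.$$

The partial derivative of $J$ with respect to $g$ is

$$\frac{\partial J}{\partial g} = -2 H^T (r' - H g) + 2 \lambda C_f g,$$

where $C_f = (f^T P_2 f) I_n + 2 (f^T P_1 f) Q_1 + (f^T f) Q_2$. Solving $\partial J / \partial g = 0$ gives the optimal solution for $g$ when $f$ is fixed,

$$g = (H^T H + \lambda C_f)^{-1} H^T r'.$$

Once both $f$ and $g$ are updated, the weighting factor is updated using

$$w_{ij} = \begin{cases} 1 & \text{if } |R_{ij} - f_i g_j| \le \delta, \\ \delta / |R_{ij} - f_i g_j| & \text{otherwise.} \end{cases} \tag{11}$$

Updating $f$, $g$ and the $w_{ij}$'s is repeated until convergence. The details of the algorithm are summarized in Algorithm 2.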

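Require:  residual $R$, $\lambda$, $\delta$.
Ensure:  $f$ and $g$
1:  Initialization: Set $w_{ij} = 1$ for all $(i,j)$. Set $f$ and $g$ with the first left and right singular vectors of $R$.
2:  while not converged do
3:     Update $\tilde{W}$ using the formula (11).
4:     Update $r$ and $G$.
5:     Compute $C_g$ and $f = (G^T G + \lambda C_g)^{-1} G^T r$.
6:     Update $r'$ and $H$.
7:     Compute $C_f$ and $g = (H^T H + \lambda C_f)^{-1} H^T r'$.
8:  end while
Algorithm 2 Iterative Optimization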
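The alternating structure of Algorithm 2 can be illustrated with a reduced version that drops the smoothness penalty, so that each coordinate update has a closed form and no linear system needs to be solved. This is a simplification and an assumption of the sketch, as are the function names, the starting values, the cutoff `delta` and the toy data; it only shows how the two coordinate updates and the Huber reweighting interact.

```python
def huber_weight(residual, delta):
    """Huber weight: 1 inside [-delta, delta], delta/|residual| outside."""
    a = abs(residual)
    return 1.0 if a <= delta else delta / a


def robust_rank1(R, delta, n_iter=200):
    """Fit f[i]*g[j] to R with Huber weights (no smoothness penalty in this sketch)."""
    m, n = len(R), len(R[0])
    W = [[1.0] * n for _ in range(m)]
    f = [1.0] * m
    g = [1.0] * n
    for _ in range(n_iter):
        # (a) weighted least-squares update of f with g and W fixed
        f = [sum(W[i][j] * R[i][j] * g[j] for j in range(n)) /
             sum(W[i][j] * g[j] ** 2 for j in range(n)) for i in range(m)]
        # (b) weighted least-squares update of g with f and W fixed
        g = [sum(W[i][j] * R[i][j] * f[i] for i in range(m)) /
             sum(W[i][j] * f[i] ** 2 for i in range(m)) for j in range(n)]
        # (c) refresh the Huber weights from the current residuals
        W = [[huber_weight(R[i][j] - f[i] * g[j], delta) for j in range(n)]
             for i in range(m)]
    return f, g, W


# Smooth product background with two dark foreground pixels carved into it.
f0 = [100.0, 110.0, 120.0, 130.0, 140.0]
g0 = [1.00, 1.02, 1.04, 1.06, 1.08, 1.10]
background = [[a * b for b in g0] for a in f0]
image = [row[:] for row in background]
image[1][2] -= 80.0
image[3][4] -= 80.0

f, g, W = robust_rank1(image, delta=1.0)
fit_at_ditch = f[1] * g[2]
```

At the two carved pixels the fitted surface stays near the underlying background instead of following the ditch, and the corresponding weights shrink toward zero.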

The proposed algorithm has a tuning parameter $\lambda$ that determines the smoothness of the background estimation. We run Algorithm 2 with different values of $\lambda$. Let $f_\lambda$ and $g_\lambda$ denote the outputs of the algorithm with a choice of $\lambda$. We choose $\lambda$ based on the criterion,

The values of that we considered are .

The proposed algorithm was applied to the example data used in Figure 2. It worked very well, as illustrated in Figure 3. More examples can be found in Section 4.1.

Figure 3: Illustrative Comparison. Smoothing Spline Regression vs Proposed Approach

3.2 Choosing $\tau$

This section describes how to choose a global threshold $\tau$ that binarizes the background subtracted image $L_{ij} - Y_{ij}$ as follows:

$$B_{ij} = \begin{cases} 1 & \text{if } L_{ij} - Y_{ij} \ge \tau, \\ 0 & \text{otherwise.} \end{cases}$$

Once the background is subtracted, one may consider applying a popular global threshold selector such as Otsu otsu1979threshold to $L_{ij} - Y_{ij}$. The Otsu threshold selector is based on the histogram of $L_{ij} - Y_{ij}$, and it works best when the histogram is bimodal. However, for most of our example images, the histograms looked unimodal, mainly because the background pixels far outnumber the foreground pixels. We formulate the threshold selection problem as a model selection problem of a regression parameter and use a model selection criterion to select a threshold.

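Let $X_{ij} = L_{ij} - Y_{ij}$. From the literature friedman2001elements , it is well known that the hard-thresholded image $\hat{\beta}_{ij} = X_{ij} \, 1(|X_{ij}| > \tau)$ is the optimal solution for the $\ell_0$-penalized regression problem,

$$\min_{\beta} \; \sum_{i=1}^{m} \sum_{j=1}^{n} (X_{ij} - \beta_{ij})^2 + \tau^2 \sum_{i=1}^{m} \sum_{j=1}^{n} 1(\beta_{ij} \ne 0).$$

Therefore, the threshold selection of $\tau$ can be recast as the model selection problem of the L0 penalty parameter $\tau^2$. Many model selection criteria for a regularized regression were proposed, including the Akaike information criterion (AIC), Bayesian information criterion (BIC), $C_p$ statistics and model description length (MDL) friedman2001elements . More recently, the generalized model description length (gMDL) was proposed hansen2003minimum , and it takes a mixture form that can choose between two selection criteria, AIC and BIC, depending on which is best for the data in hand. We use the gMDL criterion to select $\tau$. Let $k_\tau = \sum_{i,j} 1(|X_{ij}| > \tau)$, $RSS_\tau = \sum_{i,j} X_{ij}^2 \, 1(|X_{ij}| \le \tau)$, $S_\tau = RSS_\tau / (mn - k_\tau)$ and $F_\tau = (\sum_{i,j} X_{ij}^2 - RSS_\tau) / (k_\tau S_\tau)$, where $1(\cdot)$ is the indicator function. Based on (hansen2003minimum, , equation (16)), the gMDL for the penalized regression problem above is defined as follows: if the coefficient of determination $1 - RSS_\tau / \sum_{i,j} X_{ij}^2$ for $\tau$ is less than $k_\tau / (mn)$, $\mathrm{gMDL}(\tau)$ is

$$\frac{mn}{2} \log \left( \frac{\sum_{i,j} X_{ij}^2}{mn} \right) + \frac{1}{2} \log (mn),$$

and $\frac{mn}{2} \log S_\tau + \frac{k_\tau}{2} \log F_\tau + \log (mn)$ otherwise. We evaluated $\mathrm{gMDL}(\tau)$ for each value of $\tau$ among all unique values of the $X_{ij}$'s, and chose the one that achieves the lowest gMDL value.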
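The link between hard thresholding and an L0 penalty can be verified numerically on a handful of values. The sketch below assumes the standard form of the result, in which an entry is kept when its magnitude exceeds the square root of the penalty; the names and the numbers are illustrative. It compares the hard-thresholded vector against a brute-force search over every choice of which entries to keep.

```python
from itertools import product


def l0_objective(x, beta, lam):
    """Squared error plus lam times the number of nonzero coefficients."""
    sse = sum((xi - bi) ** 2 for xi, bi in zip(x, beta))
    return sse + lam * sum(1 for bi in beta if bi != 0)


def hard_threshold(x, tau):
    """Keep entries whose magnitude exceeds tau and zero out the rest."""
    return [xi if abs(xi) > tau else 0.0 for xi in x]


# Flattened background-subtracted values: mostly noise, two strong foreground entries.
x = [0.3, -0.5, 4.0, 0.1, 6.5, -0.2]
lam = 4.0
tau = lam ** 0.5  # a penalty lam corresponds to the threshold sqrt(lam)

beta = hard_threshold(x, tau)
thresholded_cost = l0_objective(x, beta, lam)

# Brute force: a kept entry is best set to x itself, so try every keep/drop pattern.
best_cost = min(
    l0_objective(x, [xi if keep else 0.0 for xi, keep in zip(x, mask)], lam)
    for mask in product([False, True], repeat=len(x))
)
```

Because the thresholded solution attains the brute-force minimum, scanning candidate thresholds is the same as scanning candidate penalty values, which is what allows a model selection criterion to pick the threshold.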

4 Numerical Evaluation

We evaluated our approach using three benchmark datasets that consist of 26 test images and the corresponding ground truth binary images. The first benchmark dataset is DIBCO 2011, which was provided as a part of the 2011 ICDAR document image binarization contest pratikakis2013icdar . As shown in Figure 4, the dataset consists of eight optical character recognition document images and their groundtruth binary images. The second benchmark dataset is H-DIBCO 2016, which was provided as a part of the 2016 ICFHR handwritten document image binarization contest pratikakis2016icfhr2016 , and it comes with ten scan images of handwritten documents as shown in Figure 5. The third dataset is NANOPARTICLE, which contains eight electron microscope images and corresponding ground truth binary images as shown in Figure 6. The microscope images were experimentally produced using a high-resolution transmission electron microscope by our collaborators at Pacific Northwest National Lab, and each of them contains tens to hundreds of nanoparticles over an uneven and noisy background. The ground truth binary images for the first two datasets are given as parts of the datasets, and those for the last dataset were manually generated in our lab.

We used the benchmark datasets to perform two kinds of evaluation. The first evaluation is to test the image binarization performance of our approach, with comparison to nine other local adaptive thresholding methods. The second evaluation is to show how the proposed approach improves the accuracy of morphological image segmentation methods in terms of segmenting overlapping foreground objects, because we believe that an improved binarization method can significantly improve the accuracy of morphological image segmentation methods that use the binary silhouette of foregrounds as an input. The outcomes of the first evaluation are summarized in Section 4.1, and the outcomes of the second evaluation are summarized and discussed in Section 4.2.

Figure 4: DIBCO11 Dataset, panels (a) to (h): images 1 to 8
Figure 5: H-DIBCO16 Dataset, panels (a) to (j): images 1 to 10
Figure 6: NANOPARTICLE Dataset, panels (a) to (h): images 1 to 8

4.1 Image Binarization Outcomes

We compared the binarization performance of our proposed approach with nine state-of-the-art image binarization methods including NIBLACK niblack1985introduction , BERNSE bernse1986dynamic , GATOS gatos2006adaptive , BRADLEY bradley2007adaptive , SAUV sauvola2000adaptive , PHAN phansalkar2011adaptive , LU lu2010document , SU su2013robust , and HOWE Howe2013 ; HOWE was the basis of the first place winner of the 2016 ICFHR handwritten document image binarization contest pratikakis2016icfhr2016 . The performance metrics that were used for the document binarization contest are applied in this paper, including the F-measure (FM), pseudo F-measure (PFM), peak signal-to-noise ratio (PSNR), distance reciprocal distortion metric (DRD) and misclassification penalty metric (MPM). The FM is based on the pixel-wise binarization recall and precision,

$$FM = \frac{2 \times RC \times PR}{RC + PR},$$

where $RC$ and $PR$ are the binarization recall and precision respectively. The PFM uses the pseudo binarization recall and precision, which consider distance weights of pixels to the nearest contours of foregrounds in computing the recall and precision. The PSNR is $10 \log_{10}(C^2 / MSE)$, where $MSE$ and $C$ refer to the mean square error and the average intensity level difference between foregrounds and backgrounds. The DRD has been used to measure the visual distortion of an estimated binary image from its ground truth counterpart; its complex formula is given in the literature lu2004distance . The MPM is defined as

$$MPM = \frac{\sum_{i=1}^{N_{FN}} d^{i}_{FN} + \sum_{j=1}^{N_{FP}} d^{j}_{FP}}{2D},$$

where $D$ is a scaling constant, $N_{FN}$ is the number of false negatives, $N_{FP}$ is the number of false positives, $d^{i}_{FN}$ is the distance from the $i$th false negative to the nearest foreground contour pixel in the ground truth, and $d^{j}_{FP}$ is the distance from the $j$th false positive to the nearest foreground contour pixel in the ground truth. Higher FM, PFM and PSNR are better, while lower DRD and MPM are better.
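Two of the metrics above are simple enough to compute directly. The sketch below evaluates the F-measure from pixel-wise recall and precision, and the PSNR with the foreground-background level difference `C` set to 1 for images coded as 0 and 1. The function names and the tiny example are illustrative assumptions, and the code assumes at least one true positive and at least one misclassified pixel so that no division by zero occurs.

```python
import math


def f_measure(estimate, truth):
    """F-measure from pixel-wise recall and precision (1 = foreground)."""
    tp = fp = fn = 0
    for e_row, t_row in zip(estimate, truth):
        for e, t in zip(e_row, t_row):
            if e == 1 and t == 1:
                tp += 1
            elif e == 1 and t == 0:
                fp += 1
            elif e == 0 and t == 1:
                fn += 1
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return 2 * recall * precision / (recall + precision)


def psnr(estimate, truth, C=1.0):
    """PSNR = 10*log10(C^2 / MSE) between an estimated and a true binary image."""
    n_pixels = sum(len(row) for row in truth)
    mse = sum((e - t) ** 2
              for e_row, t_row in zip(estimate, truth)
              for e, t in zip(e_row, t_row)) / n_pixels
    return 10 * math.log10(C ** 2 / mse)


# One false positive in a 2x2 image with two true foreground pixels.
truth = [[1, 0],
         [0, 1]]
estimate = [[1, 0],
            [1, 1]]

fm = f_measure(estimate, truth)   # recall 1, precision 2/3
quality = psnr(estimate, truth)   # MSE = 1/4
```

In this example the single false positive lowers the precision to 2/3 while the recall stays at 1, so the F-measure is 0.8.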

Table 1 presents the performance metrics for the first dataset, and Table 2 presents those for the second dataset. The numbers in the tables are the performance metrics averaged over all test images in each dataset. The performance metrics of our proposed approach are comparable to those of the two top performers for the first dataset and the second dataset. The first two datasets contain little noise but complex background patterns due to document degradation and imperfectly erased handwriting. Our approach has shown competitive performance in handling such complexities. In particular, the proposed approach performed better than the existing approaches for image 1 and image 4 in the first dataset and for images 4, 8, and 10 in the second dataset. Those images have more complicated background variations than the other document images. Figures 7 and 8 show illustrative outcomes of our approach for those images. In the figures, we can see that the backgrounds estimated by our approach successfully captured the complex patterns of document image backgrounds, so the foreground estimates were not much affected by the image backgrounds.

Table 1: Five performance metrics of nine state-of-the-art image binarization methods and our proposed approach for the DIBCO 2011 dataset.
Figure 7: Results of the proposed approach for the DIBCO 2011 dataset
Table 2: Five performance metrics of nine state-of-the-art image binarization methods and our proposed approach for the H-DIBCO 2016 dataset.
Figure 8: Results of the proposed approach for the H-DIBCO 2016 dataset

On the other hand, our proposed approach significantly outperformed the nine state-of-the-art binarization methods for the last dataset. Table 3 summarizes the five performance metrics for the last dataset. The major difference of the last dataset from the previous two datasets is that the last dataset has significantly higher background noises and larger foreground sizes. When a noise level is very high (i.e., the signal-to-noise ratio of an input image is low), local image contrasts are significantly affected by image noises, which makes the methods based on a local image contrast map (such as su2013robust ) less competitive for the last dataset. When foreground sizes are large and noises are severe, estimating the image background accurately is quite challenging. This is why the background subtraction methods such as lu2010document ; gatos2006adaptive did not work well for the last dataset. Under the circumstances, our proposed approach is still able to capture the background very robustly. The background estimates of our approach are presented in Figure 9, which shows the robustness of our background estimator.

Table 3: Five performance metrics of nine state-of-the-art image binarization methods and our proposed approach for the NANOPARTICLE dataset.
Figure 9: Results of the proposed approach for the NANOPARTICLE dataset

4.2 Effects on Image Segmentation

Our approach performed well in image binarization for all 26 test images. Its binarization outcomes can be used as input to a morphological image segmentation method to improve the overall accuracy of identifying individual foreground objects under overlaps. This section shows how our approach can improve the existing morphological image segmentation methods.

We use the NANOPARTICLE dataset for this study, which contains eight microscope images of overlapping nanoparticles; the other two datasets do not have any foreground overlap issues, so they were not included in this study. Table 4 summarizes the characteristics of the eight images in terms of the signal-to-noise ratio (SNR), the background variation and the foreground density; the definitions of the characteristics are described in the table caption. Images having higher foreground densities contain more foreground overlaps, e.g., Img 1, Img 2 and Img 5. Three morphological image segmentation methods are considered in this testing: ultimate erosion for convex sets (UECS) park2013segmentation , bounded erosion with fast radial symmetry (BE-FRS) zafari2015segmentation and morphological multiscale decomposition (MSD) schmitt2009morphological , which are specialized for segmenting overlapping objects in microscope images. We counted the number of falsely identified foreground objects (false positives) and the number of unidentified foreground objects (false negatives) for the three methods when their built-in image binarization is applied as well as when our approach replaces the built-in binarization.

Table 5 summarizes the numbers of false positives and false negatives for the eight test images. The morphological image segmentation methods produced many false negatives and false positives, in particular when background variations and high noise levels co-exist, as in Img 3 and Img 5. This is mainly because the built-in image binarization algorithms in the morphological image segmentation methods worked poorly for the test images. Using the proposed binarization outcome as input to the morphological image segmentation methods substantially reduced the numbers of false positives and false negatives in most cases. Figure 10 illustrates the image segmentation outcomes of the UECS park2013segmentation and the proposed approach combined with the UECS for Img 3 and Img 5. This illustration shows how the proposed approach improved the existing morphological image segmentation methods for complex image segmentation tasks.

| Image | # of Foregrounds | SNR | Background Variation | Foreground Density |
|---|---|---|---|---|
| Img 1 | 86 | 31.22 | 112.56 | 26.80% |
| Img 2 | 162 | 1.53 | 1.15 | 11.28% |
| Img 3 | 83 | 2.13 | 2719.80 | 5.88% |
| Img 4 | 69 | 1.51 | 81.02 | 0.61% |
| Img 5 | 351 | 1.41 | 91.05 | 17.74% |
| Img 6 | 8 | 1.31 | 787.01 | 0.72% |
| Img 7 | 6 | 1.36 | 174.69 | 1.14% |
| Img 8 | 140 | 1.14 | 56.24 | 2.80% |

Table 4: Characteristics of test microscope images. The SNR and the background variation are defined from the variance of image intensities, the variance of noises, and the variance of background intensities. The foreground density is the fraction of foreground pixels in each image.
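As a small illustration of the foreground density defined in the caption of Table 4, the sketch below computes the fraction of foreground pixels from a binary mask. The function name and the toy mask are hypothetical and are not taken from the test images.

```python
import numpy as np

def foreground_density(mask):
    """Fraction of pixels marked as foreground in a binary mask."""
    mask = np.asarray(mask, dtype=bool)
    return mask.sum() / mask.size

# Toy 10 x 10 mask with a 3 x 4 foreground block (12 of 100 pixels).
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:7] = True
print(f"{foreground_density(mask):.2%}")
```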
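| Image | UECS FN | UECS FP | Proposed + UECS FN | Proposed + UECS FP | FRS FN | FRS FP | Proposed + FRS FN | Proposed + FRS FP | MSD FN | MSD FP | Proposed + MSD FN | Proposed + MSD FP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Img 1 | 3 | 0 | 1 | 0 | 3 | 1 | 1 | 0 | 3 | 0 | 6 | 0 |
| Img 2 | 21 | 5 | 2 | 18 | 12 | 20 | 3 | 17 | 0 | 499 | 8 | 11 |
| Img 3 | 32 | 2 | 10 | 2 | 17 | 30 | 9 | 0 | 30 | 1 | 10 | 2 |
| Img 4 | 69 | 0 | 23 | 4 | 52 | 5 | 39 | 9 | 65 | 838 | 15 | 7 |
| Img 5 | 183 | 7 | 18 | 0 | 67 | 17 | 26 | 0 | 201 | 796 | 18 | 0 |
| Img 6 | 2 | 0 | 0 | 0 | 1 | 603 | 0 | 0 | 7 | 19 | 0 | 0 |
| Img 7 | 1 | 2 | 0 | 1 | 2 | 1419 | 0 | 1 | 5 | 707 | 0 | 1 |
| Img 8 | 138 | 0 | 32 | 0 | 76 | 24 | 60 | 0 | 26 | 704 | 32 | 0 |

Table 5: Performance of the existing morphological image segmentation methods (UECS park2013segmentation , FRS zafari2015segmentation , MSD schmitt2009morphological ) with their built-in binarization and combined with the proposed image binarization. FN = the number of unidentified foreground objects. FP = the number of falsely identified foregrounds.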
Figure 10: Comparison of UECS park2013segmentation with UECS + Proposed Approach. With significant background variation, the original UECS missed a number of foreground objects. If the binarization outcome of the proposed approach is used in the UECS, the foreground detection can be significantly improved.
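To make the per-image counts in Table 5 easier to compare, the following sketch sums the false negatives and false positives over the eight images for each method. It assumes that each pair of numbers in a row is ordered (FN, FP), which matches the order in the caption and the fact that the first number of each pair never exceeds the number of foregrounds in Table 4; the variable and function names are illustrative only.

```python
# Table 5 rows for Img 1..8; each row holds six (FN, FP) pairs in METHODS order.
# The (FN, FP) ordering within each pair is an assumption, see the text above.
METHODS = ["UECS", "Proposed + UECS", "FRS", "Proposed + FRS", "MSD", "Proposed + MSD"]
ROWS = [
    [3, 0, 1, 0, 3, 1, 1, 0, 3, 0, 6, 0],
    [21, 5, 2, 18, 12, 20, 3, 17, 0, 499, 8, 11],
    [32, 2, 10, 2, 17, 30, 9, 0, 30, 1, 10, 2],
    [69, 0, 23, 4, 52, 5, 39, 9, 65, 838, 15, 7],
    [183, 7, 18, 0, 67, 17, 26, 0, 201, 796, 18, 0],
    [2, 0, 0, 0, 1, 603, 0, 0, 7, 19, 0, 0],
    [1, 2, 0, 1, 2, 1419, 0, 1, 5, 707, 0, 1],
    [138, 0, 32, 0, 76, 24, 60, 0, 26, 704, 32, 0],
]

def totals(rows, methods):
    """Sum FN and FP over all images for each method."""
    out = {}
    for k, name in enumerate(methods):
        fn = sum(row[2 * k] for row in rows)
        fp = sum(row[2 * k + 1] for row in rows)
        out[name] = (fn, fp)
    return out

for name, (fn, fp) in totals(ROWS, METHODS).items():
    print(f"{name}: total FN = {fn}, total FP = {fp}")
```

Under that column reading, the totals are 449 FN and 16 FP for UECS versus 86 and 25 with the proposed binarization, 230 and 2119 for FRS versus 138 and 27, and 337 and 3564 for MSD versus 89 and 21. The reduction is therefore largest in false negatives for UECS and in false positives for FRS and MSD, while the false positive total for UECS rises slightly.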

5 Conclusion

We presented a new approach that solves an image binarization problem under significant background variations and noises. The approach estimates the background intensity variation of an input image and subtracts the estimate from the input to obtain a flat-background image, which is then thresholded by a global thresholding approach to get the final binary outcome. For robust estimation of the background intensity variation, we proposed a robust regression approach to recover a smooth intensity surface of an image background that is buried under foreground intensities and noise intensities. A global threshold selector was proposed, based on a model selection criterion. With the improved background estimator and threshold selector, the proposed approach has shown consistently strong binarization performance across all of our 26 benchmark images, including 18 document images and eight microscopy images. We also showed how the improved binarization performance can benefit complex image segmentation tasks that separate overlapping foregrounds under uneven background and heavy noises. We believe that the proposed approach is valuable for robust image binarization and image segmentation.


The authors would like to acknowledge support for this project. This work is partially supported by NSF 1334012, AFOSR FA9550-13-1-0075, AFOSR FA9550-16-1-0110, and FSU PG 036656.



  • (1) S. Lu, B. Su, C. L. Tan, Document image binarization using background estimation and stroke edges, International Journal on Document Analysis and Recognition 13 (4) (2010) 303–314.
  • (2) N. Otsu, A threshold selection method from gray-level histograms, IEEE Transactions on Systems, Man, and Cybernetics 9 (1) (1979) 62–66.
  • (3) C. Park, J. Z. Huang, J. X. Ji, Y. Ding, Segmentation, inference and classification of partially overlapping nanoparticles, IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (3) (2013) 1–1.
  • (4) O. Schmitt, M. Hasse, Morphological multiscale decomposition of connected regions with emphasis on cell clusters, Computer Vision and Image Understanding 113 (2) (2009) 188–201.
  • (5) S. Zafari, T. Eerola, J. Sampo, H. Kälviäinen, H. Haario, Segmentation of overlapping elliptical objects in silhouette images, IEEE Transactions on Image Processing 24 (12) (2015) 5942–5952.
  • (6) N. Malpica, C. O. de Solórzano, J. J. Vaquero, A. Santos, I. Vallcorba, J. M. García-Sagredo, F. del Pozo, Applying watershed algorithms to the segmentation of clustered nuclei, Cytometry 28 (4) (1997) 289–297.
  • (7) F. B. Tek, A. G. Dempster, I. Kale, Blood cell segmentation using minimum area watershed and circle Radon transformations, in: Mathematical Morphology: 40 Years On, Springer, 2005, pp. 441–454.
  • (8) P. U. Adiga, B. Chaudhuri, An efficient method based on watershed and rule-based merging for segmentation of 3-D histo-pathological images, Pattern Recognition 34 (7) (2001) 1449–1458.
  • (9) Y.-K. Lai, P. L. Rosin, Efficient circular thresholding, IEEE Transactions on Image Processing 23 (3) (2014) 992–1001.
  • (10) M. Sezgin, et al., Survey over image thresholding techniques and quantitative performance evaluation, Journal of Electronic Imaging 13 (1) (2004) 146–168.
  • (11) P. Stathis, E. Kavallieratou, N. Papamarkos, An evaluation technique for binarization algorithms, Journal of Universal Computer Science 14 (18) (2008) 3011–3030.
  • (12) W. Niblack, An introduction to digital image processing, Strandberg Publishing Company, 1985.
  • (13) B. Gatos, I. Pratikakis, S. J. Perantonis, Adaptive degraded document image binarization, Pattern Recognition 39 (3) (2006) 317–327.
  • (14) J. Sauvola, M. Pietikäinen, Adaptive document image binarization, Pattern Recognition 33 (2) (2000) 225–236.
  • (15) N. Phansalkar, S. More, A. Sabale, M. Joshi, Adaptive local thresholding for detection of nuclei in diversity stained cytology images, in: Communications and Signal Processing (ICCSP), 2011 International Conference on, IEEE, 2011, pp. 218–220.
  • (16) J. Bernsen, Dynamic thresholding of grey-level images, in: Proceedings of the 8th International Conference on Pattern Recognition, 1986, pp. 1251–1255.
  • (17) B. Su, S. Lu, C. L. Tan, Binarization of historical document images using the local maximum and minimum, in: Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, ACM, 2010, pp. 159–166.
  • (18) B. Su, S. Lu, C. L. Tan, Robust document image binarization technique for degraded document images, IEEE Transactions on Image Processing 22 (4) (2013) 1408–1417.
  • (19) K. Mieloch, P. Mihailescu, A. Munk, Dynamic threshold using polynomial surface regression with application to the binarization of fingerprints, in: Defense and Security, International Society for Optics and Photonics, 2005, pp. 94–104.
  • (20) J. R. Parker, Gray level thresholding in badly illuminated images, IEEE Transactions on Pattern Analysis and Machine Intelligence 13 (8) (1991) 813–819.
  • (21) J. R. Parker, C. Jennings, A. G. Salkauskas, Thresholding using an illumination model, in: Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on, IEEE, 1993, pp. 270–273.
  • (22) Q. Chen, Q.-s. Sun, P. A. Heng, D.-s. Xia, A double-threshold image binarization method based on edge detector, Pattern Recognition 41 (4) (2008) 1254–1267.
  • (23) M. A. Ramírez-Ortegón, E. Tapia, L. L. Ramírez-Ramírez, R. Rojas, E. Cuevas, Transition pixel: A concept for binarization based on edge detection and gray-intensity histograms, Pattern Recognition 43 (4) (2010) 1233–1243.
  • (24) N. R. Howe, Document binarization with automatic parameter tuning, International Journal on Document Analysis and Recognition 16 (3) (2013) 247–258.
  • (25) I. Pratikakis, K. Zagoris, G. Barlas, B. Gatos, ICFHR 2016 handwritten document image binarization contest (H-DIBCO 2016), in: Frontiers in Handwriting Recognition (ICFHR), 2016 15th International Conference on, IEEE, 2016, pp. 619–623.
  • (26) D. L. Donoho, I. M. Johnstone, Threshold selection for wavelet shrinkage of noisy data, in: Engineering in Medicine and Biology Society, 1994. Engineering Advances: New Opportunities for Biomedical Engineers. Proceedings of the 16th Annual International Conference of the IEEE, Vol. 1, IEEE, 1994, pp. A24–A25.
  • (27) P. J. Rousseeuw, A. M. Leroy, Robust regression and outlier detection, Vol. 589, John Wiley & Sons, 2005.
  • (28) J. Friedman, T. Hastie, R. Tibshirani, The elements of statistical learning, Vol. 1, Springer Series in Statistics, Springer, Berlin, 2001.
  • (29) M. H. Hansen, B. Yu, Minimum description length model selection criteria for generalized linear models, Lecture Notes-Monograph Series 40 (2003) 145–163.
  • (30) I. Pratikakis, B. Gatos, K. Ntirogiannis, ICDAR 2013 document image binarization contest (DIBCO 2013), in: Document Analysis and Recognition (ICDAR), 2013 12th International Conference on, IEEE, 2013, pp. 1471–1476.
  • (31) D. Bradley, G. Roth, Adaptive thresholding using the integral image, Journal of Graphics, GPU, and Game Tools 12 (2) (2007) 13–21.
  • (32) H. Lu, A. C. Kot, Y. Q. Shi, Distance-reciprocal distortion measure for binary document images, IEEE Signal Processing Letters 11 (2) (2004) 228–231.