Unsupervised image segmentation via maximum a posteriori estimation of continuous max-flow

11/01/2018 ∙ by Ashif Sikandar Iquebal, et al. ∙ 22

Recent thrust in imaging capabilities in medical as well as emerging areas of manufacturing systems creates unique opportunities and challenges for on-the-fly, unsupervised estimation of anomalies and other regions of interest. With the ever-growing image database, it is remarkably costly to create annotations and atlases associated with different combinations of imaging capabilities and regions of interest. To address this issue, we present an unsupervised learning approach to a continuous max-flow problem. We show that the maximum a posteriori estimation of the image labels can be formulated as a capacitated max-flow problem over a continuous domain with unknown flow capacities. The flow capacities are then iteratively obtained by considering a Markov random field prior over the neighborhood structure in the image. We also present results to establish the consistency of the proposed approach. We establish the performance of our approach on two real-world datasets including, brain tumor segmentation and defect identification in additively manufactured surfaces as gathered from electron microscopic images. We also present an exhaustive comparison with other state-of-the-art supervised as well as unsupervised algorithms. Results suggest that the method is able to perform almost comparable to other supervised approaches, but more 90 terms of Dice score as compared to other unsupervised methods.



There are no comments yet.


page 2

page 5

page 7

page 9

page 10

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

THE goal of image segmentation is to cluster pixels of an image into logical entities/objects based on their intensity values and the neighborhood structure [1]. Towards this, mainly two lines of thought have existed—graph-based approaches [2, 3]

and Gaussian mixture models (GMM)

[4, 5, 1]. GMM are typically sensitivity to noise, as they depend mostly on the distribution of pixel intensities. In contrast, graph-based approaches are region based, i.e., the resulting segmentation is influenced by the neighborhood structure, and guarantees optimality in terms of the energy functions (e.g., see Eq. (1) below) [6, 7]. More recently, the graph-based approaches has been popularized in various fields, including medical image segmentation, object tracking in static and video surveillance and many more [2, 8, 9].

In general, graph-based approaches can be formulated as an energy minimization functional over a discrete image grid [10] of the form:


where accounts for the measure of fit of the labels , is the number of segments, and controls the smoothness of the segmentation. Note that the discretization of the image domain typically results in metrication error due to the grid bias [11]. To overcome this, we consider the image domain to be continuous, i.e., a bounded continuum of locations in , originally presented by Strang [12]. However, a major short-coming is that the graph-based approaches are mostly supervised or require manual parameter settings (e.g., the flow capacities in the min-cut/max-flow problem [13]).

In most graph cut algorithms[14, 15, 16], this is generally resolved by requiring the users to define the object of interest and background (e.g., by selecting sample pixels in background and foreground) to train the model. Apart from this, they use annotated atlases (reference templates) or training data, especially for the segmentation of anatomical structures in-ever increasing medical images. Given the atlas, an image can be segmented by defining “anatomically correct” mapping from the coordinate space of input image to the atlas [17]. An excellent account of segmentation approaches built on annotated atlases can be found in [18]. However, creating annotations or sample pixels to define Region Of Interest would not only require expert knowledge, but are many a times infeasible. For example, tumor detection from magnetic resonance images involve investigation of more than hundred slices, that may vary significantly from patient to patient or even over time. Another growing research area that can harness the power of unsupervised segmentation is the autonomous characterization of microscopic images for identification of defect concentration in parts fabricated via advanced manufacturing methods, e.g., additively manufacturing and could be a crucial step towards realizing the materials genome initiative [19].

Towards this, several variational approaches have been reported in the literature to minimize over a spatially continuous domain. Rudin, Osher and Fatemi (ROF) model [20] is one of the earliest known level set-based variational approaches, followed by several variants [21, 22]

. However, the ROF as well as its derivative models involve minimizing an energy functional over characteristic functions of sets, resulting in a non convex model and consequently, sub-optimal convergence. Later on Chan, Esedoglu and Nikolova presented a convex relaxation of ROF model—a continuous min-cut approach—that can be globally solved and is also the minimizer of the original ROF model after appropriate thresholding

[9]. Later on, authors in [23] presented a faster and parallelizable algorithm of the continuous min-cut model using the dual capacitated max-flow model. Nonetheless, the use of expert supervision and annotated atlases remain inevitable.

Literature on unsupervised segmentation has been mostly limited. Some of the most notable works include that of normalized cuts based on spectral graph partitioning [2], with by several extensions thereafter [24, 25, 26]; mean shift clustering using the neighborhood pixel information [27]

; k-means; expectation maximization based Markov random field-based methods

[28] as well as mixture models [4, 5, 1]. Additionally, some of the recent works have also focused on semi-supervised methods, .e.g, GMM guided graph cuts with bounding box as user input instead of pixel annotation [15], scribbles to partially annotate part of an image [14] and image tags that only specifies what classes are present in the image [16]. However, significant limitations exist, specially for the case of purely unsupervised approaches in terms of computational complexity, parameter selection (e.g., kernel function and window size in mean-shift), sensitivity to noise and over/under segmentation [29].

Fig. 1: Scanning electron micrographs showing the (a) formation of defects as a result of balling effect, i.e., trapping of undiffused powder particles inside the pores during a selective laser melting process [30], and (b) morphology of defects and undiffused powder particles when fabricated using an electron beam melting [31].

In this paper, we present a unsupervised approach to continuous capacitated max-flow segmentation by using an iterative maximum likelihood estimation (MLE) of the flow capacities. More specifically, we show that the maximum a posterior (MAP) estimate of segmentation problem is analogous to the max-flow formulation for graph cuts, given the flow capacities are known. This allows for the MAP estimation of the segmentation problem. The flow capacities are then iteratively estimated by using MLE until convergence. We also present theoretical results to establish the consistency of the MAP-MLE approach, and show that the method is computationally tractable. To establish the numerical performance of the methodology, we present comparative results on benchmark tumor datasets as well as a case study on the segmentation of defect concentration in additively manufactured components fabricated with different process parameters. The paper is organized as follows: In Section 2, we present the proposed continuous max-flow based segmentation approach. In Section 3, we show that the MAP estimate of the segmentation problem given the flow capacities is analogous to the max-flow formulation and iteratively derive the ML estimation of the flow capacities. We show that the proposed ML estimation of the flow capacity is consistent in Section 4. In section 5, we present the comparative results from experimental investigations with application in the brain tumor and defects segmentation in additively manufactured surface. We conclude the paper in Section 6.

2 Continuous max-flow formulation

In the following we first present the continuous max-flow formulation with flow capacities for image segmentation. Subsequently, we show that the current implementations of the max-flow problem are limited and result in sub-optimal solutions due the fact that flow capacities are generally unknown and need to be estimated.

Let denote the continuous image domain (e.g., see Figure 1) and denote some pixel location in . The problem of partitioning the continuous image domain into disjoint subdomains can be expressed as the following energy minimization problem: [2, 8],

subject to

where is the label assigned to a location and is the perimeter of the subdomain . With reference to Figure 1(a), the subdomains may represent the defects on the surface, e.g., porosity, or the image background itself. The first term in Equation (2) evaluates the likelihood of assigning labels to locations . This is the term in Equation (1). A classical example is the Mumford-Shah functional [32] that uses an norm to evaluate the goodness of fit of the labels, i.e., , where is the observed label of pixel . The second term enforces smoothness of the segmentation by minimizing the perimeter of each of the subdomains and is analogous to in Equation (1

). By imposing the smoothness penalty, we avoid over segmentation of the image. Markov random field (MRF) priors have been widely used to capture spatial smoothness. Essentially, it states that the probability of any label configuration

is given as,


where is the clique potential for a pair of neighboring locations and . Boykov et al., [33] proposed a generalized Potts model to define the clique potentials involving a pair of neighboring locations as,


Let denote an indicator function such that,


we have, . Therefore, we can rewrite the energy functional as,

subject to

Note that the formulation in Equation (6) is non-convex due to the binary configuration of . Several convex approximations of Equation (6) has been proposed in the literature by relaxing to the interval [34, 35]. Authors in [9] showed that the optimal solution to Equation (6) can be obtained by thresholding the resulting convex problem. However, the numerical algorithms for the model in Equation (6) suffers from the non-smoothness of the total variation term [36]. To overcome this, we show in the following, that the dual of the energy minimization problem in Equation (6) is analogous to the continuous max-flow problem studied in [23, 34]. Towards this, we consider a specific formulation of Equation (6) as derived in [9],


Note the relaxation on to . Using the fact that (see [37]), we rewrite Equation (7) as the following min-max problem,

Now, we can write the dual of Equation (6) as,

subject to

The resulting formulation in Equation (8) is the continuous analogue of the max-flow problem (also see [12] for a more formal derivation). To understand this max-flow formulation in the continuous domain, we refer to Fig. (2). In the continuous image plane , each pixel is connected to the source and sink . Following [12], we consider that each pixel is associated with three different flows: the source flow , the sink flow and the spatial flows . Here, source and sink flow fields are directed from the source to each point and from each to the sink , respectively. The spatial flow field is characterized by the undirected flow through —capturing the strength of interaction with the neighborhood locations. Each of the flow fields are constrained by their respective capacities as,


In addition, the flow fields satisfy the flow conservation constraint, i.e.,

Fig. 2: Continuous image domain with source, sink and spatial flow represented by for each pixel .

Thus the continuous max-flow problem can be defined as the maximization of the total flow from the source as,


subject to the constraints in Equations (9)-(10). To solve the minimization problem in Equation (8), we first write the corresponding Lagrangian using the multiplier as,

subject to

With this, we can write the following result,


Optimizing the dual variables in Equation (12) is equivalent to minimizing the corresponding primal problem in Equation (7).

Proof of the proposition follows from the Equations (7)-(8). Now we can solve the maximization problem in Equation (12) by using the augmented Lagrangian approach presented in [23] as,


where . See Algorithm 1[23] for solving the augmented Lagrangian using the projection-gradient descent approach.

Fig. 3: (a) Piecewise constant image with 3 subdomains, (b) corrupted with Gaussian noise with final SNR = 7 dB. (c) Segmentation of the image using continuous max-flow approach with fixed capacities resulting in a sub-optimal segmentation.

As mentioned earlier, in unsupervised max-flow problems, the flow capacities corresponding to the source, sink as well as the spatial field flows are set to a fixed value a priori by the user or requires user input to define foreground and background [ref]. Nonetheless, the capacities are generally unknown and may require trial and error or prior training. Additionally, for images with low SNR, fixed capacity values for all the pixels would cause the solution to trap in the local minima, thus failing to determine the global optimum (see Figure 3 for an example). To overcome these challenges, we present an approach to define spatially varying flow capacities. In the following, we first formulate the MAP estimate of the segmentation problem, i.e., for a given estimate of the flow capacities and show that the MAP formulation is analogous to solving the continuous max-flow problem. With current estimate of , we subsequently update the flow capacities in the next iteration using a maximum likelihood approach.

Initialize :  and
1 repeat
2       for every subdomain  do
3             %update the spatial flows
4             , where ;
5             %update the sink flows;
6             , where ;
7             %update the source flows;
8             , where ;
9             %update the multipliers;
11       end for
13until convergence;
Algorithm 1 Continuous max-flow using gradient projection

3 Maximum a posteriori/maximum likelihood estimation

As mentioned earlier, in unsupervised max-flow/min-cut problems, the value of capacities corresponding to the source, sink as well as the spatial field flows are set to a fixed value a priori by the user, introducing subjectivity into the segmentation. Assuming a constant capacity constraint ignores the neighborhood information, and therefore the solution is highly sensitive to the noise. In fact, convergence of the label configuration in most max-flow/min-cut based approaches is dependent on the source, sink and spatial flow field capacities. We show this in the following result.


Let be some label configuration (but not the optimal) with and as the source, sink and spatial flows, respectively. Let the corresponding source, sink and spatial capacities be equal to the optimal flows, i.e., and . Then the label configuration , where is the true configuration.


To derive the proof, we first consider the following subproblems:




Let is the optimal solution of (14). Under optimality, let us first consider that the location is unsaturated, i.e., . Under this condition any change in the source flow field at will not improve . This gives,


Here, indicates that at location , the source flow has no contribution to the total energy of the cut. In order to characterize the effect of perturbations in the flow field capacities on the , we consider only the saturated flows, i.e., the flows for which . By the method of variations, any change in , i.e., would not increase and therefore, . However, increase in the source capacity to

will make the source flow field unsaturated, i.e., , forcing more flow from the source node such that . However, as does not change, . Similarly decreasing the source capacity decreases the source flow field, i.e., and therefore, . In either case, the optimality is lost. In a similar way, we can show the case for sink and spatial flows.

To overcome these challenges, we estimate spatially varying flow capacities. We first formulate the MAP estimate of the segmentation problem, i.e., and show that the estimation of the MAP is analogous to solving the continuous max-flow problem.

For each of the image subdomains , we assign the source, sink and spatial capacities to each of the subdomains as and . We consider the capacities to be spatially varying over the domain , but are fixed within each of the subdomains. Given the continuous image domain, and the flow capacities, the objective is to find the label configuration

that maximizes the posterior probability given as,


As we mentioned earlier, the flow capacities are not known a priori and is difficult to estimate, especially for images with low SNR ratio (e.g., see Figure 3). In light of this, we use an iterative learning approach to estimate the flow capacities, and as,


Given the pixel intensities in the image domain

, the joint distribution of the flow capacities

and the label configuration

can be written using the Bayes theorem as,


Here, the first term is nothing but the likelihood of the labels and the capacities given the image domain , is some prior over the image domain given the capacities and denotes the prior distribution representing the capacities. To appropriately define the prior distribution over the capacities it is important to understand that the flow capacities should have similar characteristics as that of the optimal flow. That is, the capacities should be smoothly varying over the image domain as well as capture the spatial correlation between the image pixels. In fact, for an optimal solution, we note from Theorem 1 that and . In the following, we focus only on the estimation of the source capacity . Similar results may be obtained for the sink and spatial flow field capacities.

With that, one of most commonly used prior to incorporate the spatial smoothness is the Markov random field (MRF) prior given as,


where is a normalizing constant, is a scale factor and is a smoothing function and controls the spatial correlation between the pixels in a given neighborhood. The prior follows from the Hammersley-Clifford Theorem [ref] when assuming local Markovian property over the flow field, that is, . It may be noted that local Markovian property is quite natural for random field defined over an image domain.

Several choices of has been proposed in the literature [38, 39]. A typical choice for is the neighborhood based Gauss-Markov random field prior where is given as:


Authors in [40] proposed the following form for to penalize the larger values of

while increasing the robustness to the outliers,


However, as pointed out it [1], the functional form of in Equations (21)&(22) result in complex log-likelihood functions making it computationally intensive. To overcome the computational complexity, we use the prior as proposed in [1],


where is defined as


where is the cardinality of the neighborhood of , and are the current estimates of the source flow field and source flow capacity, and is a smoothing constant. An inherent advantage of using the smoothing function given in Equation (23) is that when maximizing the log-likelihood, the derivative is dependent only on the term ; contributing to the computational efficiency. In the present implementations, we set the value of and the neighborhood size as over such that . Different values result in different convergence rates.

Returning to Equation (19), we first write the corresponding log-likelihood function as,


Here, the first term represents the likelihood of the capacity given the label configuration and the image domain . In other words, it imposes a penalty for every incorrect assignment of the labels. Assuming each of the observations are independently and identically distributed we can write,


such that,


is some loss function for assigning label

to location (see (2)). The second term in Equation (25) represents the prior over the label configuration . We consider a gradient based smoothness prior for given as,


where is the smoothness penalty and is given as

Note that we only provide a generic form for and . More specialized forms can also be used. Using, the generalized penalty function and the gradient based regularization, as introduced in Equation (6), we rewrite Equation (25) as,


Note that maximizing the log-likelihood function in Equation (29) amounts to solving the max-flow problem as in Equation (6) given the capacities are known. We use the Algorithm 1 as proposed in [23] to find the optimal and . With the current estimate of and , we solve for the optimal value of by maximizing with respect to , i.e., we solve the following problem,


We solve for the next step and is given as,

Initialize : Initialize the parameters and
1 repeat
2       Solve the continuous max-flow in Equation (12) to estimate and ;
3       Evaluate the current estimate of in Equation (24);
4       Update the capacities as
8until convergence;
Algorithm 2 Iterative MLE of source, sink and spatial capacities

The algorithm for simultaneous update of and is given in Algorithm 2. Other flow capacities can be updated in a similar way.

4 Convergence and consistency

To check that the algorithm convergences, we analyze the likelihood function given in Equation (29) after every iteration of MAP and ML estimation. Let the likelihood function after some iteration be . In the MAP estimate of the segmentation, we solve the max-flow problem, therefore, the log-likelihood at iteration will increase (or remain same), i.e, (this is evident with and as given in Equation (23). The next step involves updating the capacities using the ML estimates. In this step the likelihood function further decreases or remains same as this step only changes the flow capacities without changing the estimated labels. Therefore, every instance of MAP and ML estimations result in the increase in the likelihood function.

Fig. 4: Comparative results of different algorithms tested for the segmentation of defects. (a) the proposed method (b) mean shift (c) normalized cuts (d) blobworld (e) hierarchical image segmentation (f) GMM and (g) Spatially constrained GMM.

To show that the algorithm consistently estimates the flow capacities, we refer to the classical consistency results of MRF prior given in [41]. In this regard, we first consider the following lemma.


The function is identifiable and is a concave objection function.


Identifiability: We show this for the case of . From Equations (7,23&24), we have

where is some constant. Clearly is one-to-one and therefore, identifiable.
Concavity: As logarithm is concave in the domain of

, the linear transform

is also concave.

With the conditions stated in Lemma 1, we state the following consistency Theorem as derived in [41],


Given is identifiable and concave. The MLE is always consistent, i.e.,

See [41] for the proof. Based on aforementioned results, we note that every instance of MAP and ML estimations result in the increase in the likelihood function, ensuring that the algorithm consistently converges. In the next section, we present the implementation results and comparison with some state-of-the-art methods.

5 Experimental results

5.1 Case studies and evaluation metrics

To test the efficacy of the proposed methodology, we investigate two real-world case studies, including Brain Tumor Image Segmentation (BRATS, datasets from 2013 and 2015) and defect localization in additively manufactured (AM) surfaces gleaned from electron microscopy. An important distinction between the two datasets is that the former is sufficiently large to enable supervised learning. In contrast, high resolution electron microscopic images are extremely costly to capture from AM surface mostly due to the uncertainties in the surface morphology, limiting the availability of such images. As most of the segmentation approaches in the past are either supervised or involves user interaction and manual parameter setting, we compare the segmentation results from each of the cases studies with different state of the art algorithms.

For the first case study, we refer to the algorithms reported in the Medical Image Computing and Computer Assisted Intervention Society (MICCAI) proceedings 2013 and 2015 respectively [42]

. In the 2013 MICCAI proceedings, 20 algorithms were reported that included (5) generative, (13) discriminative and (2) generative-discriminative algorithms. In the 2015 BRATS dataset, (3) generative, (2) discriminative and (7) neural network/deep learning approaches were reported. Additionally, we also implement the state of the art unsupervised algorithms, including k-means, expectation maximization (blobworld)

[43], Gaussian mixture model (GMM), GMM with spatial regularization (SCGMM) [1], mean shift [27], normalized cuts [2], and hierarchical image segmentation [26]. Due to the limited dataset in the second case study, we compare the performance with only unsupervised algorithms. Among the methods tested, the ones that did not perform or resulted in poor segmentation are not reported.

To quantitatively compare the performance of the segmentation results, we refer to the standard Dice score and the Hausdorff distance. The Dice score is given as


where and represents the estimated lesion region and the expert segmentation, respectively and represents the size of the domain. Dice score measures the areal overlap or agreement between the segmented and ground truth. The Hausdorff distance given in Equation (34) instead measures the surface distance between the segmentations,


where is the shortest Euclidean distance between and . Maximum over the supremum distances make Hausdorff distance highly sensitive to the outliers in detection. To control this, we use the 95th percentile of the Hausdorff distance as suggested in [42].

Fig. 5: Comparative results of different algorithms tested for the segmentation of brain tumor on the BRATS 2013 dataset. Box plot adopted from [42]

5.2 Brain Tumor Segmentation

The World health Organization has classified brain and central nervous system tumors into more than 120 categories, along with a grading scale specifying the degree of malignancy (on the scale of I-IV). However, almost 80% of the malignant tumors fall in the gliomas category alone. Despite significant advances over the past few years, the diagnosis of the gliomas remain limited. Neuroimaging offers a noninvasive approach to evaluate the progression of the tumorous tissues, thus allowing timely intervention and controlled treatment. However, significant challenges arise when diagnosing the MR images as the lesions (tumorous regions) are defined only in terms of the contrast changes with respect to the surrounding normal tissues. This often causes significant variations among the expert segmentation mainly because of the intensity gradient from the tumorous to non-tumorous regions. In addition, these lesions may vary significantly in terms of size, shape and localization from patient to patient, as well as across subsequent stages of tumor growth, requiring large training datasets for an effective supervised learning approach.

Fig. 6: Comparative results of different algorithms tested for the segmentation of defects. (a) Original image for sample A (b) k-means with 2 clusters (c) Gaussian mixture model with expectation maximization (d) spatially constrained Gaussian mixture model with k-means initialization and (e) mean shift (f) the proposed method.

In light of this, we study the segmentation of the brain tumor from fluid-attenuated inversion recovery (FLAIR) MR scans that highlights the differences in water relaxation properties of the tissues, and is effective in enhancing the peritumoral edema—a characteristic feature of malignant gliomas. The FLAIR images used in this study are acquired from two different instances of the BRATS datasets, BRATS 2013, consisting of a total of 20 high grade (HG) and 10 low grade (LG) glioma cases, and BRATS 2015, consisting of 274 HG and 54 LG gliomas. Each of the datasets contain annotated ground truth delineated by clinical experts.

Figure (5) shows a representative segmentation for HG and LG gliomas along with the comparative results from other unsupervised algorithms. Although the there is some mismatch between the segmented and the ground truth regions for the presented algorithm, it is able to outperform the other unsupervised approaches. The overall performance of the presented algorithm is 72% and 76% on the Dice score for the 2013 and 2015 datasets, respectively. The 95% Hausdorff measure measured for the two separate instances are 16.08 and 14.05. The Dice score and the Hausdorff measure of all the methods including the supervised (automated as well as manual initialization) as well as unsupervised algorithms tested on the 2013 dataset are summarized in Figures 6(a&b). Clearly, our unsupervised approach outperforms most of the supervised algorithms, specifically in terms of the average values of Dice score and Hausdorff measure (black squares inside the box plot indicate the average).

Approach Dice score Author
Unsupervised min cut 75.610.5 Iquebal
Generative with shape prior 7719 Agn [44]
Generative-Discriminative 88.520 Bakas [44]
Expectation Maximization 68 Haeck [44]
Random Forests 84 Maier [44]
Random Forests 80.7 Malmi [44]
Conditional Random Fields 88.735 Meier [44]
Convolutional Neural Network 88 Havaei[44]
Convolutional Neural Network 81 Dvorak [44]
Convolutional Neural Network 86 Pereira [44]
Convolutional Neural Network 67 Rao [44]
Convolutional Neural Network 81.419.6 Vaidhya [44]
Convolutional Neural Network 87.556.72 Wang [13]
TABLE I: Dice score comparison for the BRATS 2015 dataset

For BRATS 2015 competition no test result was available at the time of writing, so we report only the training results of the algorithms reported in the MICCAI 2015 proceedings. This is shown in Table 1. Indeed, most of the deep learning based method outperform the performance of the our algorithm. However, it must be noted that these are the Dice scores on training dataset.

5.3 Defect concentration in additively manufactured components

In this case study, we explore a rather newly emerging field of materials and process discovery. Recent advances in the manufacturing technologies, specially additive manufacturing (AM, layer by layer deposition of metal powder to fabricate complex free-form surfaces) have revolutionized the landscape of fabricating industrial components and parts. With newer technologies, newer challenges arise. In the case of AM, although, it is capable of fabricating components with minimum time and material waste, overall functional integrity of AM components is considered much inferior to those realized with conventional manufacturing process chains, especially under real world dynamic loading conditions. Defects, such as pores, undiffused powder, geometric distortions, surface cracks and non-equilibrium microstructures significantly deteriorates the mechanical performance and overall functional integrity of the components.

Fig. 7: Comparative results of different algorithms tested for the segmentation of defects. (a) Original image for sample B (b) k-means with 2 clusters (c) Gaussian mixture model with expectation maximization (d) spatially constrained Gaussian mixture model with k-means initialization and (e) mean shift (f) the proposed method.

Developments of in situ imaging technologies allow monitoring of manufacturing processes at every conceivable resolution-of-interest. However, analysis of the images recorded from as-fabricated components to characterize the type and concentration of defects is not a trivial task. These microscopic images contain significant uncertainties in the shape and morphology, e.g., an undiffused metal powder appears very similar in shape to spherical pores. In addition, the signal to noise ratio of the images is typically very low and consists of under and over exposed regions. Figures 1(a&b) shows the morphology of defects formed during two different additive manufacturing processes. Only a few studies have been reported in the literature that utilizes in situ sensor data for the characterization of defects, e.g, see [45].

We test the performance of our approach on two surfaces that were fabricated using different process parameters in a laser based AM process called selective laser melting (SLM). The process parameters for the first surface (sample A, Figure 5(a)) was set as: laser power = 165 W, laser scanning speed = 138 mm/s and relative density = 99.5% and the second workpiece (sample B, Figure 6(a)) was: laser power = 85 W, laser scanning speed = 71 mm/s and relative density = 96.4%, respectively. All other process parameters, such as hatching distance (mm), layer thickness (mm), etc. were kept fixed.

In the current study we focus on the concentration of two different type of defects namely, balling effect and porosity. Balling effect is a complex metallurgical process that originates due to sub-optimal process parameters during the SLM process as well as the properties of the material powder. It causes the liquid scan track during SLM to break and result in the formation of spherical particles that eventually get trapped, causing inhomogeneous deposition of the powder in the next build layer. Authors in [30] showed that by increasing the laser power, the concentration of defects due to balling effect decreases. The second type of defects are the pores. Pores are the voids on surface that are formed when gas particles trapped in the melt-pool (liquefied metal during deposition) escapes. These are generally tracked by observing the dark spherical/oval shaped features on the surface. Controlling porosity as well as balling effect is critical avoid significant costs, as these defects may reduce the fatigue strength of material by more than 4 times, resulting in early, unexpected failure [46].

Figure 6(a) and 7(a) shows the representative surface from sample A and B, respectively. Clearly sample A with higher laser power (165 W) has relatively lower defect concentration as compared to sample B with low laser power (85 W). As the data size is limited, we compare our segmentation results only with that of other unsupervised segmentation methods, including k-means, Gaussian mixture model (GMM), and spatially constrained Gaussian mixture model (SCGMM) [1] and mean-shift [27]. Note that we ignored comparisons with graph cut approaches as some of the the existing implementations either required user input to define the foreground and background or did not converge. For the first sample, segmentation results are presented in Figures 6(b-f). The image contains significant noise due to the shadow and overexposure resulting due to uneven morphology of the AM surfaces, which makes the segmentation process challenging. The results obtained from the k-means as shown in Figure 6(b) contains a lot of noise. There is, however, some improvement when segmented with the GMM as shown in Figure 6(c). Similarly with SCGMM using k-means initialization is able to reduce the noise (see the white region, in Figure 6(d)). The segmentation from the mean shift algorithm (Figure 6(e)) is able to detect most of the pores but fails to defect the defect induced by the balling effect. The segmentation obtained with the proposed method is clearly able to identify all the regions of pores as well as the baling effect (Figure 6(f)).

Approach Dice score HausMeasure
Unsupervised min cut 70.56 32
mean shift 58.6 245
Gaussian mixture model 22.16 287
Spatially constrained GMM 2.6 219
k-means 1.4 321
TABLE II: Dice score and Hausdorff measure of various unsupervised approaches implemented for defect segmentation

In the second sample (B), we note that the noise is significantly higher as compared to sample A, mainly due to the rougher surface morphology. Three instances of balling effect was identified manually as shown by the arrows, along with multiple pores. Clearly, the segmentation results obtained from k-means (Figure 6(b)), GMM (Figure 6(c)) and SCGMM (Figure 6(d)) as well as mean shift (Figure 6(e)) all capture mostly the noise in the images. The segmentation obtained using the proposed method is clearly able to remove all noise and is selectively able to capture the defects (Figure 6(f)). However, for the case of balling effect, the algorithm is able to identify only two balling effects clearly. Nonetheless, the segmentation is a significant improvement over the standard state of the art methods.

Estimating defect concentration would enable understand the effect of different process parameters on the build quality, and therefore determine the optimal parameter settings. We verified the experimental observations reported by [30], that increase in the laser power results in less defect concentration. We validated this observation by estimating the proportion of defect area. The defect concentration on sample A is 1% and sample B is 4.39% as estimated from the segmented images, that is in accordance with the observations.

6 Conclusions

In the present work, we developed an graph coloring approach to segment noisy images using a capacitated continuous max-flow approach augmented with a maximum likelihood estimation of the flow capacities. We show that the maximum a posteriori estimate of the segmentation problem is analogous to the max-flow formulation given the flow capacities are known. The flow capacities are then estimated using the maximum likelihood estimation using a Markov random field prior over the neighborhood structure. The comparative results in an industrial setting established the performance of the present approach, as the method was able to selectively segment the defects ignoring the noise appearing due to shadow and presence of over/under exposed regions. Current and future works are focused on establishing the theoretical guarantees on the convergence rates and more informative priors.



  • [1] T. M. Nguyen and Q. J. Wu, “Fast and robust spatially constrained gaussian mixture model for image segmentation,” IEEE transactions on circuits and systems for video technology, vol. 23, no. 4, pp. 621–635, 2013.
  • [2] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on pattern analysis and machine intelligence, vol. 22, no. 8, pp. 888–905, 2000.
  • [3] R. Zabih and V. Kolmogorov, “Spatially coherent clustering using graph cuts,” in Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, vol. 2.   IEEE, 2004, pp. II–II.
  • [4] F. Forbes and N. Peyrard, “Hidden markov random field model selection criteria based on mean field-like approximations,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1089–1101, 2003.
  • [5] H. Greenspan, G. Dvir, and Y. Rubner, “Context-dependent segmentation and matching in image databases,” Computer Vision and Image Understanding, vol. 93, no. 1, pp. 86–109, 2004.
  • [6] J.-S. Kim and K.-S. Hong, “Color–texture segmentation using unsupervised graph cuts,” Pattern Recognition, vol. 42, no. 5, pp. 735–750, 2009.
  • [7] Y. Y. Boykov and M.-P. Jolly, “Interactive graph cuts for optimal boundary & region segmentation of objects in nd images,” in Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on, vol. 1.   IEEE, 2001, pp. 105–112.
  • [8] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Transactions on pattern analysis and machine intelligence, vol. 23, no. 11, pp. 1222–1239, 2001.
  • [9] T. F. Chan, S. Esedoglu, and M. Nikolova, “Algorithms for finding global minimizers of image segmentation and denoising models,” SIAM journal on applied mathematics, vol. 66, no. 5, pp. 1632–1648, 2006.
  • [10] S. Geman and D. Geman, “Stochastic relaxation, gibbs distributions, and the bayesian restoration of images,” IEEE Transactions on pattern analysis and machine intelligence, no. 6, pp. 721–741, 1984.
  • [11] M. Klodt, T. Schoenemann, K. Kolev, M. Schikora, and D. Cremers, “An experimental comparison of discrete and continuous shape optimization methods,” in European Conference on Computer Vision.   Springer, 2008, pp. 332–345.
  • [12] G. Strang, “Maximal flow through a domain,” Mathematical Programming, vol. 26, no. 2, pp. 123–143, 1983.
  • [13] G. Wang, M. A. Zuluaga, W. Li, R. Pratt, P. A. Patel, M. Aertsen, T. Doel, A. L. Divid, J. Deprest, S. Ourselin et al., “Deepigeos: a deep interactive geodesic framework for medical image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
  • [14] Y. Boykov and M.-P. Jolly, “Interactive organ segmentation using graph cuts,” in International conference on medical image computing and computer-assisted intervention.   Springer, 2000, pp. 276–286.
  • [15] C. Rother, V. Kolmogorov, and A. Blake, “Grabcut: Interactive foreground extraction using iterated graph cuts,” in ACM transactions on graphics (TOG), vol. 23, no. 3.   ACM, 2004, pp. 309–314.
  • [16] J. Xu, A. G. Schwing, and R. Urtasun, “Tell me what you see and i will show you where it is,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 3190–3197.
  • [17] T. Rohlfing, R. Brandt, R. Menzel, D. B. Russakoff, and C. R. Maurer, “Quo vadis, atlas-based segmentation?” in Handbook of biomedical image analysis.   Springer, 2005, pp. 435–486.
  • [18] L. M. Koch, M. Rajchl, W. Bai, C. F. Baumgartner, T. Tong, J. Passerat-Palmbach, P. Aljabar, and D. Rueckert, “Multi-atlas segmentation using partially annotated data: Methods and annotation strategies,” IEEE transactions on pattern analysis and machine intelligence, vol. 40, no. 7, pp. 1683–1696, 2018.
  • [19] J. Balachandran, L. Lin, J. S. Anchell, C. A. Bridges, and P. Ganesh, “Defect genome of cubic perovskites for fuel cell applications,” The Journal of Physical Chemistry C, vol. 121, no. 48, pp. 26 637–26 647, 2017.
  • [20] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D: nonlinear phenomena, vol. 60, no. 1-4, pp. 259–268, 1992.
  • [21] J. Lie, M. Lysaker, and X.-C. Tai, “A binary level set model and some applications to mumford-shah image segmentation,” IEEE transactions on image processing, vol. 15, no. 5, pp. 1171–1181, 2006.
  • [22] ——, “A variant of the level set method and applications to image segmentation,” Mathematics of computation, vol. 75, no. 255, pp. 1155–1174, 2006.
  • [23] J. Yuan, E. Bae, X.-C. Tai, and Y. Boykov, “A continuous max-flow approach to potts model,” in European conference on computer vision.   Springer, 2010, pp. 379–392.
  • [24] W. Yang, L. Guo, T. Zhao, and G. Xiao, “Improving watersheds image segmentation method with graph theory,” in Industrial Electronics and Applications, 2007. ICIEA 2007. 2nd IEEE Conference on.   IEEE, 2007, pp. 2550–2553.
  • [25] T. Cour, F. Benezit, and J. Shi, “Spectral segmentation with multiscale graph decomposition,” in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 2.   IEEE, 2005, pp. 1124–1131.
  • [26] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “Contour detection and hierarchical image segmentation,” IEEE transactions on pattern analysis and machine intelligence, vol. 33, no. 5, pp. 898–916, 2011.
  • [27] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Transactions on pattern analysis and machine intelligence, vol. 24, no. 5, pp. 603–619, 2002.
  • [28] Y. Zhang, M. Brady, and S. Smith, “Segmentation of brain mr images through a hidden markov random field model and the expectation-maximization algorithm,” IEEE transactions on medical imaging, vol. 20, no. 1, pp. 45–57, 2001.
  • [29]

    B. Nadler and M. Galun, “Fundamental limitations of spectral clustering,” in

    Advances in neural information processing systems, 2007, pp. 1017–1024.
  • [30] H. Attar, M. Calin, L. Zhang, S. Scudino, and J. Eckert, “Manufacture by selective laser melting and mechanical behavior of commercially pure titanium,” Materials Science and Engineering: A, vol. 593, pp. 170–177, 2014.
  • [31] H. Gong, K. Rafi, H. Gu, T. Starr, and B. Stucker, “Analysis of defect generation in ti–6al–4v parts made using powder bed fusion additive manufacturing processes,” Additive Manufacturing, vol. 1, pp. 87–98, 2014.
  • [32] D. Mumford and J. Shah, “Optimal approximations by piecewise smooth functions and associated variational problems,” Communications on pure and applied mathematics, vol. 42, no. 5, pp. 577–685, 1989.
  • [33] Y. Boykov, O. Veksler, and R. Zabih, “Markov random fields with efficient approximations,” in Computer vision and pattern recognition, 1998. Proceedings. 1998 IEEE computer society conference on.   IEEE, 1998, pp. 648–655.
  • [34] E. Bae, J. Yuan, and X.-C. Tai, “Global minimization for continuous multiphase partitioning problems using a dual approach,” International journal of computer vision, vol. 92, no. 1, pp. 112–129, 2011.
  • [35] J. Lellmann, J. Kappes, J. Yuan, F. Becker, and C. Schnörr, “Convex multi-class image labeling by simplex-constrained total variation,” in International conference on scale space and variational methods in computer vision.   Springer, 2009, pp. 150–162.
  • [36] K. Wei, X.-C. Tai, T. F. Chan, and S. Leung, “Primal-dual method for continuous max-flow approaches,” in Computational Vision and Medical Image Processing V-Proceedings of 5th ECCOMAS Thematic Conference on Computational Vision and Medical Image Processing, VipIMAGE, vol. 2015, 2016, pp. 17–24.
  • [37] E. Giusti, “Minimal surfaces and functions of bounded variation,” Monogr. Math., vol. 80, 1984.
  • [38] S. Z. Li, Markov random field modeling in image analysis.   Springer Science & Business Media, 2009.
  • [39] J. Besag, “Spatial interaction and the statistical analysis of lattice systems,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 192–236, 1974.
  • [40] K. Blekas, A. Likas, N. P. Galatsanos, and I. E. Lagaris, “A spatially constrained mixture model for image segmentation,” IEEE transactions on Neural Networks, vol. 16, no. 2, pp. 494–498, 2005.
  • [41] F. Comets, “On consistency of a class of estimators for exponential families of markov random fields on the lattice,” The Annals of Statistics, pp. 455–468, 1992.
  • [42] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, Y. Burren, N. Porz, J. Slotboom, R. Wiest et al., “The multimodal brain tumor image segmentation benchmark (brats),” IEEE transactions on medical imaging, vol. 34, no. 10, p. 1993, 2015.
  • [43] C. Carson, S. Belongie, H. Greenspan, and J. Malik, “Blobworld: Image segmentation using expectation-maximization and its application to image querying,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 8, pp. 1026–1038, 2002.
  • [44] B. Menze, M. Reyes, J. kalpathy Cramer, K. Farahani, S. Bakas, and C. Davatzikos, “Multimodal brain tumor image segmentation benchmark: ”change detection”,” in 2016 Proceedings of MICCAI conference on, 2015.
  • [45]

    C. Gobert, E. W. Reutzel, J. Petrich, A. R. Nassar, and S. Phoha, “Application of supervised machine learning for defect detection during metallic powder bed fusion additive manufacturing using high resolution imaging.”

    Additive Manufacturing, vol. 21, pp. 517–528, 2018.
  • [46] S. Tammas-Williams, P. Withers, I. Todd, and P. Prangnell, “The influence of porosity on fatigue crack initiation in additively manufactured titanium components,” Scientific reports, vol. 7, no. 1, p. 7308, 2017.