We study a general class of complex non-linear segmentation energies with high-order regional terms. Such energies are often desirable in the computer vision tasks of image segmentation, co-segmentation and stereo [15, 12, 22, 3, 2, 16, 14, 13] and are particularly useful when there is a prior knowledge about the appearance or the shape of an object being segmented.
We focus on energies of the following form:
where is a binary segmentation, is a standard length-based smoothness term, and is a (generally) non-linear regional functional discussed below.
Let be an image defined in . The most basic type of regional terms used in segmentation is a linear functional , which can be represented via an arbitrary scalar function
Usually, corresponds to an appearance model based on image intensities/colors , e.g., could be a log-likelihood ratio for intensity given particular object and background intensity distributions. The integral in (2) can be seen as a dot product between scalar function and
, which denotes the characteristic function of set. We use notation to refer to such linear functionals.
More general non-linear regional functional can be described as follows. Assume scalar functions , each defining a linear functional of type (2), and one differentiable non-linear function that combines them,
Such general regional terms could enforce non-linear constraints on volume or higher-order shape moments of segment. They could also penalize or other distance metrics between the distribution (or non-normalized bin counts) of intensities/colors inside segment and some given target. For example, could be softly constrained to specific volume via quadratic functional
using and , while the Kullback-Leibler (KL) divergence between the intensity distribution in and a fixed target distribution could be written as
Here scalar function is an indicator for pixels with intensity (in bin ). More examples of non-linear regional terms are discussed in Section 3.
In general, optimization of non-linear regional terms is NP-hard and cannot be addressed by standard global optimization methods. Some earlier papers developed specialized techniques for particular forms of non-linear regional functionals. For example, the algorithm in  was developed for minimizing the Bhattacharyya distance between distributions and a dual-decomposition approach in  applies to convex piece-wise linear regional functionals (3). These methods are outside the scope of our paper since we focus on more general techniques. Combinatorial techniques [15, 22] apply to general non-linear regional functionals, and they can make large moves by globally minimizing some approximating energy. However, as pointed out in , these methods are known to converge to solutions that are not even a local minimum of energy (1). Level sets are a well established in the litterature as a gradient descent framework that can address arbitrary differentiable functionals , and therefore, are widely used for high-order terms [1, 11, 4, 18, 12].
This paper compares two known optimization methods applicable to energy (1) with any high-order regional term (3): trust region and level sets. To address non-linear term , both methods use its first-order functional derivative
either when computing gradient flow for (1) in level sets or when approximating energy (1) in trust region. However, despite using the same first derivative , the level sets and trust region are fairly different approaches to optimize (1).
The structure of the paper is as follows. Sections 2.1-2.2 review the two general approaches that we compare: gradient descent implemented with level sets and trust region implemented with graph cuts. While the general trust region framework  can be based on a number of underlying global optimization techniques, we specifically choose a graph cut implementation (versus continuous convex relaxation approach), since it is more appropriate for our CPU-based evaluations. Our experimental results and comparisons are reported in Section 3 and Section 4 presents the conclusions.
2 Overview of Algorithms
2.1 Standard Level Sets
In the level set framework, minimization of energy
is carried out by computing a partial differential equation (PDE), which governs the evolution of the boundary of. To this end, we derive the Euler-Lagrange equation by embedding segment in a one-parameter family , , and solving a PDE of the general form:
where is an artificial time step parameterizing the descent direction. The basic idea is to describe segment implicitly via an embedding function :
Here, and denote the Dirac function and Heaviside function, respectively. Therefore, the evolution equation in (5) can be computed directly by applying the Euler-Lagrange descent equation with respect to . This gives the following gradient flow:
with denoting the curvature of ’s level lines. The first term in (8) is a regional flow minimizing , and the second is a standard curvature flow minimizing the length of the segment’s boundary.
In standard level set implementations, it is numerically mandatory to keep the evolving close to a distance function [17, 20]. This can be done by re-initialization procedures , which were intensively used in classical level set methods . Such procedures, however, rely on several ad hoc choices and may result in undesirable side effects . In our implementation, we use an efficient and well-known alternative , which adds an internal energy term that penalizes the deviation of from a distance function:
In comparison to re-initialization procedures, the implementation in  allows larger time steps (and therefore faster curve evolution). Furthermore, it can be implemented via simple finite difference schemes, unlike traditional level set implementations which require complex upwind schemes . With the distance-function penalty, the gradient flow in (8) becomes:
where is the discrete time step and is the iteration number. is a numerical approximation of the right-hand side of (10), where the spatial derivatives of are approximated with central differences and the temporal derivative with forward differences. The Dirac function is approximated by for and elsewhere. We use and .
In the context of level set and PDE methods, it is known that the choice of time steps should follow strict numerical conditions to ensure stability of front propagation, e.g., the standard Courant-Friedrichs-Lewy (CFL) conditions . These conditions require that should be smaller than a certain value that depends on the choice of discretization. The level set literature generally uses fixed time steps. For instance, classical upwind schemes  generally require a small for stability, whereas the scheme in  allows relatively larger time steps. The optimum time step is not known a priori and finding a good via an adaptive search such as back-tracking  seems attractive. However, to apply a back-tracking scheme, we would have to evaluate the energy at each step. In the case of level sets, this requires a discrete approximation of the original continuous energy. We observed in our experiments that the gradient of such discrete approximation of the energy does not coincide with the gradient obtained in the numerical updates in (11). Therefore, with a back-tracking scheme, level sets get stuck very quickly in a local minimum of the discrete approximation of the energy (See the adaptive level set example in Fig. 2). We believe that this is the main reason why, to the best of our knowledge, back-tracking approaches are generally avoided in the level-set literature. Therefore, in the following level-set experiments, we use a standard scheme based on fixed time step during each curve evolution, and report the performance at convergence for several values .
2.2 Trust Region Framework
Trust region methods are a class of iterative optimization algorithms. In each iteration, an approximate model of the optimization problem is constructed near the current solution. The model is only “trusted” within a small region around the current solution called “trust region”, since in general, approximations fit the original non-linear function only locally. The approximate model is then globally optimized within the trust region to obtain a candidate iterate solution. This step is often called trust region sub-problem. The size of the trust region is adjusted in each iteration based on the quality of the current approximation. Variants of trust region approach differ in the kind of approximate model used, optimizer for the trust region sub-problem step and a protocol to adjust the next trust region size. For a detailed review of trust region methods see .
Below we outline a general version of a trust region algorithm in the context of image segmentation. The goal is to minimize in Eq. (1). Given solution and distance , the energy is approximated using
where is the first order Taylor approximation of the non-linear term near . The trust region sub-problem is then solved by minimizing within the region given by . Namely,
Once a candidate solution is obtained, the quality of the approximation is measured using the ratio between the actual and predicted reduction in energy. The trust region is then adjusted accordingly.
For the purpose of our CPU-based evaluations we specifically selected the Fast Trust Region (FTR) implementation  which includes the following components for the trust region framework. The non-linear term is approximated by the first order Taylor approximation in (12) using first-order functional derivative (4). The trust region sub-problem in (13) is formulated as unconstrained Lagrangian optimization, which is globally optimized using one graph-cut (we use a floating point precision in the standard code for graph-cuts ). Note that, in this case, the length term is approximated using Cauchy-Crofton formula as in . More details about FTR can be found in .
2.3 Conceptual Comparison
Some high-level conceptual differences between the level sets and trust region optimization frameworks are summarized in Figure 1. Standard level sets methods use fixed to make steps in the gradient descent direction. Trust region algorithm  moves to solution minimizing approximating functional within a circle of given size . The trust region size is adaptively changed from iteration to iteration based on the observed approximation quality. The blue line illustrates the spectrum of trust region moves for all values of . Solution is the global minimum of approximation . For example, if is a 2nd-order Taylor approximation of at point then would correspond to a Newton’s step.
3 Experimental Comparison
In this section, we compare trust region and level sets frameworks in terms of practical efficiency, robustness and optimality. We selected several examples of segmentation energies with non-linear regional constraints. These include: 1) quadratic volume constraint, 2) shape prior in the form of distance between the target and the observed shape moments and 3) appearance prior in the form of either
distance, Kullback-Leibler divergence or Bhattacharyya distance between the target and the observed color distributions.
In all the experiments below, we optimize energy of a general form . To compare optimization quality, we will plot energy values for the results of both level sets and trust region. Note that the Fast Trust Region (FTR) implementation uses a discrete formulation based on graph-cuts and level sets are a continuous framework. Thus, the direct comparison of their corresponding energy values should be done carefully. While numerical evaluation of the regional term is equivalent in both methods, they use completely different numerical approaches to measuring length . In particular, level sets use the approximation of length given in (2.1), while the graph cut variant of trust region relies on integral geometry and Cauchy-Crofton formula popularized by .
Since the energies are not comparable by its actual number, we study instead the robustness of each method independently and compare the resulting segmentation with one another. Note that level sets have much smaller oscillation for small time steps, which supports the theory of the CFL-conditions.
In each application below we examine the robustness of both trust region and level sets methods by varying the running parameters. In the trust region method we vary the multiplier used to change the size of the trust region from one iteration to another. For the rest of the parameters we follow the recommendations of . In our implementation of level sets we vary the time-step size , but keep the parameters and fixed for all the experiments.
The top-left plots in figures 2-6 report energy as a function of the CPU time. At the end of each iteration, both level sets and trust region require energy updates, which could be computationally expensive. For example, appearance-based regional functionals require re-evaluation of color histograms/distributions at each iteration. This is a time consuming step. Therefore, for completeness of our comparison, we report in the top-middle plots of each figure energy versus the number of energy evaluations (number of updates) required during the optimization.
3.1 Volume Constraint
First, we perform image segmentation with a volume constraint with respect to a target volume , namely,
We choose to optimize this energy on a synthetic image without appearance term since the solution to this problem is known to be a circle. Figure 2 shows that both FTR and level sets converge to good solutions (nearly circle), with FTR being 25 times faster, requiring 150 times less energy updates and exhibiting more robustness to the parameters.
3.2 Shape Prior with Geometric Shape Moments
Next, we perform image segmentation with a shape prior constraint in the form of distance between the geometric shape moments of the segment and a target. Our energy is defined as , where is a standard log-likelihood unary term based on intensity histograms. In this case, is given by
with denoting the target geometric moment of order . Figure 3 shows an example of liver segmentation with the above shape prior constraint. The target shape moments as well as the foreground and background appearance models are computed from the user provided input ellipse as in [16, 13]. We used moments of up to order (including the center of mass and shape covariance but excluding the volume). Both trust region and level sets obtain visually pleasing solutions. The trust region method is two orders of magnitude faster and requires two orders of magnitude less energy updates (top-left and top-middle plots). Since the level sets method was forced to stop after 10000 iterations, we show the last solution available for each value of parameter . The actual convergence for this method would have taken more iterations. In this example, the oscillations of the energy are especially pronounced (top-right plot).
3.3 Appearance Prior
In the experiments below, we apply both methods to optimize segmentation energies where the goal is to match a given target appearance distribution using either the distance between the observed and target color bin counts, or the Kullback-Leibler divergence and Bhattacharyya distance between the observed and target color distributions. Here, our energy is defined as . We assume is an indicator function of pixels belonging to bin and
is the target count (or probability) for bin. The target appearance distributions for the object and the background were obtained from the ground truth segments. We used 100 bins per color channel. The images in the experiments below are taken from .
distance constraint on bin counts:
Figure 4 shows results of segmentation with distance constraint between the observed and target bin counts regularized by length. The regional term in this case is
Since the level sets method was forced to stop after 15000 iterations for values of , we show the last solution available. Full convergence would have taken more iterations. For higher values of , we show results at convergence. We observe two orders of magnitude difference between the trust region and level sets method in terms of the speed and the number of energy updates required.
Figure 5 shows results of segmentation with KL divergence constraint between the observed and target color distributions. The regional term in this case is given by
The level sets method converged for times steps and , but was forced to stop for other values of the parameter after iterations. We show the last solution available. Full convergence would have required more iterations. The trust region method obtains solutions that are closer to the ground truth, runs two order of magnitude faster and requires less energy updates.
Figure 6 shows results of segmentation with Bhattacharyya distance constraint between the observed and target color distributions. The regional term in this case is given by
Also for this image, since the level sets method had not (yet) converged after 10000 iterations for any set of the parameters, we show the last solution available. Further increasing parameter would increase the oscillations of the energy (see top-right plot).
For relatively simple functionals (3), combining a few linear terms ( is small), such as constraints on volume and low-order shape moments, the quality of the results obtained by both methods is comparable (visually and energy-wise). However, we observe that the number of energy updates required for level sets is two orders of magnitude larger than for trust region. This behavior is consistent with the corresponding CPU running time plots. The segmentation results on shape moments were fairly robust with respect to parameters (time step and multiplier ) for both methods. The level sets results for volume constraints varied with the choice of . In general, larger steps caused significant oscillations of the energy in level sets thereby affecting the quality of the result at convergence.
When optimizing appearance-based regional functionals with large number of histogram bins (corresponding to large ), level sets proved to be extremely slow. Convergence would require more than iterations (longer than 1 hour on our machine). In some cases, the corresponding results were far from optimal both visually and energy-wise. This is in contrast to the results by trust region approach, which consistently converged to plausible solutions with low energy in less than a minute or iterations.
We believe that our results will be useful for many practitioners in computer vision and medical imaging when selecting an optimization technique.
-  A. Adam, R. Kimmel, and E. Rivlin. On scene segmentation and histograms-based curve evolution. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 31:1708–1714, 2009.
I. Ben Ayed, H. Chen, K. Punithakumar, I. Ross, and S. Li.
Graph cut segmentation with a global constraint: Recovering region
distribution via a bound of the Bhattacharyya measure.
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 3288 – 3295, 2010.
-  I. Ben Ayed, S. Li, A. Islam, G. Garvin, , and R. Chhem. Area prior constrained level set evolution for medical image segmentation. In Proc. SPIE, Medical Imaging 2008: Image Processing, 2008.
-  I. Ben Ayed, S. Li, and I. Ross. A statistical overlap prior for variational image segmentation. International Journal of Computer Vision, 85(1):115–132, 2009.
-  S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
-  Y. Boykov and V. Kolmogorov. Computing Geodesics and Minimal Surfaces via Graph Cuts. In IEEE Int. Conf. on Computer Vision (ICCV), 2003.
-  Y. Boykov and V. Kolmogorov. An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 29(9):1124–1137, 2004.
-  V. Caselles, R. Kimmel, and G. Sapiro. Geodesic active contours. International Journal of Computer Vision, 22(1):61–79, 1997.
-  T. Chan and L. Vese. Active contours without edges. IEEE Transactions on Image Processing (TIP), 10(2):266–277, 2001.
-  V. Estellers, D. Zosso, R. Lai, S. Osher, J. P. Thiran, and X. Bresson. Efficient algorithm for level set method preserving distance function. IEEE Transactions on Image Processing (TIP), 21(12):4722–34, 2012.
-  A. Foulonneau, P. Charbonnier, and F. Heitz. Multi-reference shape priors for active contours. International Journal of Computer Vision, 81(1):68–81, 2009.
-  D. Freedman and T. Zhang. Active contours for tracking distributions. IEEE Transactions on Image Processing (TIP), 13:518 – 526, April 2004.
-  L. Gorelick, F. Schmidt, and Y. Boykov. Fast Trust Region for Segmentation. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), June 2013.
-  L. Gorelick, R. Schmidt, Y. Boykov, A. Delong, and A. Ward. Segmentation with Non-Linear Regional Constraint via Line-Search cuts. In European Conf. on Computer Vision (ECCV), 2012.
-  J. Kim, V. Kolmogorov, and R. Zabih. Visual correspondence using energy minimization and mutual information. In IEEE Int. Conf. on Computer Vision (ICCV), pages 1033 – 1040, 2003.
-  M. Klodt and D. Cremers. A convex framework for image segmentation with moment constraints. In IEEE Int. Conf. on Computer Vision (ICCV), 2011.
-  C. Li, C. Xu, C. Gui, and M. D. Fox. Level set evolution without re-initialization: A new variational formulation. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 430–436, 2005.
-  O. V. Michailovich, Y. Rathi, and A. Tannenbaum. Image segmentation using active contours driven by the bhattacharyya gradient flow. IEEE Transactions on Image Processing (TIP), 16(11):2787–2801, 2007.
-  A. Mitiche and I. Ben Ayed. Variational and Level Set Methods in Image Segmentation. Springer, 2011.
-  S. Osher and R. Fedkiw. Level Set Methods and Dynamic Implicit Surfaces. Springer-Verlag, 2002.
-  C. Rother, V. Kolmogorov, and A. Blake. GrabCut: Interactive Foreground Extraction using Iterated Graph Cuts. In ACM SIGGRAPH, 2004.
-  C. Rother, V. Kolmogorov, T. Minka, and A. Blake. Cosegmentation of Image Pairs by Histogram Matching - Incorporating a Global Constraint into MRFs. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 993 – 1000, 2006.
-  J. A. Sethian. Level set Methods and Fast Marching Methods. Cambridge University Press, 1999.
-  O. J. Woodford, C. Rother, and V. Kolmogorov. A global perspective on MAP inference for low-level vision. In IEEE Int. Conf. on Computer Vision (ICCV), pages 2319 – 2326, 2009.
-  Y. Yuan. A review of trust region algorithms for optimization. In International Congress on Industrial and Applied Mathematics, 1999.