## 1 Introduction

The phase field of Ambrosio and Tortorelli [1] (AT function) serving as a continuous indicator for the boundary/not-boundary state at every domain point has proven to be an indispensable tool in image and shape analysis. It is a minimizer of an energy composed of two competing terms: One term favors configurations that take values close to either or (separation into boundary/not-boundary phases) and the other term encourages local interaction in the domain by penalizing spatial inhomogeneity. A parameter controls the relative influence of these two terms, hence, the interaction. As this ”interaction” parameter tends to

, the separation term is strongly emphasized; consequently, the field tends to the characteristic function

of the boundary set and the AT energy tends (following the convergence framework [4]) to the boundary length.In computer vision, the AT function first appeared as a technical device to apply gradient descent to the Mumford-Shah functional

[15]. Over the years, it has been extended in numerous ways to address a rich variety of visual applications. Earlier works include Shah and colleagues [22, 23, 25, 19], March and Dozio [13], Proesman, Pauwels and van Gool [21], Teboul et al. [29]. During the last couple of years we have witnessed an increasing number of promising works modifying or extending Ambrosio-Tortorelli/Mumford-Shah based models. Some examples are: Bar, Sochen and Kiryati [3], Rumpf and colleagues [6, 20], Erdem, Sancar-Yilmaz and Tari [7], Patz and Preusser [17], Jung and Vese [9]. These works together with many others collaboratively established the role of AT function in variational formulations that jointly involve region and boundary terms.In the majority of the works, the AT function serves as an auxiliary variable to facilitate discontinuity-preserving smoothing and boundary detection. Relatedly, the interaction parameter is chosen sufficiently small to better localize boundaries. In contrast, Shah and Tari, starting with [27, 28] in late 90’s, have focused on the ability of the AT function in coding morphologic properties of shapes, regions construed by boundaries. Relatedly, they have weakened boundary/not-boundary separation either by choosing a large interaction parameter or by other means [2] and focused on the geometric properties of the level curves after constructing the AT function (reviewed in [24]) for shapes as:

(1) |

where is a bounded open set with a boundary (denoting a shape); is the shape indicator function which attains in and on ; is the parameter. The first term forces strong boundary/interior separation while the second one forces smoothness.

The AT function of shape is related to a variety of morphological concepts. For instance, it is a weighted distance transform [11, 12] with its level curves approximating curvature-dependent motion [16, 10]. Thus, it enables extraction of local symmetries and skeletons directly from grayscale images; that is, it bridges image segmentation and shape description. The ability of level curves in coding morphological information is also exploited by Droske and Rumpf [20] to measure equivalence of two shapes in a registration problem.

In this paper, following Shah and Tari [27, 28, 2], we explore and extend the ability of an AT-like field in coding features that are at a higher level than boundaries. Whereas the previous works focus on local symmetry axes, we focus on shape’s intuitive components as coded via upper and lower level sets. Our constructions are based on a new field obtained as the minimizer of a modified AT energy. We discuss the geometry of the level curves of the new minimizer and exploit it to extract a part hierarchy tree endowed with a probabilistic structure.

The considered modification involves an additive augmentation of the interaction term with a non-local term in a way that the upper and lower zero level sets of the minimizer yield disjoint domains [26] within which the minimizer is morphologically equivalent to the AT function. Following the pioneering work of Buades, Coll and Morel [5], UCLA group formulated interesting non-local variational formulations, including non-local versions of the Ambrosio-Tortorelli/ Shah approximations of the Mumford-Shah functional [9, 8] by replacing local image derivatives with non-local ones. This kind of modification is very different from our modification which modifies the phase field itself.

In this paper, we focus on shapes. Nevertheless, the long term goal of our work is to bridge low level processes such as segmentation and image registration with the high level process of shape abstraction. Integration of the presented developments to Mumford-Shah type models via coupled PDEs framework is a future work.

## 2 A Modified Energy and Its Minimizer

Let us consider

(2) |

where is the expectation of given by and is the distance transform. The new energy to be minimized is composed of three terms and obtained by modifying the AT energy in (1) in two aspects.

Firstly, the interaction term of (1) is additively augmented with . This new term forces the minimizer to acquire a low average value with the average being computed over the entire domain. At a first glance, this seems to favor spatial homogeneity by forcing the minimizer to attain values close to zero. Yet, the minimum of is also reached when oscillates, that is, when it attains both negative and positive values adding up to . In this respect, the third term is a separation term partitioning into subdomains of opposing signs. Due to the influence of the term which penalizes spatial inhomogeneities, locations of identical sign tend to form spatial groups. Obviously, the minimizer of subject to homogenous Dirichlet boundary condition is the flat function unless accompanied by an external inhomogeneity.

Indeed, the purpose of the second modification is to influence spatial grouping of positive and negative values of in a particular way that the sign change separates the gross structure from the boundary detail. In particular, the upper zero level set covers central regions whereas the lower zero level set covers peripheral regions containing limbs, protrusions and boundary texture or noise. Towards this end, the indicator is replaced by a weighted indicator that is a monotonically increasing function of the shortest distance to the boundary, namely, the distance transform. As before, the first term favors separation of the domain into phases; however, the phases are the level curves of the distance transform. Since, however, the level curves of the AT function in (1) are equivalent to the level curves of a smooth distance transform [28], this change merely scales without qualitatively affecting the geometry of its level curves. Nevertheless, when the terms considered together, the minimizer tends to have positive values at central locations and negative values at peripheral locations because the penalty incurred by assigning negative values to central locations with higher positive values is higher than the penalty incurred by assigning negative values to locations with lower values.

Similar to the AT function, the new minimizer is a compromise between inhomogeneity and homogeneity though the inhomogeneity is forced both externally (by ) and internally (by the third term); or it is the best approximation of an external inhomogeneity subject to internal constraints.

The parameter should be chosen large enough so that the attachment to the external inhomogeneity should not dominate over the tendency to interact. Indeed, in the absence of the third term, a good practice is to chose at least on the order of the maximum thickness for the diffusive effect of to influence the entire shape ([2]; Fig. 1 in [28]). The same argument also holds here since the effect of the third term is to partition into subdomains within which is morphologically similar to the AT function. Additionally, notice that the expression responsible for sign change, , has been already normalized by . As such, should be larger than .

In Fig. 1 (a), an illustration for a 1-D case is given: is plotted for four different values of ranging between and . Naturally, gets flatter as increases. (The flattening can be avoided by scaling either or .) Nevertheless, the locations of the extrema and the zero crossings remain the same unless is significantly smaller than . Similarly in 2-D, the geometry of the level curves is stable as long as is chosen suitably large. Illustrative level curves are depicted in Fig. 1 (b-c). Absolute values of separately normalized within regions of identical sign are used for convenience of color visualization. Zero level curves separate central and peripheral structures in the form of upper and lower zero level sets: and

. The peripheral structure includes all the detail: limbs, protrusions, and boundary texture or noise. In contrast, the central structure is a very coarse blob-like form; it can even be thought as an interval estimate of the center whereas the centroid is the point estimate.

(a) | (b) | (c) |

Most commonly, is a simply connected set. Of course, it may also be either disconnected or multiply connected. For instance, it is disconnected for a dumbbell-like shape (two blobs of comparable radii combined through a thin neck) whereas it is multiply connected for an annulus formed by two concentric circles. Indeed, the annulus gets split into three concentric rings where the middle ring is the . For quite a many shapes, however, is a simply connected set.

Firstly, shapes obtained by protruding a blob as well as shapes whose peripheral parts are smaller or thinner than their main parts always have a simply connected . This is expected: When the width of a part is small, the highest value of inside the part is small. That is, the local contribution to incurring due to negative values is less significant for such a part as compared to locations with higher positive values of . Consequently, tends to attain negative values on narrow or small parts as well as on protrusions. Shapes with holes also have a simply connected as long as the holes are far from the center.

Secondly, even a dumbbell-like shape may have a simply connected . This happens if the join area, namely, the neck is wide enough. Nevertheless, this does not cause any representational instability: Whereas the for a blob-like shape has a unique maximum located roughly at its centroid, the for a dumbbell-like shape has two local maxima indicating two bodies. Each body is captured by a connected component of an upper level set whose bounding curve passes through a saddle point. At a saddle point , the curve has a double point singularity, i.e. it forms a cross. As such, the upper level set yields two disjoint connected components capturing the two parts of the central structure.

In contrast to , the peripheral structure is often multiply connected. Indeed, its hole(s) are carved by . It is also possible that is disconnected. For instance, for an annulus, it is two concentric rings. Additionally, may be disconnected when there are several elongated limbs organized around a rather small central body, e.g., a palm tree. , being small, is tolerated to grow and reach to the most concave parts of the shape boundary creating a split of by the zero-level curve. Similar to those in , the level curves in that are passing through saddle points provide further partitioning. The partitions are in the form of lower level sets .

To sum up, within both and , nested open sets (upper level sets inside and lower level sets inside ) characterize the domain. The level curves bounding the level sets are either closed curves or closed curves with crossing points. The ones with crossing points are of particular interest because the respective level set is partitioned at those points into two distinct connected components. A crossing of a level curve occurs at a saddle point of . Of course, each lower level set may contain other saddle points. Consequently, the partitioning is binary and iterative and determined by the order of saddle points.

It is not generically possible that a level curve has singular points of higher order because such singular points are unstable and may be removed by a slight change in . It is also highly unlikely that a connected component of an curve has two distinct crossing points. This issue is tackled in §3 via randomization.

## 3 Randomized Hierarchy Tree

Since the partitioning inside both the and of a shape are iterative and binary, the parts can be organized starting from the second level in the form of a proper binary tree. Let the shape be the root node and its children be the upper and lower zero level sets, namely, the disjoint regions of the central and peripheral structures. Suppose the central and peripheral structures are respectively composed of and disjoint sets. Let us enumerate the nodes holding these sets as for the and as for the . This is the second level of the tree and the first level of the partitioning. Of course, the root may have more than two children. Nevertheless, starting from the children of the root, each subtree is a proper binary tree because all the splits inside an or occur at saddle points; that is, each connected component of the second level and its children either get split into two level sets or remain as they are. We call this hierarchical organization as the Initial Part Tree. A hypothetical initial part tree is illustrated in Fig. 2 (a). In a real example, the nodes hold application dependently selected attributes of the respective level sets.

Binary splits according to saddle points produce collections of parts which are at the leaf level consistent across visual changes. However, the hierarchical order and granularity of parts are not necessarily consistent. For instance, a weak saddle is easily removed when the shape is slightly smoothed. Likewise, certain non-generic configurations such as level curves with spatially distinct saddle point singularities or triple point singularities cannot occur; indeed, such configurations are easily replaced by one of the corresponding generic configurations which may differ for similar shapes. Furthermore, when a shape is occluded by another shape, added peripheral parts change the positions of some of the previous level sets in the hierarchy. Nevertheless, the relative values of at saddle points prompting consecutive splits are stable indicators of the organizational hierarchy. Of course, attempting to convert a saddle point value to a tree depth by discretization brings back the previous robustness issue.

Instead, we use the difference between the values of two successive saddle points as a measure of saliency for the partitioning prompted by the latter saddle point. Converting the saliency measure to a probability measure and considering probability measures for all nodes, we endow the initial part tree with a random structure from which possible re-organizations of the initial hierarchy tree are to be sampled. We call the new structure as the

Randomized Part Hierarchy Tree. Below, we give the details of the randomization procedure. In contrast to the respective initial part tree, a random sample from a randomized part hierarchy tree is not necessarily a proper binary tree.The randomization starts from level nodes and propagates through their children. Recall that this is the first level of nodes that are created via saddle points. For each pair of siblings, there are two possible events: The pair of siblings either maintain their depth (no change in the local tree structure) or inherit the depth of their parent (change in the local tree structure). In the latter case, the node and its sibling replace their parent and become the children of their grandparent. The probabilities of the two events are derived from a quantity which we denote by . It is a property of a split meaning that the values of a pair of siblings are identical. Specifically, it is the difference between the saddle point values of a node and its parent divided by the saddle point value of the node. Because the magnitude of the saddle point value of a node is always greater than that of its parent, ; the equality is attained at level .

A small value of implies that the consecutive saddle points are closer in value; that is, a slight change in their value changes their order hence the local tree structure. Equivalently a large value of implies that the consecutive saddle points are well separated, therefore, the local structure is stable. We require that the probability that a local structure change is necessary approaches as approaches to the smallest possible value which is . Equivalently, should approach as approaches to its largest possible value. The function is a good candidate for estimating ; there is less than chance for reorganization since for the largest possible .

(a) (b) (c) (d) |

Let us consider the initial part tree in Fig. 2 (a). Assume that for nodes and . With probability , the local structure is preserved, whereas with probability nodes and replace their parent and become children of their grandparent, the root. Assume that for and . Then with probability , the local structure is preserved, while with probability nodes and replace their parent and become children of their grandparent which is either node with or the root with . Thus, there are four possible organizations: With probability , the entire structure is preserved. With probability , the tree is re-organized as in (b). With probability , the tree is re-organized as in (c). With probability , the tree is re-organized as in (d).

## 4 Experimental Results and Discussion

To evaluate the effectiveness of endowing the part hierarchy tree with a probabilistic structure, we consider a pairwise matching problem. It is formulated as finding a maximal clique in the joint association graph of the pair of trees to be matched, e.g. [18]. At each experiment, each of the two randomized part hierarchy trees is independently sampled several times and then all the sample pairs are matched. The trees in the sample pair that yields the highest matching score are called as the winning re-organizations.

Depending on the application, various properties related to the level sets stored at nodes can be used as node attributes. For instance, we extract parts enclosing each of the stored level sets and then use their area and the maximum values inside them as attributes. We remark that the maximum value of is related to the part width for the finest parts. Because boundaries of stored level sets pass through saddle points, an enclosing part is easily obtained as a morphologic watershed zone, whose seed is the respective level set [14]. Both to keep illustrations simple and resource requirements low, we require that each part to neighbor the central structure and each seed and part to have a certain size. Specifically, splits are performed only through the saddle points that reside on the boundaries of the watershed zones that are touching to the closure of the central structure and any split producing a seed that is less than of the shape or a part that is less than of the shape is ignored.

We present four illustrative examples. In each case, correct associations are found despite several order and granularity inconsistencies resulting from occluders, non-generic splits and weak saddles.

The first example is a matching between a human silhouette and its occluded version. The winning re-organizations for each of the two trees are shown in Fig. 3

. At each node, the watershed (enclosing part) is depicted as dark gray and the respective level set (seed part) depicted as black is superimposed on the part; the neighboring part (the light gray) is also shown even though it is not used for the matching. Due to page limits, we cannot provide the initial trees and probability distributions, but the numbering of the nodes already reveals the structure of the initial binary tree.

Firstly, notice that the arms on the left and on the right reside at different levels in both of the initial trees as revealed by their respective five versus four digit node numbers. Ideally, the almost symmetric upper bodies (nodes ) should contain two distinct saddle points and such that ; that is, two distinct saddle points on a single curve should simultaneously yield the three nodes: (arms) and (head). However, certain configurations including this one are not generic; even the slightest perturbation imposes a strict order on the saddle points. Thus, firstly, the combination of the head and either one of the arms is separated from the other arm then the head-arm combination is partitioned. Nevertheless, in each case, the saddle point value separating the head and arm on the left combination from the arm on the right is very close to the saddle point value separating the head from the arm on the left; e.g., for the first shape, the respective saddle point values after normalization with respect to the global maximum of are and while the saddle point value separating the entire upper body from the entire lower body is . Clearly, the hierarchical order between the upper body and its children is more stronger than the hierarchical order among its children. We remark that even though the arm re-organization is not necessary for finding correct part correspondences since the structures of the upper bodies are already the same for the two initial trees, the left and right arms are brought to the same level. This is because the probabilities of retaining the initial binary local structures is very low due to the closeness of the consecutive saddle point values.

Secondly, notice that the legs of the occluded figure are at the sixth level whereas the legs of the un-occluded one are at the fourth level, as revealed by their node numbers. This is due to the influence of two additional parts (watershed regions) belonging to the occluder and poses a challenge for the tree matching. Nevertheless, the legs are brought to the same level as well as all of the corresponding parts and correct associations are found: (central regions), (upper bodies), (arms on the left), (heads), (arms on the right), (lower bodies) (legs on the left) (legs on the right).

In the next three examples, due to limited space, only the matchings where at least one member of the matching pair is a leaf are depicted even though entire hierarchical structures are matched. Non-leaf nodes are circled. These examples also demonstrate the necessity of not restricting the correspondence search to leaf nodes. The matching between a cat and a horse in Fig. 4 illustrates a granularity inconsistency. Due to a weak saddle marked by the arrow in the left, the front legs of the horse are not further partitioned. Nevertheless, this inconsistency is resolved by matching the respective leaf node of the horse tree to a non-leaf node of the cat tree, the parent of the two nodes each holding a front leg of the cat.

Fig. 5 (a) depicts the matching of the same horse to another horse. In addition to the previous granularity inconsistency, there are several order inconsistencies which are not noticeable at the leaf level presentation. For instance, the rear body of the first horse firstly splits into the fourth leg and tail versus the third leg, and then the fourth leg is separated from the tail. On the other hand, the rear body of the second horse after a spurious division gets splits into the rear legs versus the tail, and then the two legs are separated. Consequently, a two level difference between the third leg of the first horse (node ) and the third leg of the second horse (node ) is formed. Despite both granularity and order inconsistencies, all of the parts are correctly matched.

(a) | (b) |

The final example (Fig. 5 (b)) involves several difficulties due to three weak saddles resulting with three unintuitive partitions for the second cat: Firstly, its head is fragmented; secondly, its rear body goes through a spurious division causing an erroneous shift in the levels of its sub-parts; thirdly, its fourth leg and tail are not separated. Nevertheless, the selected clique contains all of the correct associations. The rear body and its parts for the second cat are properly lifted one level up; consequently, the correct associations of the parts of the rear bodies are found successfully. The head of the first cat matches to the parent of the two leaves holding two unintuitive parts of the head of the second cat. The two head fragments of the second cat as well as the fourth leg and the tail of the first cat are correctly excluded from the selected clique as there are no corresponding parts in the other tree.

### Acknowledgements

ST is supported by the Alexander von Humboldt Foundation. She extends her gratitude to the members of Sci. Comp. Dept. of Tech. Universität München, in particular to Folkmar Bornemann, for providing a wonderful sabbatical stay in every respect. MG is funded as MS student via Tübitak project 108E015 to ST.

## References

- [1] L. Ambrosio and V. Tortorelli. On the approximation of functionals depending on jumps by elliptic functionals via -convergence. Commun. Pure Appl. Math., 43(8):999–1036, 1990.
- [2] C. Aslan and S. Tari. An axis-based representation for recognition. In ICCV, pages 1339–1346, 2005.
- [3] L. Bar, N. Sochen, and N. Kiryati. Image deblurring in the presence of impulsive noise. Int. J. Comput. Vision, 70(3):279–298, 2006.
- [4] A. Braides. Approximation of Free-discontinuity Problems. Lecture Notes in Mathematics, Vol. 1694. Springer-Verlag, 1998.
- [5] A. Buades, B. Coll, and J. M. Morel. A non-local algorithm for image denoising. In CVPR, pages 60–65, 2005.
- [6] M. Droske and M. Rumpf. Multi scale joint segmentation and registration of image morphology. IEEE T-PAMI, 29(12):2181–2194, 2007.
- [7] E. Erdem, A. Sancar-Yilmaz, and S. Tari. Mumford-Shah regularizer with spatial coherence. In SSVM, pages 545–555. Springer-Verlag, 2007.
- [8] M. Jung, X. Bresson, T. Chan, and L. Vese. Color image restoration using nonlocal Mumford-Shah regularizers. In EMMCVPR, pages 373–387. Springer-Verlag, 2009.
- [9] M. Jung and L. Vese. Nonlocal variational image deblurring models in the presence of gaussian or impulse noise. In SSVM, pages 401–412. Springer-Verlag, 2009.
- [10] B. Kimia, A. Tannenbaum, and S. Zucker. Shapes, shocks, and deformations I: The components of two-dimensional shape and the reaction-diffusion space. Int. J. Comput. Vision, 15(3):189–224, 1995.
- [11] R. Kimmel, N. Kiryati, and A. Bruckstein. Sub-pixel distance maps and weighted distance transforms. J. of Math. Imag. and Vis., 6(2-3):223–233, 1996.
- [12] P. Maragos and M. A. Butt. Curve evolution, differential morphology and distance transforms as applied to multiscale and eikonal problems. Fundamentae Informatica, 41:91–129, 2000.
- [13] R. March and M. Dozio. A variational method for the recovery of smooth boundaries. Image and Vision Computing, 15(9):705 – 712, 1997.
- [14] F. Meyer. Topographic distance and watershed lines. Signal Processing, 38:113–125, 1994.
- [15] D. Mumford and J. Shah. Optimal approximations by piecewise smooth functions and associated variational problems. Com. Pure App. Math., 42:577–685, 1989.
- [16] S. Osher and J. Sethian. Fronts propagating with curvat. dependent speed: algs. based on Hamilton-Jacobi formulations. J. Comp. Phys., 79:12–49, 1988.
- [17] T. Patz and T. Preusser. Ambrosio-Tortorelli segmentation of stochastic images. In ECCV, pages 254–267. Springer-Verlag, 2010.
- [18] M. Pelillo, K. Siddiqi, and S. W. Zucker. Matching hierarchical structures using association graphs. IEEE Trans. Pattern Anal. Mach. Intell., 21:1105–1120, 1999.
- [19] H. H. Pien, M. Desai, and J. Shah. Segmentation of mr images using curve evolution and prior information. IJPRAI, 11(8):1233–1245, 1997.
- [20] T. Preußer, M. Droske, C. Garbe, M. Rumpf, and A. Telea. A phase field method for joint denoising, edge detection and motion estimation. SIAM Journal on Applied Mathematics, 68(3):599–618, 2007.
- [21] M. Proesman, E. Pauwels, and L. van Gool. Coupled geometry-driven diffusion equations for low-level vision. In B. Romeny, editor, Geometry Driven Diffusion in Computer Vision, Lecture Notes in Computer Science. Kluwer, 1994.
- [22] J. Shah. Segmentation by nonlinear diffusion. In CVPR, pages 202–207, 1991.
- [23] J. Shah. A common framework for curve evolution, segmentation and anisotropic diffusion. In CVPR, pages 136–142, 1996.
- [24] J. Shah. Skeletons and segmentation of shapes. Technical report, Northeastern University, 2005. http://www.math.neu.edu/ shah/publications.html.
- [25] J. Shah, H. H. Pien, and J. Gauch. Recovery of shapes of surfaces with discontinuities by fusion of shading and range data within a variational framework. IEEE Trans. on Image Processing, 5(8):1243–1251, 1996.
- [26] S. Tari. Hierarchical shape decomposition via level sets. In ISMM, pages 215–225. Springer-Verlag, 2009.
- [27] S. Tari, J. Shah, and H. Pien. A computationally efficient shape analysis via level sets. In MMBIA, pages 234–243, 1996.
- [28] S. Tari, J. Shah, and H. Pien. Extraction of shape skeletons from grayscale images. CVIU, 66(2):133–146, 1997.
- [29] S. Teboul, L. Blanc-F raud, G. Aubert, and M. Barlaud. Variational approach for edge preserving regularization using coupled PDE’s. IEEE Trans. Imag. Pr., 7:387–397, 1998.

Comments

There are no comments yet.