Shape analysis arises naturally in computer vision applications where geometric information plays an essential role. The shape of an object is a useful tool in fields such as: non-destructive reconstruction of archaeology and cultural heritage(Tal2014; Maaten2006); object classification and retrieval from large collections of images (Li2018; Safar2003); human action and pose recognition for gaming and entertainment (Chaudhry2013; Li2019ActionalStructuralGC); environment sensing in robot navigation and planning (Peters2016; Li2017abc); and industry for automatic visual quality inspection of product defects (6005551; 7071593).
We visually perceive shape as the collections of all the features that constitute an object. However, to perform computer-based shape analysis, one must rely on an accurate discrete mathematical representation of an object’s shape. This representation should exhibit the same geometric and topological properties inherent to the shape itself. Accordingly, we can think of shape representation as a way to store the shape’s information in a different format, which benefits speed, compactness, and efficiency.
Many authors have proposed a variety of shape representations such as voxel/pixel grids, point clouds, triangular meshes, medial axes, or signed distance functions Siddiqi1999; Toshev2012; Marie2016; Black2016; Zhang2004ab. These representations differ greatly in their formulation, and aim to provide a method for extracting descriptive features from objects, while also preserving their appearance and geometric properties Belongie2002; Abbasi1999; Sun2009; Esteves2018a; Maturana2015. However, these methods also have disadvantages that limit their application. For example, medial axis representations are highly sensitive to contour noise; voxel/pixel grids are inaccurate after isometric transformations; signed distance functions and triangular meshes are memory-consuming representations when high-frequency details of the shape want to be stored.
We focus this study on the medial axis, also called the topological skeleton. The medial axis represents shapes as a collection of one-dimensional curves that define the central axis (or backbone) of an object. It provides dimensionality reduction of the amount of data needed to represent an entire shape while preserving its topological structure. Moreover, the medial axis is a rotation equivariant shape representation because the medial axis of a rotated object should ideally be the rotated medial axis of the same object. The medial axis is also robust to small deformation, such as articulation, because of its graph-like structure. For instance, a human-like shape moving only its arm will not affect all of the points in the medial axis, only the connections between the arm’s nodes.
Despite its advantages, the medial representation is extremely sensitive to noise on the object’s contour Saha2016; Saha2017. Even small amounts of noise can cause erroneous sections of the skeleton called spurious branches. Consequently, many medial axis extraction algorithms are equipped with a mechanism to avoid or remove these spurious branches. There are two main strategies reported in the state-of-the-art to deal with this problem: prior smoothing of the curve representing the object’s boundary, and pruning the spurious branches after the medial axis’ computation. In the former, the smoothed boundary is obtained by removing small structures along the curve or surface. It is interesting to note that smoothing curves does not always result in a simplified skeleton younes2019; August99. Effective pruning techniques focus instead on criteria to evaluate the significance of individual medial axis branches. However, pruning often requires user-defined parameters that depend on the size and complexity of the object SHAKED1998156; Bai2007ab; Shen2011AB, making the pruning method domain-dependent. Moreover, some pruning strategies result in a violation of the equivariant property Saha2017; Saha2016. As a result, medial axis pruning is still an open problem in computer vision, and this problem is in need of noise-robust methods that concurrently preserve the isometric equivariance of the medial axis.
This paper presents a new method for medial axis pruning that employs mechanisms from the two aforementioned branch-removal strategies. Our method works by computing a controlled set of smoothed versions of the original shape via the discrete cosine transform. We combine these smoothed shapes’ medial axis to create a score function that rates points and branches by their degree of importance. We use our score function to prune spurious branches while preserving the medial axis’ ability to reconstruct the original object. Our method is robust to boundary noise and exhibits isometric equivariance.
We benchmark our approach on three datasets of 2D and 3D segmented objects. We use the Kimia216 (Sebastian2004) and the Animal dataset (Bai2009) to evaluate 2D medial axis extraction. These two datasets provide a method to assess 2D medial axis extraction in the presence of intra-class variation. We also use the University of Groningen Benchmark (Sobiecki2014; Sobiecki2013; Chaussard2011) to evaluate our approach on 3D objects. Our results show that our approach achieves competitive results on isometric equivariance and noise robustness compared to the state-of-the-art.
The main contributions of this paper are summarized as follows:
We define a novel method to compute medial axes that are robust to several degrees of boundary noise without losing the capacity to reconstruct the original object.
Our computation pipeline guarantees that the isometric equivariance of the medial axis is preserved.
The definition of our score function allows for a medial axis pruning that is efficiently computed in parallel.
2 Related work
Many algorithms and strategies exist to extract the medial axis and simplify it when affected by contour noise. This section briefly reviews the most representative algorithms for medial axis computation and discusses their key advantages and disadvantages.
2.1 The Medial Axis
Blum Blum1967 first introduced the medial axis as an analogy of a fire propagating with uniform velocity on a grass field. The field is assumed to have the form of a given shape. If one “sets fire” at all boundary points, the medial axis is the set of quench points. There are other equivalent definitions of the medial axis. In this work we use a geometric definition as follows:
Medial Axis. Let be a connected bounded domain in , and two points such that . The medial axis of is defined as all the points where is the center of a maximal ball of radius that is inscribed inside . Formally,
The medial axis, together with the associated radius of the maximally inscribed ball, is called the medial axis transform (). The medial axis transform is a complete shape descriptor, meaning that it can be used to reconstruct the shape of the original domain. In some work, and are also referred to as shape skeletonization. Figure 1 shows an example of a 2D shape and its medial axis as the center of maximal discs. In definition 1 may result in a 2-dimensional medial axis sometimes called the medial surface. We will restrict our examples and analysis to only one-dimensional medial axes.
2.2 Medial Axis Computation
There are three primary mechanisms to compute the : 1) layer by layer morphological erosion also called thinning methods, 2) extraction of the medial axis from the edges of the Voronoi diagram generated by the boundary points, and 3) detection of ridges in distance map of the boundary points. In digital spaces, only an approximation to the “true medial axis” can be extracted.
When using thinning methods Dlotko14; LOHOU2004171; 5396343; Nemeth2011, points which belong to are deleted from the outer boundary first. Later, the deletion proceeds iteratively inside until it results in a single-pixel wide medial axis. The medial axis by thinning can be approximated in terms of erosion and opening morphological operations (Zhang1984). Thinning methods are easy to implement in a discrete setting, but they are not robust to isometric transformations.
The most well-known algorithm for thinning skeletonization is perhaps the Zhang Suen (Zhang1984) algorithm. However, other approaches have been developed using similar principles (Viswanathan2013; 5396343; Nemeth2011), mainly focused on parallel computation.
Another way to estimate the medial axis works by computing the Voronoi Diagram (VD) of a polygonal approximation of the object’s contour. The contour is expressed as line segments in 2D or a polygonal mesh in 3D. The seed points for the VD are the vertices of the polygonal representation. The medial axis is then computed as the union of all of the edgesof the VD, such that the points , and are not neighbors in the polygonal approximation (Ogniewicz1992).
Voronoi skeletonization methods preserve the topology of . However, a suitable polygonal approximation of an object is crucial to guarantee the medial axis’ accuracy. Noise in the boundary forms convex areas in the contour, which induce spurious branches on the medial axis. In general, the better the polygonal approximation, the closer the Voronoi skeleton will be to the real medial axis. Nevertheless, this is an expensive process, particularly for large and complex objects (Saha2017).
The most common methods to extract the medial axis are those based on the distance transform. Within these methods, the medial axis is computed as the ridges of the distance transform inside the object Arcelli2011; Saha2016; ARCELLI1988361; Hesselink2008; Couprie2007; Postolski2014; Sato2000. This interpretation of the medial axis follows definition 1, because the centers of the maximal balls are located on points along the ridgeline of the distance transform, and the radius of the balls correspond to the distance value at .
When computed in a discrete framework, distance-based approaches suffer from the same isometric robustness limitations as thinning and Voronoi methods (Saha2017). Moreover, noise in the contour produced by a low discretization resolution directly affects the medial axis’ computation by introducing artificial ridges in the distance transform and, consequently, spurious branches.
2.3 Medial Axis Simplification
The medial axis’ sensitivity to boundary noise limits its applications to real-life problems (5395557). Even negligible boundary noise can cause spurious branches, as shown in Figure 2.
One strategy for removal of spurious branches consists of computing as the medial axis of a smoother version of the shape, (Rumpf2002; Mokhtarian1992789). The main disadvantage of this approach is that, in most cases, the resulting medial axis is not a good approximation. Additionally, the smoothing of can potentially change its topology. Miklos et al. Miklos:2010:DSA:1778765.1778838 introduced a slightly different approach they call Scaled Axis Transform (SAT). The SAT involves scaling the distance transform and computing the medial axis of the original un-scaled shape as the medial axis of the scaled one. However, in (Postolski2014), the authors show that the SAT is not necessarily a subset of the medial axis of the original shape. Instead, they propose a solution that guarantees a better approximation by including additional constraints on the scaled distance transform.
Another method to overcome the noise sensitivity limitation of the medial axis is spurious branch pruning. Pruning methods are a family of regularization processes incorporated in most medial axis extraction algorithms SHAKED1998156. Effective pruning techniques focus on different criteria to evaluate the significance of medial axis branches. Thus, the decision is whether to remove the branch (and its points) or not. We can say that pruning methods are adequate if the resulting is noticeably simplified while preserving the topology and its ability to reconstruct the original object. Most pruning methods rely on ad hocheuristic rules, which are invented and often reinvented in a variety of equivalent application-driven formulations SHAKED1998156. Some authors apply these rules while computing all medial axis points. Others do so by removing branches that are considered useless after the computation (Saha2017; Gao2018; Hesselink2008).
One of the most popular pruning methods is Couprie et al. (Couprie2007). They consider the angle formed by a point , and its two closest boundary points denoted by the set . This solution removes points from for which is lower than a constant threshold. This criterion allows different scales within a shape but generally leads to an unconnected medial axis. Another pruning method found in the state-of-the-art is the work of Hesselink et al. (Hesselink2008). They introduce the Gamma Integer Medial Axis (GIMA), where a point belongs to the medial axis if the distance between its two closest boundary points is at least equal to .
Many pruning methods rely on the distance transform . The distance transform acts as a generator function for the medial axis, such that points if and only if they satisfy some constraint involving their distance to the boundary. However, other authors have proposed alternative generator functions in their pruning strategies Gorelick2006; Aubert2014.
For example, (Gorelick2006) and (Aubert2014) introduce what they call Poisson skeletons by approximating as the solution of the Poisson equation with constant source function. Poisson skeletons rely on a solid mathematical formulation. Among other concepts, they use the local minimums and maximums of the curvature of . However, when such methodology is applied in a discrete environment, many spurious branches appear due to the need to define the length of a kernel size to estimate these local extreme points.
We propose a new pruning approach to remove spurious branches from the medial axis of a -dimensional closed shape . We call our method the Cosine-Pruned Medial Axis (CPMA). The CPMA works by filtering out points from the Euclidean Medial Axis with a score function . We define the function by aggregating the medial axis of controlled smoothed versions of . Our formulation of must yield high values at points that belong to the real medial axis and low values at points that belong to spurious branches. Additionally, we require to be equivariant to isometric transformations.
3.1 The Cosine-Pruned Medial Axis
Let us represent as a square binary image with a resolution of pixels. We start the computation of the CPMA by estimating a set of smoothed versions of the via the Discrete Cosine Transform (DCT) and its inverse (IDCT):
are coordinates in the frequency domain, andare the spatial coordinates of the Euclidean space where is defined. The values of and are determined by:
The DCT is closely related to the discrete Fourier transform of real valued-functions. However, it has better energy compaction properties with only a few of the transform coefficients representing the majority of the energy. Multidimensional variants of the various DCT types follow directly from the one-dimensional definition. They are simply a separable product along each dimension.
Let us now denote by with the reconstructions of using only the first frequencies as per equation 2. We seek to obtain a
acting as a sort of probability indicating how likely it is for a pointto be in the medial axis of . Points on relevant branches will appear regularly in the smoothed shapes’ medial axis, resulting in high score function values. In contrast, spurious branches will only appear occasionally, resulting in low values.
Score function. Let be a square binary image such that . Let also be the -frequency reconstruction of via the IDCT. We define as the per pixel average over a set of estimations of the on the smoothed shapes .
The score function is defined for all in the domain of . Notice that we represent as another binary image where only when belongs to the medial axis. Finally, with , we have all the elements to present our definition of the CPMA.
Cosine-Pruned Medial Axis Given a binary image representing a shape , the consist of all the pairs such that is greater than a threshold :
We empirically set the value of the threshold to . However, we conducted an additional experiment to show that this value is consistent across different shapes.
Although the CPMA results in a noise-free , there is no restriction in its formulation to force the CPMA to create a connected medial axis. We solve this by finding all the disconnected pieces of the CPMA. Later, we connect them using a minimum distance criterion , where and are endpoints of two distinct pieces. However, neither the Euclidean distance nor the geodesic distance are suitable criteria because they lead to connections between nodes that do not follow the medial axis (See figure 4). We instead connect and with a minimum energy path using an energy function . We must guarantee that has high values when is close to and low values when is close to the medial axis. This way, we enforce the paths to be close to the . We call the result of this strategy the Connected CPMA (C-CPMA). In section 3.3, we provide details of computation.
3.2 Isometric Equivariance of the CPMA
The distance transform-based medial axis depends only on the shape , not on the position or size in the embedding Euclidean space. Therefore the medial axis should be equivariant under isometric transformations.
Let be the medial axis transform of a connected bounded domain embedded in , and let be an isometric transformation in . The square matrix is a composition of any number or rotations and reflection matrices, and is an -dimensional vector. We say that
-dimensional vector. We say thatis equivariant to any such that .
Recall that is an isometric transformation and thus it is invertible and preserves the Euclidean distance. The previous statement implies that is an isomorphic map between an open ball and . Consequently, if is a maximal ball in then from definition 1 we have that .
Let us now define . Applying to every element of , we have:
Moreover, the CPMA depends primarily on , which also holds the isometric equivariant property.
Let be the score function of as per definition 3, and let be an isometric transformation. We say that is equivariant to any such that .
However, in a discrete domain, this equivariance is only an approximation because points on both and are constrained to be on a fixed, regular grid. In a continuous domain, it is easy to demonstrate that the cosine transform has exact isometric equivariance.
3.3 Implementation Details
To compute the CPMA enforcing the connectivity, we create a lattice graph . A point in the domain of is a node of , if and only if . The node shares an edge with all its neighbors in the lattice only if the neighbors are also inside . We used an -connectivity neighborhood in 2D and a -connectivity neighborhood in 3D.
In order to determine the minimum energy path between pairs of pixels/voxels, we compute the minimum distance path inside using Dijkstra’s algorithm. We assign weights to every edge with values extracted from . Given , we compute the energy of every edge as follows:
This method guarantees connectivity, but it is inefficient because of the minimum energy path’s iterative computation. We sacrifice performance in favor of connectivity. We include the pseudo-code to compute the CPMA and the C-CPMA in algorithms 1 and 2.
The CPMA only relies on one parameter, . The value of is the threshold that determines whether a point of is a medial axis point. We empirically set the value of . However, in section 5, we present the result of an additional experiment to show how sensitive the CPMA is to different threshold values.
Another essential consideration when computing the CPMA is the maximum frequency used to reconstruct the original shape through the IDCT. Using less than frequencies enables a faster computation of the CPMA without losing accuracy. We found that using frequencies greater than does not yield significant improvement for the CPMA.
In this section, we describe experiments used to evaluate our approach compared to state-of-the-art medial axis pruning methods.
We chose three extensively used datasets of 2D and 3D segmented objects to evaluate our methodology on medial axis extraction robustness. These datasets are part of the accepted benchmarks in literature, enabling us to compare our results.
Kimia216 dataset (Sebastian2004)
It consists of 18 classes of different shapes with 12 samples in each class. The dataset’s images are a collection of slightly different views of a set of shapes with varying topology. Figure 4(a) shows two samples from each class. Contour noise and random rotations are also present in some images in the dataset. Kimia216 has been largely used to test a wide range of medial axis extraction algorithms. Because of the large variety of shapes, we assume that this benchmark ensured a fair comparison with the state-of-the-art.
Animal2000 dataset (Bai2009)
The Animal2000 dataset helps us to evaluate the properties of our approach in the presence of non-rigid transformations. It contains 2000 silhouettes of animals in 20 categories. Each category consists of 100 images of the same type of animal in different poses (Figure 4(b)). Because silhouettes in Animal2000 come from real images, each class is characterized by a large intra-class variation.
University of Groningen’s Benchmark
This set of 3D meshes is commonly found in the literature to evaluate medial axis extraction methods in 3D (Sobiecki2014; Sobiecki2013; Chaussard2011). It includes scanned and synthetic shapes taken from other popular datasets. It contains shapes with and without holes, shapes of varying thickness, and shapes with smooth and noisy boundaries. See Figure 4(c). All meshes are pre-processed, ensuring a consistent orientation, closeness, non-duplicated vertices, and no degenerate faces.
4.2 Sensitivity to Noise and Equivariance Analysis
To compare the robustness of a medial axis extraction method, we adopt an evaluation strategy similar to (Chaussard2011). Consequently, we measure the similarity between the medial axis of a shape and . The shape derives from a “perturbation” of . We are interested in evaluating how well our methodology responds to induced noise on the contour/surface. We are also interested in assessing how stable the CPMA responds in the presence of rotations of to test its isometric equivariance.
We employ the Hausdorff distance (), and Dubuisson-Jain dissimilarity () as metrics between shapes. The Dubuisson-Jain similarity is a normalization of the Hausdorff distance (Dubuisson2002), which aims to overcome
sensitivity to outliers. The Dubuisson-Jain similarity between point setsand is defined as:
We must first choose a strategy to induce noise to the boundary to evaluate the noise sensitivity. We use a stochastic approach denoted by , where a random number of points are deformed by a vector
in the direction orthogonal to the boundary, with a deformation magnitude that is normally distributed,. This noise model is recursively applied times to every shape in our datasets. We denote as the medial axis of a shape after applying the noise model times. In our experiments, we used noise levels.
For every object in each dataset, we compared the medial axis with the noisy versions to determine how sensitive a method is to boundary noise.
The medial axis is ideally an isometric equivariant shape representation so that . Due to sampling factors, this relationship is only an approximation. However, we can measure the equivariance by comparing with . The more similar they are, the more equivariant the method.
Because the translation equivariance is trivial, we evaluate isometric equivariance only with rotations of . We do not use reflections because the properties of rotation matrices we use in this study can be extended to reflection matrices. In our experiments, 2D rotations are counterclockwise in the range degrees around the origin. We use rotations at regular intervals. In 3D, we use a combination of azimuthal () and elevation () rotations around the origin. The angles and take values each degrees.
4.3 Comparative Studies
We chose seven of the most relevant methods in the scientific literature to compare with CPMA extraction results. Each method was selected based on a careful review of the state-of-the-art. These methods illustrate the variety of approaches that authors employ to prune the medial axis. Notice that the first method we used in our study is the un-pruned .
Table 1 summarizes all of the methods in our comparative study. In many cases, the performance of a pruning method depended on its parametrization. We selected parameters for all of the methods following the best performance parametrization reported in the state-of-the-art.
|Dim.||Method||Full name||Parameter description|
|2D||MAT Blum1967||Medial Axis Transform||N/A|
|Thinning Zhang1984||Zhang-Suen Algorithm||N/A|
|GIMA Hesselink2008||Gamma Integer Medial Axis||: minimum distance between and , , the neighborhood of .|
|BEMA Couprie2007||Bisector Euclidean Medial Axis||: angle formed by the point and the two projections and , .|
|SAT Giesen2009||Scale Axis Transform||: scale factor to resize .|
|SFEMA Postolski2014||Scale Filtered Euclidean Medial Axis||: scale factor to all balls in the .|
|Poisson skel. Aubert2014||Poisson Skeleton||: window size to find the local maximum of contour curvature.|
|3D||Thinning (Zhang1984)||Zhang-Suen Algorithm||N/A|
|TEASAR (Sato2000)||Tree-structure skeleton extraction||N/A|
In this section, we present and discuss the results of our approach on medial axis pruning. We focus on two properties: 1) robustness to noise of the contour and 2) isometric equivariance.
5.1 Stability Under Boundary Noise
We compared the stability of the CPMA under boundary noise against other approaches in Table 1. We conducted our experiments on Kimia216 and the Animal2000 Dataset for 2D images. Additionally, we used a set of three-dimensional triangular meshes from the Groningen Benchmark for 3D experimentation.
For our noise sensitivity experiments, we applied times the noise model to every object of each dataset. We then computed their using every method in Table 1 with different parameters. Finally, each was compared with using both the Hausdorff distance and Dubuisson-Jain dissimilarity. We report the per method average of each metric over all the elements of each dataset.
First, we tested our medial axis pruning method on the Kimia216 dataset and present the results in Table 2. The table shows that the CPMA and the C-CPMA are competitive against state-of-the-art methods for medial axis extraction. Our results show similar performance to two state-of-the-art methods: the GIMA and SFEMA. The CPMA and C-CPMA also performed better than Poisson Skeletons, SAT, topological thinning, and the MAT itself. For visual comparison, Figure 6 (top row) shows both Hausdorff distance and Dubuisson-Jain dissimilarity against noise level. The figure only displays the best parametrization of every method to improve visualization. It is clear that the CPMA and the C-CPMA are among the three best results when we use the Dubuisson-Jain dissimilarity metric. However, we observe a decrease in performance when we the Hausdorff distance metric. We believe this occurs because of Hausdorff’s distance sensitivity to outliers.
|Poisson skel. (w=0.05)||11.43||11.16||11.26||12.73||2.46||3.05||3.27||3.53|
|Poisson skel. (w=0.10)||15.60||15.48||16.07||17.35||3.62||4.07||4.19||4.56|
|Poisson skel. (w=0.20)||17.71||18.02||19.54||21.35||5.00||5.38||5.63||6.20|
The Animal2000 dataset contains nearly ten times more shapes than Kimia216. This leads to more variation between shapes, and therefore a more challenging setting. Table 3 shows similar results compared to Kimia216, confirming that the noise invariant properties of the CPMA still hold in a more robust dataset. The GIMA and the SFEMA are still the best methods measured with the Dubuisson-Jain dissimilarity, closely followed by both the CPMA and the C-CPMA. Results of using the Dubuisson-Jain dissimilarity as a metric show that the CPMA is close to methods such as BEMA and SFEMA. However, the results are not as good as when using the Hausdorff distance metric. Figure 6 (middle row) depicts the best performance for every method in comparison to ours. Our experiment’s results suggest that the CPMA noise invariant properties generalize across different datasets.
|Poisson skel. (w=0.05)||11.94||13.68||15.35||17.03||3.08||3.63||4.01||4.20|
|Poisson skel. (w=0.10)||14.55||17.11||18.64||20.10||3.55||4.08||4.68||4.94|
|Poisson skel. (w=0.20)||17.37||20.35||21.67||23.69||3.94||4.69||5.33||5.73|
For our 3D experiments, we selected objects from the Groningen Benchmark, reflecting the most common shapes used in the literature. Each object was voxelized to a binary voxel grid with resolution . This resolution offered sufficient details as well as a sufficiently low computational cost. In contrast to the 2D case, we applied only times to the 3D object. We did this for two reasons: 1) to reduce computational complexity and 2) noise tends to be more extreme in 3D at the chosen resolution. The results on the Groningen dataset are shown in Table 4 and Figure 6 (bottom row). Notice that both the CPMA and C-CPMA achieved the best results among the other methods when compared with the dissimilarity measure. These results show that our methodology has noise-invariance properties, and it is stable in the presence of small surface deformation. However, the results show unusual patterns when compared with the Hausdorff distance. In fact, for some methods, the metric decreases when the noise level increases. We attribute this behavior to the outlier sensibility of the Hausdorff distance.
We complete the noise stability analysis showing examples of the computed with our methodology, and compared to the other methods in this study, Figure 7.
5.2 Sensitivity to Rotations
We measured the dissimilarity between and for different instances, and different definitions of the medial axis transform. The lower this dissimilarity, the more stable the method is under rotation. The rotation sensitivity analysis on the Kimia216 dataset is summarized in Table 5 and Figure 8 (top row). The results show that the curves of the CPMA and the C-CPMA fall near the average of the rest of the methods achieving state-of-the-art performance. The results also surpassed several methods, such as Poisson skeleton, SAT, and thinning methods. Notice that when using the dissimilarity metric the CPMA, the GIMA, the SFEMA, and the BEMA form a subgroup that performs significantly better compared to the others. Moreover, the performance of these methods oscillates around a value of dissimilarity of around pixel on average. The intuition for this result is that regardless of the rotation, skeletons computed with these methods vary only at one pixel. Consequently, we can claim that they exhibit rotation equivariance.
|Poisson skel. (w=0.05)||12.83||13.32||8.93||3.21||3.28||1.30|
|Poisson skel. (w=0.10)||16.36||17.03||10.17||4.24||4.30||1.78|
|Poisson skel. (w=0.20)||18.94||19.90||9.65||5.61||5.66||2.69|
We applied the same analysis to the Animal2000 dataset achieving similar results. In this case, the CPMA and the C-CPMA ranked third and fourth, respectively, among all methods when we used the dissimilarity metric. The results for all methods and parameters are presented in Table 6. As before, we also present a summary with the best parametrization for each method in Figure 8 (middle row) to facilitate the interpretation. Notice that due to the larger number of objects in the Animal2000 dataset, the curves for every method appear to be smoother, highlighting stability across different rotation angles and shapes.
|Poisson skel. (w=0.05)||12.88||12.72||10.18||3.33||3.40||1.54|
|Poisson skel. (w=0.10)||15.02||15.41||10.58||3.67||3.89||1.89|
|Poisson skel. (w=0.20)||16.83||17.20||8.67||4.07||4.33||1.85|
Finally, we conducted the rotation sensitivity analysis on the 3D dataset. the results are summarized in Figure 8 (bottom row). The image shows the four 3D medial axis extraction methods we compared in our study for combinations of azimuthal and elevation angles. This figure shows how both the Hausdorff distance and the Dubuisson-Jain dissimilarity become higher when the rotation becomes more extreme, except in the case of C-CPMA. We believe this behavior is due to the connectivity enforcement mitigating the gaps in the medial axis, and reducing the metrics.
5.3 Hyper-parameter Selection
We tested our medial axis pruning method on the Kimia216 dataset. Many medial axis pruning methods depend on hyper-parameters to accurately estimate the medial axis Hesselink2008; Couprie2007; Giesen2009; Postolski2014. These parameters usually have a physical meaning in the context of the object whose medial axis they seek to estimate. Often, the parameters are distances or angles formed between points inside the object. In other work, some authors create score functions like ours, intending to use its values as a filter parameter to remove points on spurious branches of the . In most cases, however, such parameters are subject to factors like resolution or scale. Thus, we conducted another experiment to test the sensitivity of the CPMA to the pruning parameter at different scale factors of the input object.
Figure 9 shows the average of the Jaccard index as the reconstruction metric vs the values of . We compared an object against its reconstruction over all images in Kimia216 dataset. The figure shows how high values of deteriorate the reconstruction, while lower values do not prune enough spurious branches. From the figure, we can infer that values around offer a good trade-off between reconstruction and branch pruning. Moreover, around this value, the standard deviation reaches its minimum value suggesting optimal performance regardless of the object. Because the value of is stable for different scale factors, we conclude that scale does not affect the selection of the threshold.
6 Conclusions and Future Work
We presented the CPMA, a new method for medial axis pruning with noise robustness and equivariance to isometric transformations. Our method leverages the discrete cosine transform to compute a score function that rates the importance of individual points and branches within the medial axis of a shape.
Our pruning approach delivers competitive results compared to the state-of-the-art. The results of our experiments show that our method is robust to boundary noise. Additionally, it is also equivariant to isometric transformations, and it is capable of producing a stable and connected medial axis even in scenarios with significant perturbations of the contour.
The CPMA can be efficiently computed in parallel because it depends on an aggregation of reconstructions of the original shape. Each reconstruction is independent of the others, which allows the parallelism.
We believe our work leaves room for improvement. Thus we identified the following potential for future work.
All of the 3D objects we used come as 3D triangular meshes. To compute the CPMA, we discretize the meshes to fix a resolution of voxels. The discretization introduces two issues: 1) the objects lose small details in their structure, affecting the overall performance, and 2) the isometric equivariance decreases because rotated voxels will not usually align perfectly with non-rotated voxels.
Our algorithm for connectivity enforcement relies on iterative computations of Dijkstra’s algorithm for finding the minimum energy path between two pieces of the unconnected medial axis. Better strategies to compute the paths could increase the efficiency.
We want to thank the GRASP Lab at the University of Pennsylvania for providing the computational resources to conduct this investigation.