A medial representation of an object encodes its solid geometry as the union of a collection of spheres of different radii and origins; see Figure 1. While volumetric or surface mesh representations are more commonly used in computer graphics and computer vision, since its introduction by Blum et al. [blum], medial representations have found applications in many 2D/3D geometric problems such as animation [pinocchio], fabrication [musialski2016fabrication], image processing [amat], shape analysis [shapediam], and real-time tracking [hmodel]. The process of converting an input model into a medial representation is generally referred to as the Medial Axis Transform (MAT), and the collection of origins of the spheres in a medial representation form a medial skeleton.
A common pitfall of medial axis methods is their sensitivity to noise: in a traditional medial representation, new branches of the medial skeleton may form at all negative local curvature extrema. If our shape is represented via a piecewise linear boundary, such as a polygonal mesh in 3D, spurious branches form when the surface exhibits even minor levels of noise; see [skelstar, Fig.9]. These issues are typically resolved by resorting to postprocessing, in which sets of spheres are filtered out according to various engineered criteria. We present a new algorithm for computing the medial axis transform of an oriented point set which is naturally capable of handling noise, outliers, and other artifacts that create characteristic problems for traditional methods.
As summarized in Figure 1 and [skelstar], the medial axis transform may be defined in a few ways, with each definition producing equally valid and useful representations. In practice, each definition of the medial representation leads to a different way to compute the medial axis. For example, the normal flow variant leads to the commonly employed voxel thinning algorithms [saha2016survey], while the equidistance definition leads to techniques leveraging Voronoi diagrams [brandt92]. In this paper, we build over what is likely the most well known definition of the medial axis transform:
The Medial Axis Transform of is the set of centers and corresponding radii of all maximally inscribed circles/spheres in .
While previous work such as Ma et al. [bubble] has proposed methods to construct medial axis representations using this definition, it is interesting to note that they have treated the problem from a combinatorial geometry standpoint. Instead, we consider the medial axis transform from a numerical geometry perspective, where the medial axis is given by the solution of an optimization problem. We achieve this by expressing the concepts of maximality and inscription from Def. 1.1
in a least squares form. The robustness of the approach to imperfections arises from the fact that least squares optimization attempts to find an approximate, rather than exact, solution to the given problem. We are motivated in our approach by considering a least-squares problem as a maximum likelihood estimate of a function in the presence of noise obeying a Gaussian probability distribution.
Our method takes an oriented point cloud as input, and produces an unconnected medial point cloud as output by minimizing a combination of a maximality energy, and an inscription energy. Our key technical challenge is in formulating these energies correctly; our maximality energy is designed to have a constant magnitude, ensuring that the optimization energy of each medial sphere is constant regardless of its radius, and our inscription energy is based around a locally supported approximation of the signed distance function of the point cloud. In order to prevent spheres from sliding towards local maxima of local shape thickness, we introduce a pinning constraint as a quadratic barrier energy. The resulting optimization problem is quadratic, with differentiable but non-linear energy terms, and we solve it with an iterative Gauss-Newton solver.
Contributions and Evaluation
Our main contributions are a novel interpretation of a problem that is classically solved by standard computational geometry as a numerical geometry problem, and a novel algorithm for computing the medial axis transform of an oriented point set that inherently handles imperfections in the input. We present a number of qualitative results throughout the paper for both 2D and 3D oriented point clouds. Finally, we also present side-by-side quantitative evaluations of the proposed algorithm against several state-of-the-art methods.
2 Related Work
Many techniques exist to compute the medial axis transformation, as detailed in a recent survey [skelstar]. We focus our discussion around two central aspects: methods that operate on surface 2D/3D representations (e.g. point clouds and triangular meshes), and methods that attempt to resolve the instability of medial axis via post processing.
2.1 Medial axis computation of sampled surfaces
Assuming the surface is sampled by a sufficiently dense set of points, the Voronoi diagram can be used to compute the medial axis transform with relative ease. For a closed boundary in 2D, any interior Voronoi vertex, as well as Voronoi edges connecting interior vertices, approximate the medial axis with a convergence guarantee [brandt92]. Unfortunately, Amenta et al. [amenta99] has shown how this property does not hold in 3D due to sliver tetrahedra. Thankfully, as we increase the sampling rate, the Voronoi poles (see definition in [amenta99]) do converge to the medial axis, allowing the use of 3D Voronoi diagrams for the task at hand. These methods suffer two fundamental shortcomings: ① they are global and optimize the entire set of centers at the same time, and ② the extracted axis is heavily susceptible to even minor levels of noise; see Figure 2.
Leveraging an oriented sampling, Ma et al. [bubble] proposed an alternative way to compute the transform by marrying the maximally inscribed definition of Figure 1(a) to the bi-tangency of Figure 1(d). Unlike Voronoi methods, which consider the entire point set at once, their method can compute a maximal sphere at a point in isolation, leading to extremely efficient GPU implementations [jalba_pami13]. These methods represent the most efficient way of computing the medial axis, but, similarly to Voronoi methods, they suffer stability issues; see Figure 3. Another method for computing a local approximation to the axis was proposed by Shapira et al. [shapediam] via casting rays in a cone oriented along the anti-normal. The distance between the point and the intersection with the surface is aggregated by a robust function (e.g. median) to estimate the shape diameter function. While this approximation has found widespread use as a shape descriptor, the radius estimate suffer of bias, and the algorithm does not generalize to point clouds.
Shape approximation methods
Recently, a new class of techniques has been proposed which attempt to approximate watertight surfaces via Sphere Meshes – linearly swept spherical primitives [spheremesh, anispheremesh]. This is achieved via local mesh decimation relying on iterative edge collapses, where spherical quadrics are employed in place of the traditional quadric metrics [qslim]. In many cases the produced model resembles a medial axis. When executed on 3D data the result can contain tetrahedra, however, the medial axis of a 3D shape is known to consist only of points, curves, and surfaces. Addressing these concerns, the Medial Meshes work by Sun et al. [medialmesh] extended sphere meshes to decimate a medial axis mesh, and discard unstable branches. Marrying sphere meshes to medial meshes, Li et al. [qmat] followed up this work and proposed QMAT, a more computationally efficient version based on spherical quadrics. While these techniques, in juxtaposition to our local method, are global, they can only cope with minor levels of noise: “with very noisy input, however, the simplified medial mesh is not a stable representation”; see [medialmesh, Fig.9]. The follow-up work by Li et al. [qmat] performs slightly better as it can optimize for sphere centers, but our algorithm can still cope with noise that is one order of magnitude larger; see [qmat, Fig.16].
2.2 Instability and filtering techniques
Techniques that do not attempt to produce an approximation suffer from instability when computing the medial axis. Filtering techniques attempt to remove portions of the medial axis that do not contribute significantly to the geometry reconstructed as the union of medial spheres. As shown in [scaleaxis3d, Fig.2], this works fairly well for inputs with little or no noise. Conversely, with large noise (and/or outliers), these methods become mostly inappropriate; see Figure 2.
Angle filtering – -medial axis
One way to identify the significance of a point is the largest angle formed by the center of the corresponding maximal sphere and two of its tangent points on the shape boundary. The -medial axis of [foskey], filters out balls associated with low aperture angle. While this filtering can disconnect portions of the medial axis, see Figure 2(c), the issue can be avoided by homotopy-preserving pruning [attali].
Distance filtering – -medial axis
Another metric for filtering discards a sphere whenever its tangent points lie on the surface below a certain distance [lambda, powercrust]. As Figure 2(d) illustrates, this results in a loss of features even before all noise has been removed. This shortcoming makes these solutions inappropriate whenever the input shape contains structures at different scales.
Scale filtering – -medial axis
The scale axis transform introduced by Giesen et al. [scaleaxis] and extended to 3D by Miklos et al. [scaleaxis3d] are methods built over the maximality property of the medial axis. First medial spheres are scaled by a given value, and spheres that are no longer maximal (i.e. contained in another sphere) get discarded. The medial axis of the scaled union of balls is then computed, and its spheres unscaled by a factor . This method is global as it processes the whole point set at once, and computes multiple Voronoi diagrams in the process, resulting in reduced computational efficiency – e.g. for a mesh with vertices, see [scaleaxis3d, Tab.1].
3 Technical details
In Section 3.1, we start by reviewing the highly relevant method of Ma et al. [bubble], which computes the medial axis by searching for maximal spheres via local analysis. We then formulate our numerical optimization to compute medial spheres in Section 3.2.
We are given in input an oriented point cloud drawn from the solid object with watertight surface/boundary
. The cloud is affected by noise with standard deviation. We give all numerical values of and other parameters which represent distance values as percentages relative to the diagonal of the bounding box of the input shape. With we refer to the internal medial axis of ; see [skelstar, Fig.2]. With we indicate a sphere centered at of radius , and with the index of the solver iteration.
3.1 The sphere-shrinking algorithm – Figure 3
The algorithm proposed by Ma et al. [bubble] is an iterative method that shrinks an initially large sphere until it satisfies the medial axis properties. Assuming is a point on the boundary, this process leverages two properties: ① that the sphere passing through should be empty; and ② that for a smooth input curve is parallel to the contact point normal . The sphere-shrinking algorithm is initialized with a large sphere, passing through , containing the entire point set, and whose center lies along the ray . Whenever the distance to the closest point from is less than (i.e. the sphere is not empty), the center is updated by computing the sphere tangent to and passing through ; see Figure 3. Finally, the loop terminates when the sphere radius change across iterations is below numerical precision. This algorithm is highly efficient as each sample can be processed completely independently from the others, but as it treats all points as hard constraints in a combinatorial manner it does not cope well with noisy inputs. Our formulation borrows from this one, but generalizes it by ① reformulating it into a continuous optimization problem, and ② making it more robust to noise and outliers.
3.2 Optimizing for maximally inscribed spheres
We define a least-squares optimization capable of generating maximally inscribed spheres given an oriented point cloud . Our optimization energy for a sphere consists of the combination of two terms:
We will now proceed to describe these terms in detail, and then provide comparison against alternative formulations in Section 4.2.
The maximality energy creates a constant positive pressure term that tends to increase the sphere size at each iteration. By design, we define this energy to have a constant magnitude, that is, a contribution to the optimization energy that is constant regardless of the radius of the given medial sphere. This is achieved by considering the radius at the previous optimization iteration with a constant offset :
Consider a signed distance function (SDF) , where for an inside the shape. If the expression of this function were available to us, an inscription constraint could be easily expressed by the inequality , which we can convert in our least squares formalism as:
where the ramp function only penalizes a sphere when it violates an inscription constraint. Unfortunately, an SDF function for our oriented point set is not readily available without computing a full surface reconstruction of the point cloud [reconstar]. However, the family of moving least square (MLS) methods are capable of building locally supported approximations in a neighborhood of – that is, as . For example, the MLS formulation by Kolluri [kolluri_talg08] approximates as a weighted sum of locally supported point-to-plane functions:
where in our case and
is a smoothly decaying radial basis function[apss] with compact support :
Note that as Equation 3 evaluates the function at the medial center, which could be potentially far from , this representation is not immediately appropriate. However, in the context of registration, several works have shown how a quadratic approximation of can be built by appropriately blending point-to-point with point-to-plane distance functions [sdfgeometry, mitra2004registration]. In more detail, point-to-point distances are a good approximation of in the far-field (i.e. for ), while point-to-plane distances are more suitable in the near-field (i.e. for ). Let us first define
, the projection of the center on the hyperplane of point
. We can use this to interpolate between the two metrics:
where represents the distance from the sphere to a plane, is the euclidean distance from the sphere to a point, and is the linear interpolation operator; see Figure 4. Based on the geometry of , we can then re-formulate our inscription with respect to each oriented point . Analogously to Equation 4, this energy is accumulated over all points in :
The parameter defines the scale used for blending between distance types, and limits how far outside of a sphere a point may be before its contribution falls to zero. As is typical in robust optimization (e.g. IRLS) the weights are computed with respect to the parameters at the previous iteration – inscription is evaluated only for points within, or in proximity of the sphere in its previous geometric configuration .
The medial axis provides an estimate of the local thickness of the shape through its sphere local radius. However, as our variational formulation attempts to create larger spheres, nothing prevents a sphere from traveling along medial branches wherever we have a non-vanishing medial radius gradient on ; e.g. a sphere would slide from the tip of a cone to its base; see Figure 6. We can avoid this issue by “pinning” medial spheres. Generalizing the exact constraints of the sphere-shrinking algorithm by [bubble], we subject the sphere associated with a given point to the hard constraint , which keeps the sphere in contact with a sphere of radius centered at . We include this constraint in our optimization via the penalty method [penalty], yielding a quadratic barrier energy:
Similarly to [lop], our optimization induces the definition of medial sphere as the fixed point solution of an update equation:
Our optimization problem is quadratic, but while its energy terms are differentiable, they are not linear. Hence, we iteratively compute a solution via Gauss-Newton. This requires the linearization of the arguments of the quadratic functions with respect to and , which is straightforward in our setting.
Due to the independent nature of the per-sphere optimization, our implementation is straightforward. At each step we compute Jacobian matrices and residual vectors separately for each sphere and thus are only required to solve alinear system for each, where
is the number of degrees of freedom of a single sphere (3 or 4 in our experiments). This property eliminates the need for any advanced solvers or factorizations and enables the implementation to be completely parallel over the spheres, as required for effective GPU acceleration. Additional performance optimization could be achieved by collecting the points for each sphere whose contribution is non-zero using accelerating data structures such as kd-trees[friedman1977algorithm].
4 Results and evaluation
We show medial representations generated by our method, in both 2D and 3D, throughout the paper and in Figure 15. As shown, our method produces valid medial representations for a wide variety of shapes and models, and in the presence of noise and outliers. Throughout the paper, we process all input with the same parameters whose values were set according to the analysis in Section 4.6. We evaluate our method in several ways. First, we consider variations of the inscription and maximality energy, and show how alternative formulations fail. Second, we evaluate the performance of our algorithm against ground truth on synthetic benchmarks contaminated by noise and outliers. Third, we compare our generated medial representations against the state-of-the-art in both 2D and 3D, again in the presence of noise. Finally, we provide an analysis of our algorithm parameters.
4.1 Variants of inscription energy – Figure 7
We consider two modified formulations of Equation 7 in which we either penalize the squared distance to points, or the squared distance to half-planes as opposed to our blended formulation. We investigate this behavior by randomly initializing the algorithm, and snapping to either zero or one for all points. When , point-to-point energy is used and the algorithm attempts to make spheres empty in a least square sense. However, nothing prevents the optimization from generating maximal spheres outside the shape in the ambient space. When , point-to-plane energy is used, and the algorithm attempts to make half-spaces empty. As illustrated in Figure 7 and the supplemental video, this results in difficulties for the algorithm in dealing with sharp concavities. In the example above, we expect all centers to cluster to one of the two centers, but spheres in the neighborhood of the sharp concavity might read the normal of a point on the opposite side, with a halfplane requesting the radius of a sphere intersecting with it to be significantly smaller. This problem is caused by the fact that point-to-point and point-to-plane energies approximate the squared SDF of a point set respectively in the far and near field. Our LSMAT formulation respects this geometric property, and deals with both issues at once.
4.2 Variants of maximality energy – Figure 8
We consider two alternative formulations of the maximality condition in our optimization: penalizing the squared inverse of the sphere radius, expressed as , or directly specifying , the maximum size of a medial sphere, and optimizing . While intuitively these are potentially feasible solutions, they incur a significant limitation: the amount of “pressure” a medial sphere will apply will be dependent on its size. That is, small spheres will be associated with a higher energy level. For example, using the first formulation, as its gradient will tend to ; as illustrated in Figure 8, this can cause spheres in the neighborhood of small features to bulge out. Our solution, detailed in Equation 2, does not encounter this problem, as our energy is constant regardless of the sphere size.
4.3 Quantitative evaluation metric – Figure 9
We consider ground truth geometry polluted by noise, and evaluate the quality of an extracted axis by computing the distance of each medial center to the closest point on the ground truth axis :
We compute via the medial axis module of the python skimage package, derived from the ridges of the distance transform from the ground truth boundary . As we discretize images with a bounding box at a resolution, the “numerical precision” of the metrics above is of one pixel. To obtain scale invariance, we report these errors in relation to the diagonal of the bounding box. In Figure 9, we visualize the convergence of our iterative optimization scheme.
4.4 State-of-the-art comparisons in 2D – Figure 10
Through the metric from Section 4.3, we quantitatively evaluate the performance of LSMAT against local filtering methods, as well as the global scale axis transform. Figure 2 shows how neither distance nor angle filtering is very effective; hence we employ a compound variant where we first filter by distance with , and then by angle . For the scale axis, we set , and for our method, we use our default parameters. We gradually increment the level of noise, and plot the corresponding error metric in Figure 10, as well as a few example frames from the plot. Note that LSMAT correctly captures the shape of the ground truth boundary, whereas all other local methods fail.
4.5 State-of-the-art Comparisons in 3D – Figure 11
We qualitatively evaluate the performance of LSMAT in three dimensions by comparing our results to those generated by the SAT [scaleaxis3d], QMAT [qmat], and sphere-shrinking [bubble] methods. To compare with methods that expect a mesh in input, we finely re-triangulate the surface, and apply normal displacement perturbation with relative to the diagonal bounding box. On the fertility model, LSMAT produces a convincing medial axis representation even when the input oriented point cloud is affected by extreme noise – i.e. with a magnitude close to the one of the local feature size. For the vase model, LSMAT still produces a smoother and more faithful medial axis representation than the existing state-of-the-art. Even in the presence of minor amounts of noise (second row), LSMAT faithfully produces a medial axis representation with a smooth surface, and whose skeleton resembles that of the uncorrupted model. The results are particularly encouraging on the horse dataset, where we attempted to use a very small number of target primitives for QMAT () and a large value of scale for SAT (). Our formulation is the only one capable of computing a relatively noiseless arrangement of medial spheres. The timings for our LSMAT and sphere shrinking results can be found in Table 1. All experiments were run on a machine with an Intel Xeon E5-1650 CPU and an Nvidia GTX 1080 GPU.
|Model ()||LSMAT||Sphere Shrinking|
4.6 Algorithm Parameter Analysis – Figure 12
Our algorithm depends on five parameters: the kernel sizes and , the relative weight , pinning distance , and radius expansion constant . The effect of varying the first four given different levels of noise is shown in Figure 12. Note we only consider the ratio between the , as the two energies balance each other. As increases, spheres near sharp concave corners begin to shrink as they eventually use the point-to-plane distance for all points in their neighborhood. When grows significantly beyond the local feature size, the MLS formulation is no longer able to recover a meaningful surface. As increases and the relative importance of maximality versus inscription increases, spheres expand until they no longer form a faithful medial representation and ultimately escape the point cloud. Finally, as grows, the spheres are allowed to slide further from their starting positions towards areas of locally maximal radius. We assume that estimates of the input noise characteristics are available, and choose our default value for each of these four parameters based on empirically derived linear functions of ; see Appendix B for details. Through experiments we observed that it is sufficient to choose a constant value for the fifth parameter , as different values of it merely change the optimal relation between and . For all our experiments we use . We believe improved formulations of LSMAT could eventually coalesce some of these parameters hence improving the ease of use of our algorithm.
5 Future works
Expressing the Medial Axis Transform as a non-linear least squares problem opens up several interesting avenues for future research.
Robustness to outliers – Figure 13
Least squares problems can be interpreted as a maximum-likelihood (ML) estimation given a Gaussianprobability distribution of the noise variables. If the input is corrupted by other forms of noise, one could replace Gaussian with other error distributions, and derive the corresponding ML optimization scheme. For example, if we assume the probability distribution of the noise to be Laplacian, the least-squares problems would simply be transformed into an (i.e. least norm) optimization. However, these type of problems can still be computed with Gauss-Newton type methods by using iteratively re-weighted least squares (IRLS) techniques [irls]. As illustrated in Figure 13 this results in an IR-LSMAT algorithm that can cope with significant amounts of outliers. While these results are promising, the convergence speed of the optimization is severely reduced, as a much smaller value for was needed to produce these results. The generalizability of LSMAT to robust norms [sparseicp] is particularly interesting. More specifically, consider the optimization problem consisting only of the following energy:
For this simplified version of the sphere-fitting problem is convex, hence generating a single solution regardless of initialization. However, as the problem is non-convex and the local minima reached by optimization depends on the initialization. Our preliminary investigation revealed how these local minima correspond to spheres belonging to the symmetry set, a superset of the medial axis; see Figure 1(d) and [skelstar, Sec. 2.1.4]. How to exploit (12) to efficiently compute the MAT of a point set is an interesting venue for future works.
Smoothness priors – Figure 14
Our pinning formulation is highly efficient, but its local nature can be also considered a intrinsic limitation. For example, in the noisy maple leaf example in Figure 15, at times the estimated medial spheres could be over and/or under-estimated in size, resulting in medial centers that do not necessarily sample the underlying piecewise-smooth manifold of the MAT. However, if our input is a sampling of a smooth surface , then we know that [matbook]: ① the medial centers should be lying on a piecewise smooth manifold, and ② the sphere radius function on this manifold should vary smoothly. Another desirable characteristic might be to have a uniform sampling of this manifold. We can convert these priors into least-squares energies, resulting in a maximum a-posteriori optimization. In Figure 14 we illustrate a few iterations of this optimization on the moon shape, where the “pinning” constraints from Section 3.2 have been disabled. While the optimization behaves as expected, this suffers similar shortcomings to those illustrated in Figure 6, and would result in a single sphere if executed for . Modifying the variational LSMAT formulation to obtain a regularly sampled distribution of medial centers is an interesting venue for future works.
In our experiments, we initialize the optimization with random sphere positions and radii. Nonetheless, given how a smooth object is composed of smooth piecewise manifolds , and a smooth radius function defined thereon, one could easily envision a locally bootstrapped version of the algorithm, where unsolved medial balls are initialized with the of their neighbors – which could also be re-interpreted as a multi-scale solver. Notice that techniques such as the sphere-shrinking algorithm from [bubble] do not permit this type of acceleration.
Shape approximation vs. reconstruction
A number of methods leverage the medial axis for interpolatory reconstruction [ireconstar], and our work is a stepping stone towards the creation of approximating reconstruction [reconstar] algorithms based on the medial axis. Methods such as Medial Meshes [medialmesh] and QMAT [qmat] assume a reconstruction of the watertight surface is already available, and attempt to extract its approximation via swept-spheres. Conversely, our work could be extended to directly compute a reconstruction of the input point cloud by minimizing data fitting metrics based on Hausdorff distances [medialmesh] or spherical quadrics [qmat]. A “topology surgery” step could then be interweaved with our optimization to stitch medial spheres together and create a medial mesh with connectivity information.
Orienting a point set
Finally, while our formulation is based on an oriented point set and an approximation of its SDF in the near/far field, an interesting variant of our algorithm could consider an unoriented point set, where the quantities to be optimized for would be the radii , and contact plane of a pair of twin spheres. The contact point should then be then be optimized on the manifold , while the complementary outside/inside label of each sphere be optimized to result in a smooth signing of the environment space. This approach would then provide a “medial axis” analogous to recent efforts in variational reconstruction of non-oriented point clouds [signing].
We have introduced the Least Squares Medial Axis Transform, or LSMAT, a continuous relaxation of the medial axis transform that is not only stable, but also robust to high levels of noise even though it is based solely on local optimization. While in most of the paper we visualized the generated maximal spheres covering more or less the entire shape, we would like to remind the reader that the algorithm operates on each sphere independently; the algorithm is therefore trivially parallelizable and particularly suitable for GPU implementations. Our method produces results on noisy inputs that state-of-the-art methods fail to handle, without a reliance on ad-hoc postprocessing. Our approach is efficient, parallelizable, and therefore suitable for real time applications where reliable medial representations are required, and where captured inputs are likely to be noisy.
We would like to express our gratitude to Mathieu Desbrun, Leonidas Guibas, Justin Solomon, Keenan Crane, and especially Sofien Bouaziz for their ideas and suggestions. In our experiments we use models provided courtesy of [chen2009benchmark] and the AIM@SHAPE Shape Repository.
Appendix A Gradients of Energy Components
Gradients are written in the form .
Given oriented surface point :
Given pin point :
Appendix B Empirical Parameter Defaults
Default value of :
Default value of and (both the same):
Default value of :