Thin_Vessel_Segmentation
C++ implementation of vesselness measure (or vesselness filter). Both CPU and GPU version are available.
view repo
Many applications in vision require estimation of thin structures such as boundary edges, surfaces, roads, blood vessels, neurons, etc. Unlike most previous approaches, we simultaneously detect and delineate thin structures with sub-pixel localization and real-valued orientation estimation. This is an ill-posed problem that requires regularization. We propose an objective function combining detection likelihoods with a prior minimizing curvature of the center-lines or surfaces. Unlike simple block-coordinate descent, we develop a novel algorithm that is able to perform joint optimization of location and detection variables more effectively. Our lower bound optimization algorithm applies to quadratic or absolute curvature. The proposed early vision framework is sufficiently general and it can be used in many higher-level applications. We illustrate the advantage of our approach on a range of 2D and 3D examples.
READ FULL TEXT VIEW PDFC++ implementation of vesselness measure (or vesselness filter). Both CPU and GPU version are available.
A large amount of work in computer vision is devoted to estimation of structures like edges, center-lines, or surfaces for fitting thin objects such as intensity boundaries, blood vessels, neural axons, roads, or point clouds. This paper is focused on the general concept of a
center-line, which could be defined in different ways. For example, Canny approach to edge detection implicitly defines a center-line as a “ridge” of intensity gradients [6]. Standard methods for shape skeletons define medial axis as singularities of a distance map from a given object boundary [35, 34]. In the context of thin objects like edges, vessels, etc, we consider a center-line to be a smooth curve minimizing orthogonal projection errors for the points of the thin structure.We study curvature of the center-line as a regularization criteria for its inference. In general, curvature is actively discussed in the context of thin structures. For example, it is well known that curvature of the object boundary has significant effect on the medial axis [17, 35]. In contrast, we are directly concerned with curvature of the center-line, not the curvature of the object boundary. Moreover, we do not assume that the boundary of a thin structure (e.g. vessel or road) is given. Detection variables are estimated simultaneously with the center-line. This paper proposes a general energy formulation and an optimization algorithm for detection and subpixel delineation of thin structures based on curvature regularization.
Curvature is a natural regularizer for thin structures and it has been widely explored in the past. In the context of image segmentation with second-order smoothness it was studied by [31, 37, 32, 5, 14, 28, 25]. It is also a popular second-order prior in stereo or multi-view-reconstruction [20, 27, 40]. Curvature has been used inside connectivity measures for analysis of diffusion MRI [24]. Curvature is also widely used for inpainting [3, 7] and edge completion [13, 39, 2]. For example, stochastic completion field technique in [39, 24]
estimates probability that a completed/extrapolated curve passes any given point assuming it is a random walk with bias to straight paths. Note that common edge completion methods use existing edge detectors as an input for the algorithm.
In contrast to these prior works, this paper proposes a general low-level regularization framework for detecting thin structures with accurate estimation of location and orientation. In contrast to [39, 13, 24] we explicitly minimize the integral of curvature along the estimated thin structure. Unlike [12] we do not use curvature for grouping pre-detected thin structures, we use curvature as a regularizer during the detection stage.
(a) Olsson’s model [26] | (b) Our model for cloud of points | (c) Our model for grid points |
Related work: Our regularization framework is based on the curvature estimation formula proposed by Olsson et al. [26, 27] in the context of surface fitting to point clouds for multi-view reconstruction, see Fig.2(a). One assumption in [26, 27]
is that the data points are noisy readings of the surface. While the method allows outliers, their formulation is focused on estimation of local surface patches. Our work can be seen as a generalization to detection problems where majority of the data points, e.g. image pixels in Fig.
2(c), are not within a thin structure. In addition to local tangents, our method estimates probability that the point is a part of the thin structure. Section 2 discusses in details this and other significant differences from the formulation in [26, 27].Assuming and are neighboring points on a thin structure, a curve, Olsson et al. [26] evaluate local curvature as follows. Let and be the tangents to the curve at points and . Then the authors propose the following approximation for the absolute curvature
and for the squared curvature
where is the distance between point and line .
Assume that the curve is parameterized by arc-length such that . If is an increasing parameter sequence then the curvature of can be approximated by
where is a neighborhood system for curve points and are their tangent lines.
Olsson et al. [26] use regularization for fitting a surface (or curve) to a cloud of points in 3D (or 2D) space. Every observed point is treated as a noisy measurement of some unknown point that is the closest point on the estimated surface, see Fig.2(a). Each is associated with unknown local surface patch that is a tangent plane for the surface at . The proposed surface fitting energy combines curvature-based regularization with the first order data fidelity term
(1) |
where is the set of tangents, is a neighborhood system, is non-negative constant, is a positive constant such that . To minimize (1), the algorithm in [26] iteratively optimizes the assignment variables for a limited number of tangent proposals, and then re-estimates tangent plane parameters, see Fig.2(a).
In contrast to [26], our method estimates thin structures in the image grid where, a priori, it is unknown which pixels belong to the thin structure, see Fig.2(c). We introduce set of indicator variables where iff pixel belongs to the thin structure. Our basic energy (2) and its extensions combine unary detection potentials with curvature regularization. Due to the regularity of our grid neighborhood, we use constant weights , which are omitted from now on. We propose a different optimization technique estimating a posteriori distribution of and separate tangents at each point. As illustrated in Fig.2(b), our framework is also applicable to energy (1) and multi-view reconstruction problem as in [26, 27].
Parent&Zucker [29] formulate a closely related trace inference problem for detecting curves in 2D grid. Similarly to us, they estimate indicator variables and tangents . However, they estimate and by enforcing a co-circularity constraint assuming given local curvature information, which they estimate in advance. In contrast, we simultaneously estimate and by optimizing objective (2) that directly regularizes curvature of the underlying thin structure. Moreover, [29] quantizes curvature information and tangents while our model uses real valued curvature and tangents. The extension of [29] to 3D is not trivial.
Similarly to [26, 29] we estimate tangents only at a finite set of points. Additional regularization is required if continuous center-line between these points is needed [16].
Contributions: It is known that curvature of an object boundary is an important shape descriptor [33] with a significant effect on medial axis [17, 35], which is not robust even to minor perturbations of the boundary. In the context of thin objects (e.g. edges, vessels) we study a concept of a center-line (a smooth 1D curve minimizing the sum of projection errors), which is different from medial axis. We regularize the curvature of the center-line. Unlike many standard methods for center-lines, we do not assume that the shape of the object is given and propose a general low-level vision framework for thin structure detection combined with sub-pixel localization and real-valued orientation of its center-line. Therefore, we propose an approach that takes into account all possible configurations of the indicator variables while estimating the tangents. This significantly improves stability with respect to local minima. Our optimization method uses variational inference and trust region frameworks adapted to absolute and quadratic curvature regularization.
Our proof-of-the-concept experiments demonstrate encouraging results in the context of edge and vessel detection in 2D and 3D images. In particular, we obtain promising results for estimating highly detailed vessels structure on high-resolution microscopy CT volumes. We also show examples of sub-pixel edge detection regularizing curvature. While there are no databases for comparing edge detectors with real-valued location and orientation estimation, we obtained competitive results on a pixel-level edge detection benchmark [11]. Our general early vision methodology can be integrated into higher semantic level boundary detection techniques, e.g. [22], but this is outside the scope of this work. Our current sequential implementation is not tuned to optimize performance. Its running time for edges in 2D image of Fig.1 is 20 seconds and for vessels in 3D volume of Fig.10 is one day. However, our method is highly-parallelizable on GPU and fast real-time performance on 2D images can be achieved.
In Section 2 we describe the proposed model and discuss a simple block-coordinate descent optimization algorithm and its drawbacks. In Section 3 we propose a new optimization method for our energy based on variational inference framework. In Section 4 we describe the details of the proposed method and discuss the difference between squared and absolute curvatures (Subsection 4.1). We describe several applications of the proposed framework in Section 5 and conclude in Section 6.
In the introduction we informally defined the center-line of a thin structure as a smooth curve minimizing orthogonal projection errors. Here we present the energy formalizing this criterion. First we note that in our model the curve is not defined explicitly but through points it passes and tangent lines at these points. The energy is given by
(2) |
where is a neighborhood system, is a set of indicator variables where iff pixel belongs to the thin structure, define unary potentials penalizing/rewarding presence of the structure at . In contrast to (1), potentials define the data term while is a soft constraint.
We explore two choices of the soft constraint
. The first one uses Euclidean distance. In that case it models normally distributed errors. Although it is appropriate for many applications, e.g. surface estimation in multi-view reconstruction
[26, 27], the normal errors assumption is no longer valid for the image grid because the discretization errors are not Gaussian. In fact, using Euclidean distance may make the soft constraint term proportional to the length of the center-line, see illustration on the right.Thus, we also propose a truncated form of Euclidean distance:
(3) |
This does not penalize tangent lines that are within one pixel from points . Different applications may require a different choice of no-penalty threshold.
We can extend the energy (2) by adding other terms that encourage various other useful properties. For example, energy
(4) |
for will reward well aligned tangents. The effect of this term is shown in Fig.6. This term is similar to edge “repulsion” in MRF-based segmentation. The overall pairwise potential encourages edge continuity.
Another extension is to use prior knowledge about the center-line direction at pixel :
(5) |
The term measures how well tangent line is aligned with prior :
The magnitude of
constitutes the confidence measure. For example, vectors
could be obtained from the image gradients or the eigenvectors in the vesselness measure
[9].To motivate our optimization approach for energy (2) described in Section 3, first we describe a simpler optimization algorithm and discuss its drawbacks.
The most obvious way to optimize energy (2) is a block-coordinate descent. The optimization alternates two steps described in Alg.1. The auxiliary energy optimized on line 4 is a non-linear least square problem and can be optimized by a trust-region approach, see Section 4. The auxiliary function on line 5 is a non-submodular binary pairwise energy that can be optimized with TRWS[18].
We found that Alg.1 is extremely sensitive to local minima, see Fig.3. The reason is that tangents for points with indicator variables do not participate in optimization on line 4
. To improve performance of block-coordinate descent, we tried heuristics to extrapolate tangents into such regions. We found that good heuristics should have the following properties:
1. Since integral of curvature is sensitive to small local errors (see the figure on the right), the extrapolating procedure should yield close tangents for neighbors. Otherwise step 5 of the algorithm is ineffective. This issue could be partially solved by using energy (4). In this case it can be beneficial to connect two tangents even if there is some misalignment error.
2. The heuristic should envision that some currently disconnected curves may lie on the same center-line, see Fig.3.
The first property was easy to incorporate, while the second would require sophisticated edge continuation methods, e.g. a stochastic completion field [39, 24]. Instead we develop a new optimization procedure (Section 3) based on variational inference. The advantage of our new procedure is that it is closer to joint optimization of and .
Ideally, we wish to jointly optimize (2) with respect to all variables. This is a mixed integer non-linear problem with an enormous number of variables. Thus, it is intractable. However, we can introduce elements of joint optimization based on stochastic variational inference framework. The proposed approach takes into account all possible configurations of indicator variables while estimating tangents . This significantly improves stability w.r.t. local minima.
Energy (2) corresponds to a Gibbs distribution:
where is a normalization constant and the image is given by data fidelity terms . Here are visible variables, indicator variables and tangents are hidden ones.
We add a prime sign for tangent notation to distinguish values of random variables and parameters of the distribution.
Our goal is to approximate the posterior distribution of unobserved (hidden) indicators and tangents given image . The problem of approximating the posterior distribution has been extensively studied and is known as variational inference [4].Variational inference is based on the decomposition
(6) |
where is the evidence, is a distribution over the hidden variables, is the posterior distribution, and
(7) | ||||
(8) |
Since
(Kullback–Leibler divergence) is always non-negative, the functional
is a lower bound for the evidence . One of the nice properties of this decomposition is that the global maximum of lower bound coincides with the global minimum of and optimal is equal to the true posterior [4].Unfortunately (7) cannot be optimized exactly. To make optimization tractable, in variational inference framework one assumes that belongs to a family of suitable distributions. In this work we will assume that is a factorized distribution (mean field theory [30]):
(9) | ||||
(10) | ||||
(11) |
where is a deterministic (degenerate) distribution with parameter . Under this assumption lower bound functional becomes a function of parameters and . We denote this function where and .
The proposed algorithm is defined by Alg.2. It optimizes lower bound in block-coordinate fashion. The algorithm returns optimal tangents , see Fig.5(b), and optimal probabilities , see Fig.5(c).
Now we consider optimization of over . Taking into account (11), (2) and (7) we can derive
(12) |
where
In case of (4) we redefine , and in case of (5) we redefine .
We see that optimization of with respect to is a non-linear least square problem. For optimization details please refer to Section 4.
The optimization w.r.t. can be done by coordinate descent. The update equation is [4]
(13) |
How should we treat and in this equations?
The constant in expression (13) does not depend on and thus can be determined from the normalization equation We initialize on line 1 of Alg.2. We iterate over all pixels update step (13) on line 5 until convergence, which is guaranteed by convexity of with respect to each [4].
Note that if we further restrict to be a degenerate distribution (meaning ) we will get the block-coordinate descend Alg.1.
The initialization of is application dependent. In many cases some information about direction of a thin structure is available. Concrete initialization examples are described in Section 5.
The goal of Alg.1 is to find , which is equivalent to
(14) |
As shown in Section 2 optimization of 14 in a block-coordinate fashion requires optimization of tangents with fixed indicator variables . This necessitates extrapolation of tangents. Instead we propose to optimize taking into account all possible configurations of . That is we propose to replace maximum with smooth maximum:
Then we can write down a decomposition similar to (6), which provides a lower bound yielding the same optimization procedure.
The proposed procedure is closely related to the EM algorithm[8] where we treat tangents as the parameters of the distribution. However, in this case the normalization constant of the distribution depends on and optimization problem is intractable. One possible way to fix this issue is to use a pseudo likelihood [21].
Optimization of the auxiliary functions on line 4 of Algorithms 1 and 2 as well as energy (1) is a non-linear least square problem. In [26, 27] energy (1) is optimized using discrete multi-label approach in the context of surface approximation. In our work we adopt the inexact Levenberg-Marquardt method in [41], which is a trust region second order continuous iterative optimization method.
Each iteration consists of several steps. First, the method linearizes:
where for compact notation we define and . We use [1] for automatic calculation of derivatives.
Second, the algorithm solves the minimization problem
where is a positive damping factor, which determines the trust region. The method uses an inexact iterative algorithm for this task.
The last stage of iteration is to compare the predicted energy change with the actual energy change . Depending on the result of comparison the method updates variables and damping factor . For more details please refer to [41].
The most computationally expensive part of Alg.2 is trust region optimization described in this subsection. From the technical point of view it consists of derivatives computation and basic linear algebra operations. Fortunately, these operations could be easily parallelized on GPU. We leave the GPU implementation for a future work.
Previous sections assume squared curvature, but everything can be adapted to the absolute curvature too. We only need to discuss how to optimize (12) for the absolute curvature. We use the following approximation:
(15) |
where
and is some non-negative constant. If we have an approximation of the absolute curvature, if we have an approximation of the squared curvature.
The trust region approach (see Section 4) works with approximations of functions. It does not require any particular approximation like in the Levenberg-Marquardt method [19, 41]. Thus we can approximate the absolute curvature by treating as constants in (15) and linearizing analogously to the squared curvature case.
The approximation of curvature given by [26] is derived under the assumption that the angles between neighbor tangents are small. Under this assumption the sine of an angle is approximately equal to the angle. And the approximation essentially computes the sines of the angles rather than the angles themselves. As a result it significantly underestimates the curvature of sharp corners.
For example, let us consider the integral of absolute curvature over a circle and a square. The integral of the approximation is and correspondingly, while the integral of the true absolute curvature is in both cases. So the energy using this approximation of absolute curvature tends to distribute curvature into a small number of sharp corners showing strong bias to straight lines. Although approximation of squared curvature also underestimates curvature of sharp corners, it does not have a strong bias to straight lines. See figures 1 and 4 for comparison of the approximations.
(a) Original image | (b) The output of algorithm | (c) Probabilities | (d) Subpixel probabilities |
Here we consider an application of our method to edge detection and real-valued edge localization.
Sobel gradient operator [36] returns the gradient magnitude and direction for every image pixel. The high gradient magnitude is an evidence of a contrast edge. The direction of the gradient is a probable direction of the edge. We use the output of the gradient operator to define data fidelity terms of energy (4). For every pixel let be the gradient vector returned by the operator. We normalize vectors
by the sample variance of their magnitudes over the whole image. We define likelihood
using hand picked linear transformation of the gradient magnitude:
. These parameters were optimized on a single picture shown in Fig.5(a). The initial tangents (line 1 of Alg.2) are collinear with gradients and pass through pixels .(a) | (b) |
The results in figures 1, 5-9 was obtained by optimizing energy (4) using (3) as a soft constraint, with parameters , and 8-connected neighborhood system .
According to our model pixel is a noisy measurement of point on a contrast edge. Denoised point is the projection of onto . To generate an edge mask (possibly at higher resolution) we can quantize and use as values at quantized . If during this process we have a conflict such that several points are quantized into same pixel we choose the one with maximum probability. Fig.5(d) shows an edge mask whose resolution was doubled. Fig.7 shows examples of the edge mask at the original resolution.
We also compared our results with a few edge detection algorithms whose result is an edge mask, see Fig.9. This shows that our general method achieves F-measure of , which is very close to F-measure of , given by the best evaluated algorithm in [38]. Please note that [38] was designed specifically for edge detection in images, while our approach is a generic method for thin structure delineation.
Vessel center-line localization in 3D medical volumes is an important task for medical diagnostics and pre-clinical drug trials. The sentence looks like an orphan
For the experiments in this section we used a microscopic computer tomography [10, 15] scan of the mouse’s heart. The scan is a 3D volume of size 585x525x892. For the both experiments the volume was preprocessed with a popular vessel detection filter of [9]. For every voxel the filter returns vesselness measure such that higher values of indicate higher likelihood of vessel presence at . The filter also estimates direction and scale of a vessel.
For this application we use extension (5) of energy (2). Coefficient in front of the soft constraint in the energy determines how far tangents can move from voxels . Since this data has high variability in vessel thickness, we cannot use the same for every voxel. We substitute produced by the vesselness filter for in energy (5):
where is a positive constant and is obtained from vesselness measure by the same linear transformation that we use in Section 5.1. We set and and use 26-connected neighborhood system .
For the first experiment we cropped the volume forming a subvolume of size 81x187x173. We also removed 85% of voxels with the lowest values of . That yields about variables to be optimized. Fig.10 shows the result.
The goal of the second experiment is to extract a few trees describing the cardiovascular system of the whole heart. To decrease the running time we perform Canny’s [6] hysteresis thresholding to detect one-dimensional ridges in the volume. We substitute vesselness measure for intensity gradients in Canny’s procedure. Then we set for voxels detected as ridges and for other voxels. This yields approximately the same number of optimization variables. Then we optimize tangents by the algorithm described in Section 4. Then the estimated center-line points are grouped based on the tangent and proximity information into a graph and a minimum spanning tree algorithm extracts the trees. The result is shown in Fig.11.
We present a novel general early-vision framework for simultaneous detection and delineation of thin structures with sub-pixel localization and real-valued orientation estimation. The proposed energy combines likelihoods, indicator (detection) variables and squared or absolute curvature regularization. We present an algorithm that optimizes localization and orientation variables considering all possible configuration of indicator variables. We discuss the properties of the proposed energy and demonstrate a wide applicability of the framework on 2D and 3D examples.
In the future, we plan to explore better curvature approximations, extend our framework to image segmentation with curvature regularization, and improve the running time by developing parallel GPU implementation.
We thank Olga Veksler (University of Western Ontario) for insightful discussions.
Curvature regularity for region-based image segmentation and inpainting: A linear programming relaxation.
In ICCV, Kyoto, 2009.