Image analysis measurements are generally performed on binary representations of the objects. However, when images are acquired, grey levels have specific meanings. Binarisation of such images results in a loss of information and neither the internal intensities nor the borders of the resulting regions represent the imaged objects very well. This can be due to limited resolution, high noise levels, or that the border is a compound of objects. Because of this, measurements are increasingly done directly on grey-level images mg:SladojePAMI2009 . Fuzzy theory mg:fuzzy:ZadehIC1965 , where an image element has a membership value describing its belongingness to a certain (fuzzy) object, has emerged as a framework for addressing these problems mg:CarvalhoPAA1999 ; mg:UdupaGMIP1996 .
Distance calculations are widely used to extract shape and size information mg:BorgeforsAVFP1994 ; mg:JonesTVCG2006 . This is an area where measurements in grey-level images have become increasingly popular mg:BlochPR2000 ; mg:GRAYMAT:LeviIC1970 ; mg:PreteuxSPIE1991 ; mg:GWD:RutovitzPPR1968 ; mg:FDT:SahaCVIU2002 ; mg:SoillePRL1994 ; mg:ToivanenPRL1996 . The applications for content-based distance measures are many, e.g., grey-level morphology and minimal path detection mg:SoillePRL1994 , segmentation mg:MeyerSP1994 , clustering mg:GeddaIWCIA2006 , and solving the Eikonal equation mg:KimmelJMIV1996 ; mg:Sethian1996 ; mg:VerbeekPRL1990 . With the expanding size of datasets and demand for real time response for both automatic and interactive applications, improving memory efficiency and runtime for algorithms with excessive computational requirements has become a focus of greater importance mg:CotoCG2007 ; mg:MalmbergDGCI2006 ; mg:NyulGM2003 . Due to hardware limitations, the first efficient methods for computing distance transforms were based on the classic raster scan approach mg:SeqOp:RosenfeldJACM1966 . This approach works well for distance calculations on binary images, where a complete distance transform only needs two passes through the image. But for grey-level images, where the domain is generally not convex, the number of passes through the image becomes dependent on content. To improve runtime, propagation using graph-search techniques have become popular mg:FalcaoTMI2000 ; mg:IkonenDGCI2005 ; mg:VerbeekPRL1990 . Most of the methods are versions of the well-known, theoretically optimal Dijkstra’s algorithm mg:DijkstraNM1959 . The wealth of data structures available for these algorithms makes analysing the computational complexity of all different combinations nontrivial. Even if it was trivial, implementations of lowest complexity might still not be the fastest due to practical implications. For example, the work by Luengo mg:LuengoIJCV2009 shows the impact of current computer hardware on different priority queues. Also, special situations that often arise in image analysis problems, such as spatial homogeneity in images or overhead of complex structures when working on small problem domains, can also be a factor to why implementations with higher complexity might perform better than ones with low complexity.
Kimmel et al. mg:KimmelJMIV1996 calculated the grey-weighted distance by solving the Eikonal equation using the Fast Marching method (FMM) mg:Sethian1996 , which is an efficient numerical scheme for solving the continuous boundary value problems. Here we focus on discrete distance definitions, covered in Section 2, and compare algorithms aimed to find the shortest path in a network with prescribed weights for each link between nodes. Numerical methods for approximating the solutions of a continuous problems are out of scope of this paper.
We put different implementations of the most popular grey-weighted distance transform algorithms, which we cover in Section 3, in a comparative test, under settings representative of common situations in image analysis, in Section 4. The work by Nyul et al. mg:NyulGM2003 presents a similar study on algorithms for fuzzy-connected image segmentation. However, it is important to point out that the results do not apply to grey-weighted distance transforms due to the different properties of fuzzy connectedness. Since our work relates to the same subject we have chosen to use similar terminology. We incorporate all algorithms and data structures used by Nyul et al. mg:NyulGM2003 and also include the data structures introduced by Yatziv et al. mg:YatzivJCP2006 and Luengo mg:LuengoIJCV2009 . The conclusions in Section 5 should be seen as practical guidelines for selecting grey-weighted distance algorithms in different scenarios. To keep the adherence of the guidelines from being overly complex and off-putting, we have chosen competitive algorithms of low programming complexity and use containers from the C++ Standard Template Library (STL) mg:Josuttis1999 where applicable. We focus on local sequential algorithms, all parallel algorithms are out of scope for this article.
2 Discrete grey-weighted distances
The geodesic distance between two points included in a set is the length of the shortest paths or geodesics mg:SerraIAMM1988 linking these points and included in the set. The set is referred to as a geodesic mask, and when calculating grey-weighted distance the grey-scale geodesic mask is usually the same as the input image. In this paper we define a discrete grey-scale image as an application of a subset of the -dimensional discrete space into the set of non-negative real numbers. The neighbourhood relations between the points in a discrete image are defined by a graph. We use an 8-connected graph for 2D square grids, and a 26-connected graph for 3D cubic grids. We define a discrete path of length going from node to node as a -tuple of nodes such that , , and defines adjacent nodes for all . The grey-weighted distance represents the sum of all arc weights along . This assumes that the arc weight represents the cost of travelling from a node to node . The grey-weighted distance then consists of finding the path with the lowest sum of arc weights along all possible paths linking to . If the set consists of all possible paths from to , we have
where the arc weight , also referred to as cost or local cost, is calculated by a cost function on the geodesic mask . Although the definition is general for dimensions we refer to image elements as pixels (or nodes when utilising the graph analogy) unless we operate on 3D images, where we refer to them as voxels.
Rutovitz first proposed a grey-weighted distance where the arc weight is equal to the grey level of the destination pixel of each step along the path mg:GWD:RutovitzPPR1968 . Levi and Montanari extended this definition when they defined a grey-weighted medial axis transform (GRAYMAT) by weighting the grey levels with the distance between adjacent pixels along the path mg:GRAYMAT:LeviIC1970 . In their definition, the length of a path is defined as the discretisation of the integral of the pixel values along the path, and the arc weight is defined as
where refers to the spatial distance between two adjacent nodes in the image graph. Saha et al. proposed a theoretical framework for the GRAYMAT definition when applied to fuzzy sets mg:FDT:SahaCVIU2002 . Soille also defined a geodesic measure for fuzzy sets inspired by Levi and Montanari’s definition mg:SoillePRL1994 . For more distance definitions on fuzzy sets we refer to mg:BlochPR1999 .
Toivanen proposed two definitions for arc weights where the path between two points is defined as a path lying on the hyperplane defined by the grey levelsmg:ToivanenPRL1996 . The first is the distance on curved space (DOCS),
and the second is the weighted distance on curved space (WDOCS),
While GRAYMAT propagates fast for low grey levels, DOCS and WDOCS account for the changes in height of the ’height map’ and represent the minimal amount of ascents and descents to be travelled to reach a neighbouring pixel. DOCS performs the distance calculation with integer numbers while each sub-distance along the path for WDOCS is euclidean.
Figure 1 shows the different grey-weighted distance function behaviours. The top row shows the grey-weighted distance transforms when using a gradient image as geodesic mask. The GRAYMAT transform progress rapidly across the area of low grey levels in the top left corner while DOCS and WDOCS have their fastest progression when travelling normal to the gradient direction. The bottom row shows the transforms on a sinusoidal image with saturated intensities. Once again it is clear that GRAYMAT moves fast in areas of low grey level (the black rings) and slow in areas of high grey level (the white rings). DOCS and WDOCS, on the other hand, move fast in both areas with high and low grey level but move slower through the transitions between two uniform areas, where the difference in grey levels results in increased costs. A more detailed analysis of the behaviour of the different transforms is presented in mg:fouardDGCI2006 .
When referring to a grey-weighted distance computation it can be in one of four settings: (i) grey-weighted transform (or transform for short), (ii) seeded grey-weighted transform (or seeded transform for short), (iii) grey-weighted dilation (or dilation for short), or (iv) route. In the seeded (or marker-based) distance transform, each pixel is the grey-weighted distance of the lowest cost path from a set of predefined pixels generally referred to as seeds, markers or features. The other three settings, (i), (iii), and (iv) are all special cases of the seeded transform. In the distance transform, each pixel is the grey-weighted distance from the background. Generally the background is defined as , i.e., all pixels with grey level zero in the geodesic mask, which is the same as a seeded grey-weighted transform where . Dilation is a seeded transform where the seeds represent the region to be dilated and the calculation is stopped when a predefined grey-weighted distance is reached. The result is analogous to a morphological grey-level dilation of where the structuring element is defined by the cost function. The route is a single source shortest path problem calculating the grey-weighted distance from a single pixel to a single pixel . This is done by seeded transform from the single seed point and terminating the transform once is reached.
The seeded transform offers better options for experimental setups than the regular transform by facilitating multiple runs on the same image using different seeds. However, the evaluation can be used as a guideline for choosing an algorithm for both the seeded and unseeded transform since the algorithms are the same. The runtime of a dilation or a route is naturally lower than that of calculating the transform of an entire image. However, the aim of this work is not to illustrate the low computational costs of various methods but to compare various implementations of grey-level–based distance computations, which is shown more clearly for complete (or near complete) image transforms than for dilations and routes.
The first method to calculate grey-weighted distance transforms was to use the chamfer scan approach, e.g., see mg:GRAYMAT:LeviIC1970 . The algorithm uses a window containing a weight mask (chamfer mask), and is slided across the image, updating the central pixel at each position. The scan consists of a forward pass and a backward pass. Figure 2 shows the masks used for the forward and backward passes for both 2D and 3D images. The chamfer weights are typically , , and . In contrast with distances from binary images, the domain is usually not convex. Therefore, the chamfer algorithm for grey-weighted distance is an iterative process and has to be repeated until no updates are made in the distance map.
The chamfer algorithm can be considered an algorithm of the label-correcting
kind. A grey-weighted distance label is assigned to a pixel at each step; the grey-weighted distance labels are estimates (i.e., a upper bounds on) the grey-weighted distance of the lowest cost path from the source to the individual pixels. What characterises a label-correcting algorithm is that all labels are considered temporary until the final step, when they all become permanent.
A different approach from iterative raster scan in the chamfer algorithm is the graph search approach. Two simple graph search approaches are the depth-first search (DFS) mg:Algorithms:Cormen1990 (referred to as recursive propagation) and the breadth-first search (BFS) mg:Algorithms:Cormen1990 (referred to as ordered propagation), used by Silvela et al. for distance transform computations mg:SilvelaITIP2001 . The algorithm for both approaches is listed in Algorithm 1. They are both label-correcting algorithms and the difference between them comes from how pixels are added and removed from the list . For recursive propagation, is a last-in-first-out (LIFO) list, i.e., a list where the last pixel added is the first to be removed. For ordered propagation, is a first-in-first-out (FIFO) list, i.e., a list where the pixel added first is the first to be removed.
The opposite of label-correcting algorithms are algorithms of the label-setting kind. A label-setting algorithm assigns one label as permanent (optimal) at each iteration. The algorithms of this group are basically various implementations of Dijkstra’s well known algorithm mg:DijkstraNM1959 , first proposed for grey-weighted distance transforms as the uniform cost algorithm mg:UCADT:VerwerPAMI1989 . In graph search terminology it is referred to as best-first search since the best alternative is considered at every iteration. The label-setting algorithms are much more efficient than label-correcting algorithms, but they are applicable only to special situations like region growing scenarios. Recently much work on grey-weighted distance computations has been done using various label-setting implementations mg:FalcaoTMI2000 ; mg:IkonenDGCI2005 . Even though all of them use the theoretically optimal Dijkstra’s algorithm, they differ in what data structures they use. The label-setting algorithm used is listed in Algorithm 2.
3.1 Local cost computation
The structure of the distance map will depend on the image connectivity and spatial distance between adjacent nodes. Common choices for spatial distance in a local neighbourhood are city block, chessboard, 3-4-5, or one of the optimal chamfer weights designed to approximate the Euclidean distance over large distances mg:Chamfer:BorgeforsCVGIP1986 ; mg:BorgeforsCVIU1996 ; mg:VerwerPRL1991 . However, the relative efficiency of the various algorithms is unlikely to vary with different choices. We choose to only use the common 3-4-5 chamfer weights, since one of the algorithms is only applicable to integer costs, and, as previously mentioned, 8-connectivity for two-dimensional images and 26-connectivity for three-dimensional images.
This section presents the details of the different strategies and data structures used for the test cases in Section 4. The methods are labelled using capital letters with subscripts representing the properties of the method/data structure (see Table 1 for a summary). These labels will be used throughout Section 4.
3.2.1 Label-correcting algorithms
The implementation of the chamfer algorithm does not give much room for variation. The chamfer method iterates over the image through raster scans using the chamfer mask shown in Figure 2, and is given the label C.
For recursive and ordered propagation we use variations of Algorithm 1. Step 8 puts all neighbours of on the list if is updated. This will result in lots of duplicates on the list and unnecessary pop operations. To improve the speed of the algorithm, a pointer array can be used to keep track of whether a pixel is already on the list. If a pixel is already on the list, it need not be duplicated. The propagation algorithms are given the label P with the subscript L for LIFO list (recursive propagation) and F for FIFO list (ordered propagation). The subscript A is used if a pointer array is used to keep track of which pixels are on the list.
3.2.2 Label-setting algorithms using -heap
Roughness in images was computed by Ikonen et al. mg:IkonenDGCI2005 using the DOCS transform with a binary heap mg:Algorithms:Cormen1990 (-ary heap with ). The algorithm used is the same as Algorithm 2 but without Step 10, i.e., no check whether the neighbour is already on the queue . This results in duplicates on the queue, which leads to unnecessary pop operations. Here we represent the priority queue in Algorithm 2 by a -heap, both without any helper structures, as Ikonen in mg:IkonenDGCI2005 , and with some helper structures to keep track of the pixels on the queue. The key of a pixel in is the grey-weighted distance at the time it is inserted into . Since low grey-weighted distance values have priority over high values, all priority queues used in this work are minimum priority queues, i.e., the root stores the element with the smallest key. Step 4 is the extract-min operation, which finds the smallest key and removes it from the heap, and the update in Step 11 is the decrease-key operation, which increases the node priority.
We use for all -heaps in the experiments. See Section 4.3 for a more detailed discussion on selecting . In the first implementation using -heap, we always insert a new instance of in Step 11 and 13, like in mg:IkonenDGCI2005 , even if it means duplication. This algorithm is labelled H (for heap). In another implementation (labelled H) we use a pointer array, which for every pixel stores the position of in the heap or NULL indicating that is not on the heap. The pointer array is used in Step 10 to check if a pixel is on the queue, and in Step 11 to update the priority (decrease-key). The final group using -heaps implement hash tables instead of a pointer array to use less memory. They use hash tables with various hash functions and various table sizes to keep track of the positions of the pixels in the heap. However, keeping a hash table requires additional computation, and hash tables work differently depending on the distribution of the key values assigned to the input data (i.e., depending on the grey-weighted distance values and on the geometric structures of objects in the image).
We use the same geometric hash functions used by Nyul et al. for 3D images mg:NyulGM2003 . They assign a key value to a heap pixel using the following equations:
where , , are the coordinates of the voxel, height and width are the dimensions of a slice, is the size of the hash table, and is the bit-wise exclusive or operation. The corresponding hash functions for 2D are acquired by simply removing all instances of and ’height’ from the above equations.
) we use a hash table size of 512 in the 2D case and 8191 in the 3D case. The range of possible hash values is quite large and these sizes are reasonably small and result in a fairly uniform distribution of hash keys in our application. For Equation (5) we use a hash table size of 512 for the 2D case and 768 for the 3D case. For Equation (7) we use a size of 256. This gives us a separate hash bin for each combination of 8-bit coordinates. We use labels H, H, H, and H, respectively, for the algorithms corresponding to Equations (4)-(7).
3.2.3 Label-setting algorithms using Fibonacci heap
The Fibonacci heap mg:Algorithms:Cormen1990 is a more complicated structure than the -heap. While the -heap supports in worst-case time the operations insert, extract-min, decrease-key and delete, the Fibonacci heap supports the same operations but have the advantage that operations that do not involve deleting an element run in amortised time mg:Algorithms:Cormen1990 . This means that insert (Step 13) and decrease-key (Step 11) on average run in constant time. In theory the Fibonacci heap is especially desirable when the number of extract-min and delete is small relative to the number of other operations performed. However, the constant factors and programming complexity of Fibonacci heaps make them less desirable than ordinary -heaps for most applications mg:Algorithms:Cormen1990 . Despite the high programming complexity of the Fibonacci heap we have chosen to incorporate it for comparison purposes.
Like for -heaps, the first version (labelled F for Fibonacci) does not keep track on whether a pixel is already on the heap, and we do not perform a search in the heap for a pixel already stored. This means that we always insert a new instance of the pixel, even if it means duplication. In another version (labelled F) we use a pointer array to keep track of pixels on the heap. The final version (labelled F) use the hash table that performed best among Equations (4)-(7) in Section 4.5 (Equation (5)), with the same table size as was used for the -heap.
3.2.4 Label-setting algorithms using circular bucket queues
Falcao et al. used an efficient implementation of Dijkstra’s shortest path algorithm for grey-weighted distance mg:FalcaoTMI2000 . It was the circular bucket queue data structure for integer costs introduced by Dial mg:DialCACM1969 . The priority queue is represented by a circular array where every position (bucket) in the array holds a doubly linked list of all nodes with equal path cost. This is illustrated in Figure 3 (a), where the priority queue is represented as an array of buckets containing the nodes in , with being the maximal possible arc weight. Each bucket stores a list of all nodes whose path cost is equal to . The drawback with Dial’s queue is that it only works with integer values, i.e., in our case we can only run it with the DOCS cost function. We test Dial’s bucket queue with both LIFO and FIFO lists, labelled D and D respectively. The LIFO version is also implemented using a pointer array to keep track of whether a pixel is on the queue or not. The pointer array implementation is labelled D.
Yatziv et al. have proposed the Untidy queue mg:YatzivJCP2006 . It is a circular bucket queue like Dial’s, but the number of buckets can be chosen freely and each bucket stores all nodes whose path cost fall within a certain interval. If the number of buckets (referred to as bucket size) is , with being the highest bucket index, then the bucket number is determined by
where is the path cost of node and is the maximal possible arc weight. It follows that each bucket will hold all nodes whose path cost is on the interval
This means that the Untidy priority queue works on floating point values as well as integer values. However, since the queue can put nodes with different costs in the same bucket, the Untidy queue will not necessarily return the node with the lowest path cost from a find-min operation. This introduces a rounding error which can be bound by choosing a large bucket size. See mg:RaschJNA2009 for a more detailed analysis of the rounding error. Here we choose the bucket size 261 for DOCS and 2551 for GRAYMAT and WDOCS to ensure a low error. See Section 4.4 for a more detailed discussion on selecting bucket sizes. The labels for the Untidy queue implementations are U, U, and U, where the subscripts refer to the same properties as for the Dial queue implementations.
The final circular bucket queue implementation used is the hierarchical heap proposed by Luengo mg:LuengoIJCV2009 . The circular array functions the same way as for the Untidy queue, i.e., each bucket holds pixels with costs on a given interval. However, instead of representing each bucket with a list the hierarchical heap uses a -heap at each bucket. This introduces the extra runtime of the heap compared to a list, but it functions properly with real valued costs, i.e., it will always return the node with the lowest path cost from a find-min operation. The same bucket sizes were used as for the Untidy queue. The hierarchical heap was implemented in two versions. One without a pointer array to keep track of which pixels are on the queue (labelled HH) and one with a pointer array (labelled HH).
Table 1 summarises the various methods and their associated labels.
|Label||Method||Data structure||P. array||Hashing|
|H||Best-first||d-ary heap||Eq. (4)|
|H||Best-first||d-ary heap||Eq. (5)|
|H||Best-first||d-ary heap||Eq. (6)|
|H||Best-first||d-ary heap||Eq. (7)|
|F||Best-first||Fibonacci heap||Eq. (5)|
3.2.5 Technical details
For comparison and memory-conserving purposes, all data structures use dynamic containers. Each time a node is pushed on a queue the node is created on-the-fly and put on the queue represented by a dynamically allocated data structure. We have used containers from the C++ Standard Template Library (STL) mg:Josuttis1999
where applicable to ensure a low programming complexity. This will make the algorithms easier to implement and, thus, encourage image analysts to adopt to the guidelines provided. The depth-first and breadth-first lists use the STL List container. The d-ary heap and hierarchical heap use the STL Vector container (the STL heap does not support the update in Step 11 of Algorithm2). Figure 3 (b) illustrates how an array is used to store a heap. Dial’s bucket queue and the Untidy bucket queue are both implemented using an STL List container for each bucket. When a pointer array is used to keep track of nodes on the queue, the queue needs to allow for random access to facilitate updates (Step 11 of Algorithm 2). Standard list containers do not support random access, which means that updating a node on the Dial or Untidy queue is associated with a search through the list occupied by the node. This search will most certainly result in higher runtime compared to using a customised list implementation allowing random access.
The execution time can be cut by implementing custom containers using static arrays or intelligent reallocation based on application-specific memory usage. However, such custom implementations are memory consuming and have high programming complexity, and algorithms based on such implementations are not easy to adopt. For example, implementing the Dial queue using a static array, which was done by Falcao et al. mg:FalcaoTMI2000 , is not only nontrivial compared to using STL lists, but also memory consuming. Since the static array needs to be the same size as the image and contain two pointers per element, the static array will use at least twice the memory of the distance map. See mg:CotoCG2007 for a more detailed discussion on the memory issue for static arrays in region growing algorithms. Because of both the high programming complexity and the large memory requirements, static array implementations are out of scope for this article. However, we include runtimes for static array implementations of the Dial queue and the Untidy queue in Section 4.6.3 to give an indication of the tradeoff between dynamically allocated priority queues and priority queues using static arrays.
4 Tests and results
In this section we do comparative tests of the algorithms described in Section 3 on both 2D and 3D data with varying image properties. All tests were carried out on an Intel Xeon CPU 3.60GHz 64-bit Dual core computer with 4 GB RAM running Red Hat Enterprise Linux 5 update 2. The algorithms were compiled with gcc v3.4 using the -O3 optimisation flag and no advantage was taken of the multiple core architecture of the CPU.
The algorithms were evaluated on images of various complexity to mimic the conditions common in image analysis problems. In the 2D case, the performance of the algorithms was compared using the 8-bit grey-level images seen in Figure 4. The image in Figure 4 (a) represents noisy images with high complexity and low spatial correlation between pixels; Figure 4 (b) is the image pout.tif from MATLAB™(The MathWorks, Natick, MA, USA) and represents images of varying complexity with some spatial correlation; Figure 4 (c) represents low complexity images with large uniform areas and, thus, high spatial correlation. Each image in Figure 4 was 9601164 pixels and are referred to as NOISE, POUT, and BALL, respectively. A test point grid was used for seeded grey-weighted distance transforms, see the black dots shown in Figure 4 (c). The test point grid contain 49 points spread evenly, but not symmetrically, over the entire image area. In the 3D case, the performance of the algorithms was compared on an 8-bit gradient magnitude image of the 256256100 computed tomography (CT) image covering the liver region of an abdomen, shown in Figure 5 (a), and a 512512154 contrast enhanced magnetic resonance angiography (CE-MRA) image of an abdomen, shown in Figure 5 (b).
The CT image was used by Vidholm et al. mg:VidholmMICCAI2006 to segment livers semi-automatically by seeded FMM. The CT images are abdominal contrast enhanced venous phase CT images of a patient with either carcinoid or endocrine pancreas tumour. The images were acquired with a Siemens Sensation 16 CT scanner. The CE-MRA image was used by Vidholm et al. mg:VidholmISBI2004 for semi-automatic segmentation with haptic guided seeding. The image was acquired from a 1.5T Gyroscan Intera (Philips Medical Systems) using the standard body coil and a specially built table top extender. The sequence was a 3DRF-spoiled gradient echo with TR/TE/flip angle=. The dataset consisted of four subvolumes: the head and upper thorax, the lower thorax and abdomen, the pelvis and upper legs, and the lower legs.
4.2 Precomputed arc weights
When calculating multiple transforms on the same image and the local cost function is complex, it might be favourable to precompute all arc weights to decrease the computational cost of each step. In these cases, precomputing arc weights can generally increase the efficiency at the cost of memory. If the arc weights are bi-directional, the look-up table of a 26-connected image will require roughly 13 times as much memory as the image. However, for grey-weighted distances the cost function is not very complex. Even if there is sufficient memory, the overhead from accessing the look-up table might be of the same magnitude as evaluating the cost function, in part because of cache misses. Table 2 shows a sample of runtimes for two images when using both precomputed weights and when computing the weights on-the-fly.
|Image||Alg.||Definition||Pre. ()||Transf. ()||Total ()|
In Table 2 we see that the total runtime is higher for all cases where precomputed arc weights are used. For the three-dimensional case, the transform itself takes even longer to compute when using precomputed arc weights. There is apparently an overhead associated with using a (large) lookup table for the arc weight which is greater than computing the arc weight on-the-fly. Since the lookup table is memory consuming and the numbers in Table 2 indicate that there is no gain in computation speed, we do not recommend precomputing arc weights and all experiments in Section 4 calculate the arc weights on-the-fly.
4.3 Different -heaps
Table 3 lists the runtimes for -heaps with different . It is apparent that the binary heap () performs best. It should be noted that the implementation differs slightly for the two cases and . For the binary case (), each element a[i] has children a[2i], a[2i+1], and parent a[floor(i/2)] (the root is a and tree elements are a a[n]). See Figure 3 (b) for an illustration. For general -heaps the child nodes are a[d*(i-1)+2] through a[d*(i)+1], and the parent a[floor((i-2)/d)+1]. Because the -heap gets shallower with increasing the insert operation will be faster but the delete-min operation will be more expensive due to the need of comparisons to find the smallest child node. It appears that for grey-weighted distance transforms, the slower delete-min operations, in combination with the extra multiplications and divisions by for the parent/child index calculation, makes the binary heap the preferred choice over any .
4.4 Bucket sizes
Bucket sizes should be chosen so that a good tradeoff is reached between the number of buckets and the number of pixels in each bucket. Dial’s queue defaults to bucket size as mentioned in Section 3.2.4. This assigns one bucket to each possible path cost. For the Untidy queue and the hierarchical heap, which both can handle floating point path costs, the right amount of buckets comes from a combination of grey-weighted distance definition, chamfer weight and actual runtime. To ensure a unique bucket for each possible path cost, we have from Equation (9) that the bucket size must be chosen as,
where is the maximum possible arc weight and is the minimum possible difference in path cost. Table 4 lists the required number of buckets for each grey-weighted distance definition when using 8-bit images and 3-4-5 chamfer weights.
|Min path cost diff||0.5||1||0.0137|
|Buckets (unique cost)||2551||261||18000|
4.4.1 Untidy queue
The runtime for different bucket sizes is shown in Figure 6. The DOCS column shows the runtimes for bucket sizes , the GRAYMAT column for bucket sizes , and the WDOCS column for bucket sizes . Each value is the mean runtime over 5 transforms from the same seed.
For GRAYMAT and DOCS we can choose the bucket size required for unique costs (see Table 4) and remain confident that the runtime will be kept low. For WDOCS, the required bucket size is more than 18000. Figure 6 show that WDOCS has a slight increase in runtime for increasing bucket size. A good choice for bucket size should, thus, be lower than 18000 when considering the runtime. If accuracy of the grey-weighted distance map is not a vital issue, then selecting the number of buckets required for WDOCS can be simplified by quantising the arc weights produced by Equation (3
). Here we limit our analysis to non-quantised arc weights. Using fewer buckets than required for unique costs might introduce rounding errors. However, to result in a rounding error two spatially close nodes with different path costs have to end up in the same bucket and get extracted in the wrong order. The probability of this happening is likely to be quite low and depends on both the grey-weighted distance definition and image properties. Figure7 shows plots of the percent of erroneous values we get in the distance map for different bucket sizes. The plots only show one transform, but it gives an indication that most errors drop off quite fast. To get a more reliable measure on how few buckets can be used without getting an erroneous distance map, we used 5 different seeds from the seed map and recorded the largest bucket size which resulted in an error for each seed. Table 5
lists the mean and standard deviation over the five seeds for each setting. For DOCS we used bucket sizes, and for GRAYMAT and WDOCS we used . The table indicates that bucket sizes much smaller than can be chosen for WDOCS without introducing errors in the distance transform. Since Figure 6 shows that the choice bucket size for WDOCS is robust with respect to runtime, we use the same number of buckets as was chosen for GRAYMAT (2551) for simplicity.
4.4.2 Hierarchical heap
Unlike the Untidy queue, the hierarchical heap always returns the node with the lowest path cost from a find-min operation, independent of bucket size. Therefore, choosing the bucket size for the hierarchical queue can be based on runtime alone. Using less buckets than required for unique path costs will lead to larger heaps since nodes with different costs might have to share buckets. Using more buckets will introduce excessive buckets which cannot get any nodes assigned to them. This will only make the circular array longer and take more time to sift through. The runtime for different bucket sizes is shown in Figure 8.
The DOCS definition shows a decline in runtime which levels out around 250, which makes 261, the bucket size required for unique path costs, the most reasonable choice of bucket size for the hierarchical heap when using DOCS. The GRAYMAT definition behaves similarly to the DOCS definition for NOISE and POUT, decreasing in runtime and levelling out when getting close to the bucket size required for unique path costs. For BALL the runtime seems to increase slightly with bucket size. However, the difference in runtime for small versus large bucket sizes is so low that the behaviour for NOISE and POUT is more important. Therefore, the bucket size required for unique path costs (2551) is a good choice for the hierarchical heap when using GRAYMAT. Since Figure 8 indicates that the choice of bucket size for WDOCS is robust with respect to runtime, except for NOISE where it increases slightly for high bucket sizes, we use the same number of buckets as was chosen for GRAYMAT (2551) for simplicity.
4.4.3 Cost spread
Another interesting aspect of the bucket queues is the spread of the queue over time. The spread shows how many nodes occupy each bucket at a specific time during the calculation of the grey-weighted distance transform. Figure 9 shows the spread for a transform, using bucket size , for each image type and each grey-weighted distance definition. The queue was sampled at every 10’000th iteration, and each row represents the state of the queue at the time (from top to bottom), with every pixel being one bucket. The values ranges from dark blue (empty bucket) to red (maximum number of nodes). The max, min, mean, and std percentages are statistics on the number of non-empty buckets. The figure also shows the mean spread as a bar diagram for each image.
It is clear that the noisier image result in a wider range of arc weights, giving rise to a wider spread, while both POUT and BALL give a low spread for DOCS and WDOCS. This means that for DOCS and WDOCS, a large increase in bucket size will only have a small effect on the number of buckets used, which can be one of the explanations why the runtime does not vary much between different bucket sizes.
The thin spread for DOCS and WDOCS can be utilised by removing the unused part of the circular queue to get an increased spread. A wider spread will likely lead to shallower heaps for small bucket sizes in the hierarchical heap, and decreased rounding errors for small bucket sizes in the Untidy queue, which in turn can lead to shorter runtimes. Note however that a more thorough statistical analysis of the spread needs to be done to draw any conclusions on how much the circular array can be shortened for the different cases. Such an analysis is out of scope for this article.
4.5 Tests on 2D images
Each of the 49 points in the test point grid in Figure 4 (c) was used as a seed in a seeded grey-weighted distance transform of each of the 2D images. Table 6 shows the average runtimes in seconds. When running the experiments, it became apparent that the recursive propagation algorithm (labels P and P) expands pixels in an order which is extremely ineffective for grey-weighted distance computations. The pixel removed in Step 4 in Algorithm 1 is always the pixel most recently added to the list . This has the effect that the algorithm starts by expanding pixels along a path from the seed throughout the image, calculating numerous grey-weighted values based on the one seed neighbour at the beginning of the path. Since grey-weighted distances is monotonically increasing from the seed and depend on the local neighbourhood, expanding its neighbours (which is done at some point when the recursive propagation passes by on its way back for P or has expanded all subsequent pixels in the image for P) will most likely make the initial path calculation redundant. For example, the recursive propagation algorithm P took 4031 seconds to compute one GRAYMAT on a 240291 version of NOISE, visiting each pixel more than 37000 times on average. The chamfer algorithm, in comparison, took 8.16 seconds on average (standard deviation ) and visited each pixel an average of 42.6 times. Because of this, the recursive propagation algorithms were excluded from any further experiments and do not show up in the table.
In Table 6 we see that the overall best performers were the label-setting algorithm using the binary heap with pointer array (H), the Dial queue (D and D), the Untidy queue (U and U), and the hierarchical heap with pointer array (HH). It is apparent that the label-correcting algorithms are a poor choice due to their habit of visiting each node multiple times. The binary heap with pointer array was the fastest in two out of the nine cases; when calculating the WDOCS transform on the NOISE and POUT images. The hierarchical heap without pointer array was the fastest in two cases; when calculating the GRAYMAT and DOCS transform on the BALL image. The hierarchical heap with pointer array was the fastest in the rest of the cases and tied with the binary heap with pointer array when calculating the WDOCS transform in the NOISE image. The hash functions clearly have too much overhead and the algorithms with hash tables perform worse than they do without any helper structures.
It is apparent that the Fibonacci heap performs worse than the binary heap. The lower, amortised, time complexity of the Fibonacci heap does not beat the binary heap for grey-weighted distance computations. This indicates that the number of extract-min and delete operations is high relative to the number of other operations performed on the heap.
The differences in runtime between the Dial queue and the Untidy queue are small, which was anticipated considering their similar structure. The performance of the Dial queue is slightly better than the Untidy queue, most likely due to the extra computation required by the Untidy queue to determine which bucket to put a node (Equation (8)). We can conclude that a LIFO list in the Dial or Untidy queue results in a slightly better runtime than a FIFO list. This is likely to stem from the fact that requires a LIFO list requires fewer operations than a FIFO list.
4.6 Tests on 3D images
Here we used the CT image with size 256256100, shown in Figure 5 (a), to compare the fastest algorithms from Section 4.5. This was done by computing grey-weighted distance transforms from manually placed seeds in a gradient magnitude image. The gradient magnitude image can be considered a low complexity image with large uniform areas and has high spatial correlation between voxels, i.e., properties similar to the BALL image in Section 4.5. Two transforms were calculated, one from seeds placed outside the liver and one from seeds placed inside the liver, and the average runtime for each setup is shown in Table 7.
Here it becomes apparent that the Dial queue and Untidy queue do not work well with the pointer array since updating a value is associated with a search through a list, which can be quite large due to the limited spread discussed in Section 4.4.3. For any of the grey-weighted distance transforms on the gradient magnitude of a CT image, the fastest algorithm is the hierarchical heap with the pointer array, but only marginally so for the DOCS definition. It is also noteworthy that the binary heap with pointer array performs better than the Untidy queue for both the GRAYMAT and WDOCS definitions.
Here we used the ’lower thorax and abdomen’ subvolume with size 512512154, shown in Figure 5 (b), to compare the algorithms used in Section 4.6.1. This was done by computing one distance transform tracing a route through the iliac running down the left leg and terminating at the bottom of the image. The runtime for each setup is shown in Table 7.
The runtimes for the CE-MRA dataset show results similar to the CT dataset. The fastest algorithm is the hierarchical heap with pointer array, and the runner up is the binary heap with pointer array.
4.6.3 Dynamic vs static priority queues
In addition to the dynamic versions of the Dial and Untidy queue that use STL List containers, both priority queues were also implemented using static arrays to give an indication of the tradeoff between static and dynamic priority queues. As mentioned in Section 3.2.5, a static array implementation has a high programming complexity compared to using the STL List. It also requires at least twice as much memory as the distance map while a dynamic priority queue only requires enough space to accommodate the front wave (sometimes referred to as the ’narrow band’). Table 8 lists the runtimes for the binary heap with pointer array (H), Dial queue (D), Dial queue using static array (D), Untidy queue (U), Untidy queue using static array (U), and hierarchical heap with pointer array (HH). The static array implementations of the Dial and Untidy queue show an increase in performance by 5-33% compared to their dynamic implementations but only 0-11% compared to the fastest algorithms with dynamic priority queues. Note that when calculating the DOCS transform on the CE-MRA, the static array implementations ran slower than both the binary heap with pointer array and the hierarchical heap with pointer array. Considering the drawbacks with implementing static array priority queues, these runtimes indicate that dynamic priority queues can provide a fair tradeoff between performance and programming complexity/memory usage.
The performance variations do not motivate a different choice of algorithm for different grey-weighted distance definitions, i.e., if an algorithm performs well for one definition, it is likely to perform well for the other two. The same holds for the difference in image properties. For 2D images, the hierarchical heap using pointer array shows to be the best choice. The same choice also proves to be the best for 3D images ( 0-16% faster than the second best). However, if memory is a critical issue, i.e., if the pointer array helper structure is not an option, then the Dial queue is the best option if we work with integer costs, and the Untidy queue if we work with real valued costs. It is noteworthy that the popular binary heap with pointer array performs quite well compared to the more sophisticated priority queues ( 0-19% slower than the fastest algorithm).
The Department of Radiology, Uppsala University Hospital, is acknowledged for providing the CT and CE-MRA datasets. Stina Svensson, Robin Strand, Cris Luengo, and Filip Malmberg, Centre for Image Analysis, Uppsala, Sweden, are acknowledged for scientific support. Magnus Gedda is financially supported by the Swedish Research Council (project 621-2005-5540).
- (1) I. Bloch. Geodesic balls in a fuzzy set and fuzzy geodesic mathematical morphology. Pattern Recognition, 33(6):897–906, 2000.
- (2) I. Bloch. On fuzzy distances and their use in image processing under imprecision. Pattern Recognition, 32(11):1873–1895, 1999.
- (3) G. Borgefors. Applications using distance transforms. In Aspects of Visual Form Processing, pages 83–108, 1994.
- (4) G. Borgefors. Distance transformations in digital images. Computer Vision, Graphics, and Image Processing, 34:344–371, 1986.
- (5) G. Borgefors. On digital distance transforms in three dimensions. Computer Vision and Image Understanding, 64(3):368–376, 1996.
- (6) B. M. Carvalho, C. J. Gau, G. T. Herman, and T. Y. Kong. Algorithms for fuzzy segmentation. Pattern Analysis and Applications, 2:73–81, 1999.
- (7) T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, USA, 1990.
- (8) E. Coto, S. Grimm, and D. Williams. O-Buffer based IFT watershed from markers for large medical datasets. Computers & Graphics, 31(6):848–863, 2007.
- (9) R. B. Dial. Algorithm 360: shortest-path forest with topological ordering. Communications of the ACM, 12(11):632–633, November 1969.
- (10) E. W. Dijkstra. A note on two problems in connexion with graphs. Numerische Mathematik, 1:269–271, 1959.
- (11) A. X. Falcao, J. K. Udupa, and F. K. Miyazawa. An ultra-fast user-steered image segmentation paradigm: Live wire on the fly. IEEE Transactions on Medical Imaging, 19(1):55–62, January 2000.
- (12) C. Fouard and M. Gedda. An objective comparison between gray weighted distance transforms and weighted distance transforms on curved spaces. In Proceedings of the 13th International Conference on Discrete Geometry for Computer Imagery, volume 4245 of Lecture Notes in Computer Science, pages 259–270. Springer, October 2006.
M. Gedda and S. Svensson.
Fuzzy distance based hierarchical clustering calculated using the Aalgorithm. In Proceedings of the 11th International Workshop on Combinatorial Image Analysis, volume 4040 of Lecture Notes in Computer Science, pages 101–115. Springer, 2006.
- (14) L. Ikonen. Pixel queue algorithm for geodesic distance transforms. In Proceedings 12th International Conference on Discrete Geometry for Computer Imagery, pages 228–239, 2005.
- (15) M.W. Jones, J.A. Baerentzen, and M. Sramek. 3D distance fields: a survey of techniques and applications. IEEE Transactions on Visualization and Computer Graphics, 12(4):581–599, 2006.
- (16) N. Josuttis. The C++ Standard Library. Addison Wesley, 1999.
- (17) R. Kimmel, N. Kiryati, and A. M. Bruckstein. Sub-pixel distance maps and weighted distance transforms. Journal och Mathematical Imaging and Vision, 6:223–233, 1996.
- (18) G. Levi and U. Montanari. A grey-weighted skeleton. Information and Control, 17:62–91, 1970.
- (19) C. L. Luengo Hendriks. Revisiting priority queues for image analysis. Submitted for publication, January 2009.
- (20) F. Malmberg, E. Vidholm, and I. Nyström. A 3D live-wire segmentation method for volume images using haptic interaction. In Proceedings International Conference on Discrete Geometry for Computer Imagery, volume 4245, pages 663–673, 2006.
- (21) F. Meyer. Topographic distance and watershed lines. Signal Processing, 38:113–125, 1994.
- (22) L. Nyul, A. X. Falcão, and J. K. Udupa. Fuzzy-connected 3D image segmentation at interactive speeds. Graphical Models, 64:259–281, 2003.
- (23) F. Preteux and N. Merlet. New concepts in mathematical morphology: the topographical distance functions. In Proceedings SPIE, volume 1568, pages 66–77, 1991.
- (24) C. Rasch and T. Satzger. Remarks on the O(N) implementation of the fast marching method. IMA Journal of Numerical Analysis, 29(3):806–813, 2009.
- (25) A. Rosenfeld and J. L. Pfaltz. Sequential operations in digital picture processing. Journal of the Association for Computing Machinery, 13(4):471–494, October 1966.
- (26) D. Rutovitz. Data structures for operations on digital images. In Pictorial Pattern Recognition, pages 105–133, Washington, 1968. Thompson.
- (27) P. K. Saha, F. W. Wehrli, and B. R. Gomberg. Fuzzy distance transform: theory, algorithms, and applications. Computer Vision and Image Understanding, 86(3):171–190, 2002.
- (28) J. Serra. Mathematical morphology for boolean lattices. Image Analysis and Mathematical Morphology, 2:37–58, 1988.
- (29) J. A. Sethian. Level Set Methods and Fast Marching Methods. Cambridge University Press, 1996.
- (30) J. Silvela and J. Portillo. Breadth-first search and its application to image processing problems. Image Processing, IEEE Transactions on, 10(8):1194–1199, Aug 2001.
- (31) N. Sladoje and J. Lindblad. High-precision boundary length estimation by utilizing gray-level information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2):357–363, 2009.
- (32) P. Soille. Generalized geodesy via geodesic time. Pattern Recognition Letters, 15(12):1235–1240, 1994.
- (33) P. J. Toivanen. New geodesic distance transforms for grey-scale images. Pattern Recognition Letters, 17(5):437–450, 1996.
- (34) J. K. Udupa and S. Samarasekera. Fuzzy connectedness and and object definition: Theory, algorithms, and applications in image segmentation. Graphical Models and Image Processing, 58:246–261, 1996.
- (35) P. W. Verbeek and B. J. H. Verwer. Shading from shape, the eikonal equation solved by grey-weighted distance transform. Pattern Recognition Letters, 11:681–690, October 1990.
- (36) B. H. Verwer, P. Verbeek, and S. Dekker. An efficient uniform cost algorithm applied to distance transforms. IEEE Transactions of Pattern Analysis and Machine Intelligence, 11(4):425–429, April 1989.
- (37) B. H. Verwer. Local distances for distance transformations in two and three dimensions. Pattern Recognition Letters, 12:671–682, 1991.
- (38) E. Vidholm, S. Nilsson, and I. Nyström. Fast and robust semi-automatic liver segmentation with haptic interaction. In Proceedings of MICCAI 2006 Lecture Notes in Computer Science, volume 4191, pages 774–781, 2006.
- (39) E. Vidholm, X. Tizon, I. Nyström, and E. Bengtsson. Haptic guided seeding of mra images for semi-automatic segmentation. In Proceedings 2nd IEEE International Symposium on Biomedical Imaging, pages 288–291, 2004.
- (40) L. Yatziv, A. Bartesaghi, and S. Guillermo. O(n) implementation of the fast marching algorithm. Journal of Computational Physics, 212:393–399, 2006.
- (41) L. A. Zadeh. Fuzzy sets. Information and Control, 8:338–353, 1965.