I. Introduction
Angiography offers insights into blood flow and the condition of the vascular network. Three-dimensional volumetric angiography data can be obtained using magnetic resonance angiography (MRA), ultrasound, or X-ray-based technologies such as computed tomography (CT). A common first step in analyzing these data is vessel segmentation. Still, moving from raw angiography images to a vessel segmentation alone might not provide enough information for clinical use; other vessel features such as centerlines, diameters, or bifurcations are also needed to accurately extract information about the vascular network, for example, to characterize its structural properties or flow patterns. In this work, we present a deep learning approach, called DeepVesselNet, that performs vessel segmentation, centerline prediction, and bifurcation detection. DeepVesselNet deals with challenges that result from speed and memory requirements, unbalanced class labels, and the difficulty of obtaining well-annotated data for curvilinear volumetric structures by addressing the following three key limitations.
Fast crosshair filters
Processing 3D medical volumes poses memory-consumption and speed challenges. Using 3D CNNs leads to a drastic increase in the number of parameters and computations compared to 2D CNNs. At the same time, applying a 2D CNN in a slice-wise fashion discards valuable 3D context information that is crucial for tracking curvilinear structures in 3D. Inspired by the ideas of [1, 2, 3], who proposed separable filters and the use of intersecting 2D planes, we demonstrate the use of crosshair filters composed of three intersecting 2D filters, which avoids the memory and speed problems of classical 3D networks while still making use of the 3D information in volumetric data. Unlike existing approaches, where 2D planes are extracted at a preprocessing stage and used as input channels (see discussion in Sec. III-A), our crosshair filters (see Sec. III-A) are implemented at the layer level, which helps retain 3D information throughout the network.
Extreme class balancing
The vessel, centerline, and bifurcation prediction tasks are characterized by high class imbalance. Vessels account for less than 3% of the total voxels in a patient volume, centerlines represent a fraction of the segmented vessels, and visible bifurcations number in the hundreds at best, even in very large volumes. This bias towards the background class is a common problem in medical data [4, 5, 6]. Unfortunately, current class-balancing loss functions for training CNNs turn out to be numerically unstable in extreme cases such as ours. We offer a solution to this 'extreme class imbalance' problem by introducing a new loss function (see Sec. III-B) that we demonstrate to work well with our vascular features of interest.
Transfer learning from synthetic data
Manually annotating vessels, centerlines, and bifurcations requires many hours of work and expertise. To overcome this, we demonstrate the successful use of simulation-based frameworks [7, 8] for generating synthetic data with accurate labels (see Sec. III-C) for pretraining our networks, rendering the training of our supervised classification algorithm feasible.
II. Prior Work and Open Challenges
Vessel Segmentation
Vessel enhancement and segmentation is a long-standing task in medical image analysis (see reviews by [9, 10]). The range of methods employed for vessel segmentation reflects the development of image processing over the past decades, with examples including region growing techniques [11], active contours [12], statistical and shape models [13, 14, 15, 16], particle filtering [17, 18, 19], and path tracing [20]. All of these examples are interactive, starting from a set of seed labels at the root and propagating towards the branches. Other approaches aim at an unsupervised enhancement of vascular structures: the multi-scale second-order local structure of an image (Hessian) was examined by [21] with the purpose of developing a vessel enhancement filter, where a measure of vessel-likeliness is obtained as a function of all eigenvalues of the Hessian. A curvilinear structure detector, called Optimally Oriented Flux (OOF), was proposed by [22] to find an optimal axis on which image gradients are projected to compute the image gradient flux; OOF has a lower computational load than the calculation of the Hessian matrix proposed in [21]. A level-set segmentation approach with vesselness-dependent anisotropic energy weights is presented and evaluated in [23, 24] for 3D time-of-flight (TOF) MRA. Phellan and Forkert [25] presented a comparative analysis of the accuracy gains in vessel segmentation generated by the use of nine vessel enhancement algorithms on time-of-flight MRA, including multi-scale vesselness algorithms, diffusion-based filters, and filters that enhance tubular shapes. A machine learning approach was followed by [26], combining joint 3D vessel segmentation and centerline extraction using an oblique Hough forest with steerable filters. In a similar fashion, [27] used a deep artificial neural network as a pixel classifier to automatically segment neuronal structures in stacks of electron microscopy images, a task somewhat similar to vessel segmentation. One example using a deep learning architecture is [28], who used a deep convolutional neural network to automatically segment the vessels of the brain in TOF MRA by extracting manually annotated two-dimensional image patches in the axial, coronal, and sagittal directions as input to the training process.
Centerline Prediction
Identifying the center of a vessel is relevant for calculating the vessel diameter, but also for extracting the 'skeleton' of a vessel when extracting the vascular network (see Fig. 1). The vessels' skeleton and center can be found by post-processing a previously generated vessel segmentation, or by extracting centerlines directly from raw images with high vessel contrast. A method based on morphological operations is developed by [29], which performs erosion using neighborhoods of a pixel to determine whether the pixel is a centerline candidate. Active contour models are applied in [30], as well as path planning and distance transforms, for extracting centerlines in vessels, and [31] proposed a geodesic or minimal path technique. A morphology-guided level-set model is used in [32] to perform centerline extraction by learning the structural patterns of a tubular-like object and estimating the centerline of a tubular object as the path with minimal cost with respect to outward flux in gray-level images. Zheng et al. [33] adopted vesselness filters to predict the location of the centerline, while [34] used Hough transforms. A Hough random forest with local image filters is designed in [26, 7] to predict the centerline, trained on centerline data previously extracted using one of the level-set approaches. The application of deep learning to the extraction of vessel centerlines has not been explored. One reason may be the lack of the annotated data necessary to train deep architectures, which is hard to obtain especially for 3D datasets.
Bifurcation Detection
Vessel bifurcation refers to the point on a vessel centerline where the vessel splits into two or more smaller vessels (see Fig. 1). Bifurcations represent the nodes of the vascular network, and knowing their locations is important both for network extraction and for studying its properties [35]. They are structures that can easily be used as landmarks in image registration, but they also indicate the locations of modified blood flow velocity and pressure within the network itself [36]. Bifurcations are hard to detect in volumetric data, as they are rare point-like features that vary significantly in size and shape. Similar to centerline extraction, the detection of bifurcations often happens by post-processing a previously generated vessel segmentation or by searching a previously extracted vessel graph. A two-stage deep learning architecture is proposed in [37] for detecting carotid artery bifurcations as a specific landmark in volumetric CT data, by first training a shallow network to predict candidate regions, followed by a sparse deep network for the final prediction. A three-stage algorithm for bifurcation detection is proposed in [36] for digital eye fundus images, a 2D task; their approach includes image enhancement, clustering, and searching the graph for bifurcations. Directly predicting the location of bifurcations in a full volumetric dataset is a task which has not been attempted yet. Bifurcations, and the connections between them, are the nodes and, hence, the building blocks of a vascular network. To this end, we aim at a direct prediction of their locations to offer an alternative to a plain search of the potentially inaccurate network graph.
Convolutional Neural Networks for Angiography Analysis
The use of Convolutional Neural Networks (CNNs) for image segmentation has seen rapid progress in recent years. Originally aimed at predicting global labels or scores, CNNs are employed in [27] for pixel-wise classification, using as input the raw intensity values of a square window centered on the pixel of interest. Unfortunately, these patch-wise architectures suffer from replication of information in memory, which limits the amount of training data that can be used during training and affects speed during testing. These problems were addressed in [38, 39] by inverting the downsampling and convolutional operations in a CNN with upsampling and deconvolutional operators, leading to a Fully Convolutional Neural Network (FCNN) that is trained end-to-end for semantic segmentation. Based on this idea, different variants of state-of-the-art FCNNs have been presented for medical image segmentation [5, 40, 41, 42, 43, 44, 45]. Most of these architectures were either employed on 2D images or extended to 3D volumes in a manner which leads to loss of 3D object information or to high memory needs. For example, [5] applied the U-Net architecture proposed by [39] to 3D data in a 2D slice-wise fashion (so-called 2.5D CNNs), which does not account for inter-slice information. At the same time, full 3D architectures, such as the VNet by [41], come at the cost of increasing the number of parameters to be learned during training by at least a factor of three over 2D architectures of similar depth.
III. Methodology
III-A. Crosshair Filter Formulation
In this section, we discuss the formulation of a 3D convolutional operator that utilizes crosshair filters to improve speed and memory usage while maintaining accuracy. Let $I$ be a 3D volume, $M$ a 3D convolutional kernel of shape $a \times b \times c$, and $*$ a convolutional operator. We define $I * M$ as:

(1)  $(I * M)(x, y, z) = \sum_{i=0}^{a-1} \sum_{j=0}^{b-1} \sum_{k=0}^{c-1} I\left(x + i - \left\lfloor \tfrac{a}{2} \right\rfloor,\; y + j - \left\lfloor \tfrac{b}{2} \right\rfloor,\; z + k - \left\lfloor \tfrac{c}{2} \right\rfloor\right) M(i, j, k)$

where $\lfloor u \rfloor$ is the greatest integer less than or equal to $u$. From Eq. (1), we see that a classical 3D convolution involves $a \cdot b \cdot c$ multiplications and additions for each voxel of the resulting image; for a volume with $N$ voxels this amounts to $N \cdot a \cdot b \cdot c$ multiplications and additions in total, so the cost scales multiplicatively in all three kernel dimensions and in the volume size. To handle this computational complexity, we approximate the standard 3D convolution operation by the crosshair operator $\circledast$:

(2)  $(I \circledast M)(x, y, z) = (I^{sag}_{x} *_{2D} M_1)(y, z) + (I^{cor}_{y} *_{2D} M_2)(x, z) + (I^{ax}_{z} *_{2D} M_3)(x, y)$
where $M_1$, $M_2$, and $M_3$ are 2D crosshair filters applied in the sagittal, coronal, and axial planes through the voxel. Using crosshair filters results in $a \cdot b + a \cdot c + b \cdot c$ multiplications and additions per voxel. If we let $a$, $b$, $c$ be the sizes of the kernel such that $a, b, c \geq 3$, we can show that

(3)  $a \cdot b + a \cdot c + b \cdot c \;\leq\; a \cdot b \cdot c$

where strict inequality holds whenever at least one of $a$, $b$, $c$ is greater than 3. Eq. (3) shows a better scaling in speed and also in memory, since the filter sizes in (1) and (2) are governed by the same inequality. With the approximation in (2), a volume with $N$ voxels and an $a \times b \times c$ kernel now needs $N(a \cdot b + a \cdot c + b \cdot c)$ multiplications and additions to compute the convolution, a reduction of $N(a \cdot b \cdot c - a \cdot b - a \cdot c - b \cdot c)$ compared to a classical 3D convolution. Increasing the volume or kernel size further widens the gap between the computational complexity of (1) and (2). Moreover, we will see later from our experiments that (2) still retains the essential 3D context information needed for the classification task.
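As a concrete illustration, the crosshair operator described above can be written as three plane-wise 2D convolutions whose results are summed. The sketch below is our own minimal illustration using `scipy.ndimage` (the function names are ours, not from the DeepVesselNet code): embedding each 2D kernel in a 3D kernel with one singleton axis makes scipy apply it slice-wise along that axis, which also mirrors the memory-efficient plane-wise implementation discussed next.

```python
import numpy as np
from scipy.ndimage import convolve

def crosshair_conv3d(vol, m1, m2, m3):
    """Sum of three plane-wise 2D convolutions approximating a 3D one.
    Each 2D kernel is embedded in a 3D kernel with a singleton axis,
    so it is applied within every slice along that axis."""
    out  = convolve(vol, m1[None, :, :], mode="constant")  # planes spanned by axes 1, 2
    out += convolve(vol, m2[:, None, :], mode="constant")  # planes spanned by axes 0, 2
    out += convolve(vol, m3[:, :, None], mode="constant")  # planes spanned by axes 0, 1
    return out

def mults_per_voxel(a, b, c):
    """Multiplications per output voxel: full 3D kernel vs. crosshair."""
    return a * b * c, a * b + a * c + b * c

vol = np.random.rand(16, 16, 16)
k5 = np.full((5, 5), 1.0 / 25.0)
print(crosshair_conv3d(vol, k5, k5, k5).shape)  # same shape as the input volume
print(mults_per_voxel(5, 5, 5))                 # (125, 75): full 3D vs. crosshair
```

For a 5x5(x5) kernel this already saves 40% of the per-voxel multiplications, and the saving grows with the kernel size.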
Efficient Implementation
In Eq. (2), we presented our 2D crosshair filters. However, applying (2) independently for each voxel leads to a redundant use of memory. More precisely, voxels close to each other share some neighbourhood information, and making multiple copies of it is not memory efficient. We now present an efficient implementation, which we develop and use in our experiments (Fig. 3). Consider $I$ and $M$ as defined in Eq. (1), and let us extract the sagittal, coronal, and axial planes as $I^{sag}$, $I^{cor}$, and $I^{ax}$ respectively. By application of Eqs. (1) and (2), we have the final implementation:

(4)  $I \circledast M = I^{sag} *_{2D} M_1 + I^{cor} *_{2D} M_2 + I^{ax} *_{2D} M_3$

where $*_{2D}$ refers to a 2D convolution along the first and second axes of the left-hand-side volume over all slices in the third axis, and $\circledast$ refers to our crosshair filter operation. This implementation is efficient in the sense that it uses one volume at a time instead of multiple copies of the volume in which voxels share the same neighbourhood. In other words, we still have only one volume in memory but rather rotate the kernels to match the slices in the different orientations. This lowers the memory requirements during training and inference, allowing us to train on more data with little memory.
2.5D Networks vs. 3D Networks with Crosshair Filters
We end the presentation of crosshair filters by discussing the difference between existing 2.5D networks and our proposed crosshair filters. Given a 3D task (e.g. vessel segmentation in a 3D volume), a 2.5D-based network handles the task by considering one 2D slice at a time. More precisely, the network takes a 2D slice as input and classifies all pixels in this slice. This is repeated for each slice in the volume, and the results from the slices are fused to form the 3D result. Other 2.5D methods include a preprocessing stage where several 2D planes are extracted and used as input channels to the 2D network [3, 2]. At the architecture level, 2.5D networks are 2D networks with a preprocessing method for extracting 2D slices and a postprocessing method for fusing 2D results into a 3D volume. We note that the predictions of 2.5D networks are based solely on 2D context information. Examples of 2.5D networks are the implementation of U-Net in [5] used for liver and lesion segmentation tasks in a volumetric CT dataset and the network architecture of [44] for annotation of lumbar vertebrae.
On the other hand, 3D networks based on our proposed crosshair filters take the whole 3D volume as input and, at each layer in the network, apply the convolutional operator discussed in Section III-A. Therefore, our filters make use of 3D context information at each convolutional layer and do not require specific pre- or post-processing. Our proposed method differs from classical 3D networks in the sense that it uses fewer parameters and less memory, since it does not use full 3D convolutions. It is worth noting that our filters scale exactly like 2.5D filters (i.e. in only two directions) with respect to changes in filter and volume sizes. More precisely, given a square or cubic filter of size $k$, we have $k^2$ parameters per filter in a 2.5D network and $3k^2$ in our crosshair-filter-based network. Increasing the filter size by a factor of $s$ scales the parameter count quadratically in both situations (i.e. $s^2 k^2$ for 2.5D and $3 s^2 k^2$ in the crosshair case), as compared to full 3D networks, where the parameter count scales as the cube of $s$.
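The per-filter parameter counts can be made concrete with a little arithmetic (a sketch for the single-channel case; real networks multiply these counts by the numbers of input and output channels):

```python
def params_per_filter(k):
    """Weights in a single k-sized filter under each design,
    assuming one input and one output channel."""
    return {"2.5D (one 2D kernel)": k * k,
            "crosshair (three 2D kernels)": 3 * k * k,
            "full 3D kernel": k ** 3}

for k in (3, 5, 7):
    print(k, params_per_filter(k))
```

Note that for k = 3 the crosshair and full 3D counts coincide (27 weights each), matching the non-strict case of Eq. (3); for k = 5 the crosshair filter needs 75 weights against 125 for the full 3D kernel.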
Unlike existing 2.5D ideas, where 2D planes are extracted at a preprocessing stage and used as input channels to a 2D network architecture, our crosshair filters are implemented at the layer level, which helps retain 3D information throughout the network, making them a preferred option when detecting curvilinear objects in 3D.
III-B. Extreme Class Balancing with Stable Weights
Often in medical image analysis, the object of interest (e.g. vessel, tumor, etc.) accounts for a minority of the total voxels of the image. Fig. 4 shows that the objects of interest in the datasets used in this work account for less than 2.5% of the voxels (the different datasets are described in Section IV-A). Using a standard cross-entropy loss function given by

(5)  $L(W) = -\frac{1}{N} \left[ \sum_{i \in Y_+} \log P(y_i = 1 \mid x_i; W) + \sum_{i \in Y_-} \log P(y_i = 0 \mid x_i; W) \right]$

where $P(y_i \mid x_i; W)$ is the probability of obtaining the ground-truth label $y_i$ given the data $x_i$ and network weights $W$, $y_i$ is the label for the $i$th example, $x_i$ is the feature set, $W$ is the set of parameters of the network, $N$ is the number of examples, $Y_+$ is the set of positive labels, and $Y_-$ is the set of negative (background) labels, could cause the training process to be biased towards detecting background voxels at the expense of the object of interest. This normally results in predictions with high precision against low recall. To remedy this problem, [46] proposed a biased-sampling loss function for training multi-scale convolutional neural networks for a contour detection task. This loss function introduces additional trade-off parameters and samples twice as many edge patches as non-edge ones for positive cost-sensitive fine-tuning, and vice versa for negative cost-sensitive fine-tuning. Based on this, [47] proposed a class-balancing cross-entropy loss function of the form

(6)  $L(W) = -\beta_1 \sum_{i \in Y_+} \log P(y_i = 1 \mid x_i; W) \;-\; \beta_2 \sum_{i \in Y_-} \log P(y_i = 0 \mid x_i; W)$
where $W$ denotes the standard set of parameters of the network, which are trained with backpropagation, and $\beta_1$ and $\beta_2$ are the class weighting multipliers, calculated as $\beta_1 = |Y_-| / |Y|$ and $\beta_2 = |Y_+| / |Y|$, with $|Y| = |Y_+| + |Y_-|$. The $P(\cdot)$'s are the probabilities from the final layer of the network, and $Y_+$ and $Y_-$ are the sets of positive and negative class labels respectively. This idea, which is to give more weight to the cost associated with the class with the lowest count, has been used in other recent works [5, 40, 42, 43]. However, our experiments (in Section B) show that the above loss function raises two main challenges:
Numerical Instability
The gradient computation is numerically unstable for very big training sets due to the high values taken by the loss. More precisely, there is a factor of $\frac{1}{N}$ that scales the final sum to the mean cost in the standard cross-entropy loss function in Eq. (5). This factor ensures that the gradients are stable irrespective of the size $N$ of the training data. However, in Eq. (6), the weights $\beta_1$ and $\beta_2$ do not scale the cost to the mean. For high values of $N$ (usually the case in voxel-wise tasks), the sums explode, leading to numerical instability. For example, given perfectly balanced data, we have $\beta_1 = \beta_2 = 0.5$, irrespective of the value of $N$. Thus, increasing the size of the dataset (batch size) has no effect on the weights $\beta_1$ and $\beta_2$; however, the number of elements in the sums increases, causing the computations to become unstable.
High False Positive Rate
We observe a high rate of false positives leading to high recall values. This is caused by the fact that in most cases the object of interest accounts for less than 5% of the total voxels (about 2.5% in our case). Therefore, we have a situation where $\beta_2 \ll \beta_1$, which implies that wrongly predicting background voxels as foreground is penalized far less in the loss than predicting foreground voxels as background. This leads to a high false positive rate and, hence, high recall values.
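The instability can be reproduced numerically. In the sketch below (our own illustration, with the class weights computed as in Eq. (6)), the mean-scaled loss of Eq. (5) stays flat as the voxel count grows, while the weighted sums of Eq. (6) grow without bound:

```python
import numpy as np

def mean_cross_entropy(p, y):
    """Eq. (5): cross entropy scaled by 1/N, stable in the dataset size."""
    pos, neg = y == 1, y == 0
    return -(np.log(p[pos]).sum() + np.log(1.0 - p[neg]).sum()) / y.size

def class_balanced_sum(p, y):
    """Eq. (6): class-weighted *sums* with no 1/N factor."""
    pos, neg = y == 1, y == 0
    beta1 = neg.sum() / y.size   # weight on the positive-class term
    beta2 = pos.sum() / y.size   # weight on the negative-class term
    return -(beta1 * np.log(p[pos]).sum() + beta2 * np.log(1.0 - p[neg]).sum())

rng = np.random.default_rng(0)
for n in (10_000, 1_000_000):
    y = (np.arange(n) % 2).astype(float)        # perfectly balanced: beta1 = beta2 = 0.5
    p = np.clip(rng.random(n), 1e-6, 1.0 - 1e-6)
    print(n, round(mean_cross_entropy(p, y), 3), round(class_balanced_sum(p, y), 1))
```

Growing the balanced dataset by a factor of 100 leaves the betas, and hence the mean loss, essentially unchanged, but multiplies the weighted sums of Eq. (6) by the same factor of 100.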
To address these challenges, we introduce different weighting ratios and an additional correction term for the high false positive rate, and define:

(7)  $L(W) = \underbrace{-\frac{1}{|Y_+|} \sum_{i \in Y_+} \log P(y_i = 1 \mid x_i; W) - \frac{1}{|Y_-|} \sum_{i \in Y_-} \log P(y_i = 0 \mid x_i; W)}_{L_1} \; \underbrace{- \frac{\gamma_1}{|FP|} \sum_{i \in FP} \log P(y_i = 0 \mid x_i; W) - \frac{\gamma_2}{|FN|} \sum_{i \in FN} \log P(y_i = 1 \mid x_i; W)}_{L_2}$

The first part, $L_1$, is a more numerically stable version of Eq. (6), since it computes the voxel-wise mean cost, which scales well with the size of the dataset or batch, while the ratio $|Y_-| / |Y_+|$ of the class weights is maintained as desired. The second part, $L_2$ (FP rate correction), is introduced to penalize the network for false predictions. However, we do not want to give false positives ($FP$) and false negatives ($FN$) the same weight as the total predictions ($Y_+$, $Y_-$), since we would end up with a loss function without any class balancing because the weights would offset each other. Therefore, we introduce $\gamma_1$ and $\gamma_2$, which depend on the mean absolute distance of the wrongly predicted probabilities to 0.5 (this value can be changed to suit the task). This allows us to penalize false predictions that lie very far from the central point (0.5). The false predictions ($FP$, $FN$) are obtained through a 0.5 probability threshold. Experimental results from the application of FP rate correction can be found in Section B.
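One possible implementation of this scheme is sketched below. The exact functional form of Eq. (7) is not fully pinned down by the description above, so the combination chosen here, class-wise mean cross entropy plus FP/FN correction terms whose gamma weights equal the mean absolute distance of the wrongly predicted probabilities to 0.5, is our reading and not a verbatim transcription of the paper's loss:

```python
import numpy as np

def stable_balanced_loss(p, y, thresh=0.5):
    """Sketch of a loss in the spirit of Eq. (7).  `p` holds foreground
    probabilities in (0, 1); `y` holds binary labels with both classes
    present.  The combination of terms is an assumption based on the
    surrounding description, not the paper's exact formula."""
    pos, neg = y == 1, y == 0
    # class-wise means: stable in N, class-weight ratio |Y-|/|Y+| preserved
    loss = -np.log(p[pos]).mean() - np.log(1.0 - p[neg]).mean()
    # false predictions obtained through a 0.5 probability threshold
    fp = neg & (p >= thresh)
    fn = pos & (p < thresh)
    if fp.any():
        gamma1 = np.abs(p[fp] - 0.5).mean()          # mean distance of FPs from 0.5
        loss += -gamma1 * np.log(1.0 - p[fp]).mean()
    if fn.any():
        gamma2 = np.abs(p[fn] - 0.5).mean()          # mean distance of FNs from 0.5
        loss += -gamma2 * np.log(p[fn]).mean()
    return loss
```

With no false predictions the correction terms vanish and only the class-balanced means remain; confident false predictions far from 0.5 receive the largest extra penalty.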
III-C. Synthetic Data Generation and Preparation
To generate synthetic data, we follow the method of [7], which considers the mutual interplay of arterial oxygen (O2) supply and vascular endothelial growth factor (VEGF) secreted by ischemic cells to achieve physiologically plausible results. Each vessel segment is modeled as a rigid cylindrical tube with radius $r$ and length $l$, represented by a single directed edge connecting two nodes. Semantically, this gives rise to four different types of nodes, namely root, leaf, bifurcation, and inter nodes. Each node is uniquely identified by its 3D coordinate. Combining this with connectivity information fully captures the geometry of the approximated vasculature. The radius $r_p$ of the parent bifurcation branch and the radii of the left ($r_l$) and right ($r_r$) daughter branches are related by a bifurcation law (also known as Murray's law) given by $r_p^\gamma = r_l^\gamma + r_r^\gamma$, where $\gamma$ is the bifurcation exponent. Further constraints:
(8)  $\cos \varphi_l = \frac{r_p^4 + r_l^4 - r_r^4}{2\, r_p^2\, r_l^2}, \qquad \cos \varphi_r = \frac{r_p^4 + r_r^4 - r_l^4}{2\, r_p^2\, r_r^2}$

are placed on the bifurcation angles of the left ($\varphi_l$) and right ($\varphi_r$) branches respectively, which geometrically corresponds to the optimal position of the branching point with respect to a minimum volume principle [7]. The tree generation model and the bifurcation configuration are shown in Fig. 5. In the arterial tree generation experiment, the parameters in Table 1 of [7] are used. We use the default (underlined) values for all model parameters and generate 136 volumes.
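For illustration, Murray's law and the classical minimum-volume branching angles can be computed as below (the angle formulas are the standard optimality conditions; whether they match the exact constraints of Eq. (8) should be checked against [7]):

```python
import numpy as np

def daughter_radius(r_p, r_other, gamma=3.0):
    """Solve Murray's law r_p**gamma = r_l**gamma + r_r**gamma for one daughter."""
    return (r_p**gamma - r_other**gamma) ** (1.0 / gamma)

def branching_angles(r_p, r_l, r_r):
    """Classical minimum-volume bifurcation angles (in radians) of the
    left and right daughter branches relative to the parent axis."""
    phi_l = np.arccos((r_p**4 + r_l**4 - r_r**4) / (2.0 * r_p**2 * r_l**2))
    phi_r = np.arccos((r_p**4 + r_r**4 - r_l**4) / (2.0 * r_p**2 * r_r**2))
    return phi_l, phi_r

# symmetric bifurcation: both daughters get radius 2**(-1/3) of the parent
r = daughter_radius(1.0, 2.0 ** (-1.0 / 3.0))
print(round(r, 4))                              # ~0.7937 of the parent radius
print(np.degrees(branching_angles(1.0, r, r)))  # equal angles for equal daughters
```

In the symmetric case the two daughter radii and angles coincide, as expected from the symmetry of the constraints.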
The output of the generation process is a tree with information on the 3D positions of the nodes, their types (root, bifurcation, inter, leaf), and connectivity information, which includes the edge between two nodes and its radius. We use this information to construct the actual 3D volume by modeling each vessel segment as a cylinder in 3D space. Vessel and non-vessel intensities are randomly chosen from two disjoint intervals. Gaussian noise is then applied to the generated volume, with the mean and standard deviation varied randomly within fixed ranges for each volume.
III-D. Network Architecture and Implementations
We test the performance of DeepVesselNet, as discussed in Sections III-A, III-B, and III-C, through two main implementations:
DeepVesselNet-FCN (DVN-FCN)
We construct a Fully Convolutional Network (FCN) with four convolutional layers and a sigmoid classification layer. In this implementation, we do not use any downsampling layers, and we carry out the convolutions with zero-padding so that the output image is of the same size as the input image. The removal of downsampling is motivated by the fact that the tasks (vessel segmentation, centerline prediction, and bifurcation detection) involve finely detailed objects, and downsampling has the effect of averaging over voxels, which causes these fine details to be lost. With this network implementation, we have a very simple five-layer fully convolutional network, which takes a volume of arbitrary size and outputs a segmentation map of the same size. For the network structure and a description of the parameters, see Fig. 6.
DeepVesselNet-VNet (DVN-VNet)
As a proof of principle, we take the VNet architecture proposed by [41] (see Fig. A.6 in the Appendix) and replace all 3D convolutions with our proposed crosshair filters discussed in Section III-A. The aim is to test the improvement in memory requirements, as well as the speed-up that our crosshair implementation provides over the original architecture proposed in [41]. We also use it to evaluate whether the gains in speed and memory consumption come at a significant cost in prediction accuracy.
Network Configuration, Initialization, and Training
We use the architectures described above to implement three binary networks for vessel segmentation, centerline prediction, and bifurcation detection. Network parameters are randomly initialized according to the method proposed in [48], by sampling from a uniform distribution over an interval determined by $n$, the size of the given kernel in a particular layer. For each volume, we extract non-overlapping boxes of a fixed size covering the whole volume and feed them through the network for the fine-tuning of parameters. We train the network using stochastic gradient descent without regularization. During pretraining, we use a learning rate of 0.01 and a decay of 0.99, applied after every 200 iterations. For fine-tuning, we use a learning rate of 0.001 and a decay of 0.99, applied after every 200 iterations. We implement our algorithm using the THEANO [49] Python framework and train on a machine with 64 GB of RAM and an Nvidia TITAN X 12 GB GPU.
IV. Experiments, Results and Discussion
IV-A. Datasets
In this work, we use three different datasets to train and test the networks. In all three data sets, the test cases are kept apart from the training data and are used only for testing purposes.
Synthetic Dataset
Training convolutional networks from scratch typically requires significant amounts of training data. However, assembling a properly labeled dataset of 3D curvilinear structures, such as vessels and vessel features, takes a lot of human effort and time, which turns out to be the bottleneck for most medical applications. To overcome this problem, we generate synthetic data based on the method proposed in [7]. A brief description of this process was presented in Section III-C. We initialize the processes with different random seeds and scale the resulting vessel sizes in voxels to match the sizes of vessels in clinical datasets. After this, we add different levels of Gaussian noise, as mentioned in Section III-C, to increase the randomness and make the volumes more realistic. We generate 136 volumes with corresponding labels for vessel segmentation, centerline prediction, and bifurcation detection. We then select twenty of the 136 volumes as a test set for the pretraining phase and use the remaining volumes for pretraining on the various tasks at hand. An example of the synthetic dataset can be found in Fig. 7(c).
Clinical MRA Time-of-Flight (TOF) Dataset
To fine-tune and test our network architectures on real data, we obtain 40 volumes of clinical TOF MRA of the human brain, 20 of which are fully annotated and the remaining 20 partially annotated using the method proposed by [23]. All volumes share the same size and spatial resolution on the coronal, sagittal, and axial axes. We select 15 of the 20 fully annotated volumes for testing and use the remaining five as a validation set. We also correct the 20 partially annotated volumes by manually verifying some of the background and foreground voxels. This leads to three labels: true foreground (verified foreground), true background (verified background), and a third class representing the remaining unverified voxels. After this, we use the true foreground and background labels to fine-tune our network after pretraining with the synthetic dataset. This approach helps avoid any uncertainty with respect to using the partially annotated data for fine-tuning of the network. A sample volume from the TOF MRA dataset can be found in Fig. 7(a).
Synchrotron Radiation X-ray Tomographic Microscopy (SRXTM)
A high-resolution 3D volume is obtained from synchrotron radiation X-ray tomographic microscopy of a rat brain. From this large volume, we extract a dataset of 20 non-overlapping subvolumes, which were segmented using the method proposed by [26], and use them to fine-tune the network. To create a test set, we manually annotate 52 slices in 4 other volumes (208 slices in total). A detailed description of the SRXTM data can be found in [50], and a sample volume is presented in Fig. 7(b).
IV-B. Data Preprocessing with Intensity Projection
In our experiments, we use different datasets that come from different sources and acquisition modalities. Therefore, we need to normalize their intensity ranges and contrast. We test the following preprocessing strategies to achieve homogeneity between the datasets.
First, the original intensities are normalized to the range $[0, 1]$ using

$x' = \frac{x - \min(I)}{\max(I) - \min(I)}$

where $x$ is the pixel intensity and $[\min(I), \max(I)]$ denotes the range of all intensities in the volume. The second strategy involves clipping the intensity values at a threshold $c$ and then normalizing the intensities by $c$. Our experiments identify an optimal clipping value for the intensity range of our datasets. The final strategy builds on the second one by clipping at $c$, normalizing by $c$, and then projecting the resulting intensities with a function $f$. In our experiments, we test quadratic and cubic projections (i.e. $f(x) = x^2$ and $f(x) = x^3$, respectively).
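The three strategies can be sketched as follows (the clipping value `c` is left as a free parameter here, since the optimal value used in the experiments is not reproduced above):

```python
import numpy as np

def normalize(vol):
    """Strategy 1: linearly rescale intensities to [0, 1]."""
    lo, hi = vol.min(), vol.max()
    return (vol - lo) / (hi - lo)

def clip_normalize(vol, c):
    """Strategy 2: clip intensities at c, then rescale the result to [0, 1]."""
    return normalize(np.minimum(vol, c))

def clip_project(vol, c, power=2):
    """Strategy 3: clip and normalize, then apply a quadratic (power=2)
    or cubic (power=3) projection to stretch vessel contrast."""
    return clip_normalize(vol, c) ** power

vol = np.array([0.0, 100.0, 400.0, 800.0])
print(normalize(vol))            # values 0, 0.125, 0.5, 1
print(clip_normalize(vol, 400))  # values 0, 0.25, 1, 1
print(clip_project(vol, 400, 3))
```

Clipping compresses the bright tail of the histogram before normalization, and the power projection then suppresses low background intensities relative to bright vessel voxels.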
IV-C. Evaluating DeepVesselNet Performance
Prior to evaluating the performance of DeepVesselNet, we conducted a series of experiments to test its components, which include the preprocessing strategies, the fast crosshair filters, the FP rate correction, and pretraining on synthetic data. These experiments and their results are discussed in Appendix B. In this subsection, we retain the best training strategy from those experiments and compare the performance of our proposed network architecture against other available methods, mainly on the vessel segmentation task. As a further validation of our methodology, we handle centerline prediction and bifurcation detection using the proposed architectures. Given a good vessel segmentation, centerline prediction and bifurcation detection can also be handled by applying vessel skeletonization as a post-processing step; our aim in applying our architecture to these tasks is not to show superiority over existing vessel skeletonization methods, but to serve as a further verification of the effects of our methodology. We therefore discuss mainly the results on vessel segmentation and briefly mention the results of the centerline and bifurcation tasks; the full results and discussion of these experiments are given in Appendix C.
Experiment 1: Vessel Segmentation
We pretrain the DeepVesselNet (FCN and VNet) architectures (from Figs. 6 and A.6) on synthetic volumes for vessel segmentation and evaluate their performance on TOF MRA volumes. We then fine-tune the networks with additional clinical TOF MRA data and repeat the evaluation. Table I reports the results of these tests, together with the performance of competing methods. We obtain a Dice score of 0.81 for DeepVesselNet-FCN and 0.80 for DeepVesselNet-VNet before, and 0.86 (DeepVesselNet-FCN) and 0.84 (DeepVesselNet-VNet) after fine-tuning. Generating vessel segmentations takes less than 13 s per volume using DeepVesselNet-FCN and 20 s for DeepVesselNet-VNet. Table I also reports results for the methods of [26] (VNet) and [23], both of which are outperformed by DeepVesselNet-FCN in terms of both speed (execution time) and Dice score. Comparing DeepVesselNet-VNet and the original VNet on the MRA data, we find a small advantage for the latter in terms of Dice score, which cannot be considered significant (0.8425 and 0.8497 respectively, with a sample standard error of 0.0066 and a t-test significance probability of 0.1950). However, DeepVesselNet-VNet has the advantage of being six seconds faster (about a 23% improvement) during prediction, a time difference that will scale with volume and filter sizes, and our results show that crosshair filters can be used in DeepVesselNet at little to no cost in terms of vessel segmentation accuracy.
Dataset  Method  Precision  Recall  Dice  Ex. time
Synthetic  DVN-FCN  99.84  99.87  99.86  13 s
  DVN-VNet  99.54  99.59  99.56  17 s
  VNet  99.48  99.50  99.49  23 s
  Schneider et al.  99.47  99.56  99.52  n/a
TOF MRA  DVN-FCN (F)  86.44  86.93  86.68  13 s
  DVN-FCN (P)  82.76  80.25  81.48  13 s
  DVN-VNet (F)  85.00  83.51  84.25  20 s
  DVN-VNet (P)  83.32  77.12  80.10  20 s
  VNet (F)  84.34  85.62  84.97  26 s
  VNet (P)  82.41  75.82  78.98  26 s
  Schneider et al.  84.81  82.15  83.46  min
  Forkert et al.  84.99  73.00  78.57  n/a
SRXTM  DVN-FCN  96.72  95.82  96.27  4 s
  DVN-VNet  95.83  96.18  96.01  7 s
  VNet  95.25  95.84  95.55  11 s
  Schneider et al.  95.15  91.51  93.30  23 min
Task  Method  Precision  Recall  Dice  Ex. time 
Centerline  DVNFCN  77.63  82.35  79.92  13 s 
VNet  76.41  73.30  74.82  23 s  
Bifurcation  DVNFCN  78.80  92.97  85.30  13 s 
Schneider et al.  77.18  85.08  80.94  n/a 
Experiment 2: Centerline Prediction
For centerline prediction, we train DeepVesselNet on the synthetic dataset and test it on the synthetic as well as clinical datasets. The network uses the probabilistic segmentation masks from Experiment 1 as input (with the same training and test splits as in Experiment 1) and uses the vessel predictions to restrict predicted centerlines to vessel regions. We obtain a Dice score of 0.80 for DeepVesselNetFCN, outperforming all other methods by a margin of more than 5% in Dice score; these methods include the state-of-the-art approach of Schneider et al. [26], which is based on morphological operations for vessel skeletonization. DeepVesselNetFCN also outperforms the competing methods in terms of execution time. Fig. 8 shows a qualitative example of centerline prediction with DeepVesselNetFCN; full quantitative results are given in Table V in the appendix.
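The restriction of centerlines to vessel regions can be sketched as a simple masking step; this is a simplified post-hoc illustration (function name and thresholds are our assumptions), not the paper's exact mechanism of feeding the probabilistic masks as an input channel:

```python
import numpy as np

def restrict_centerline(centerline_prob, vessel_prob,
                        vessel_thresh=0.5, center_thresh=0.5):
    """Keep centerline predictions only inside the predicted vessel region.

    Both inputs are probability volumes of the same shape; the thresholds
    are illustrative defaults, not values from the paper."""
    vessel_mask = vessel_prob >= vessel_thresh          # binary vessel region
    return (centerline_prob * vessel_mask) >= center_thresh
```

Any centerline response falling outside the vessel mask is zeroed before thresholding, so the final centerline is guaranteed to lie within the segmented vessels.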
Experiment 3: Bifurcation Detection
For a quantitative evaluation of DeepVesselNet in bifurcation detection, we use synthetically generated data and adopt a two-input-channel strategy: the vessel segmentations from Experiment 1 form one input channel and the centerline predictions from Experiment 2 form the second, relying on the same training and test splits as in the previous experiments. Results from Table I and Fig. A.5 show that DeepVesselNetFCN performs better than the other architectures in 5 out of 6 metrics, with a detection Dice score of 85%, and is again superior in execution time. DeepVesselNetFCN also outperforms the method of Schneider et al. [26], which first reconstructs the underlying network and then assigns bifurcation points based on node connectivity. Fig. 9 shows a qualitative example of bifurcation detection with DeepVesselNetFCN; full quantitative results are given in Table VI in the appendix.
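Assembling the two-input-channel volume amounts to stacking the two probability maps; a minimal sketch (the channel-first layout and function name are our assumptions):

```python
import numpy as np

def make_bifurcation_input(vessel_prob, centerline_prob):
    """Stack the Experiment-1 segmentation and Experiment-2 centerline
    probability maps into a (2, D, H, W) channel-first input tensor."""
    assert vessel_prob.shape == centerline_prob.shape, "channel shapes must match"
    return np.stack([vessel_prob, centerline_prob], axis=0)
```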
V Summary and Conclusions
We present DeepVesselNet, an architecture tailored to the challenges of extracting vessel networks and network features using deep learning. Our experiments show that the crosshair filters, one of the components of DeepVesselNet, perform comparably to 3D filters while significantly improving both speed and memory usage, easing upscaling to larger data sets. Another component of DeepVesselNet, the introduction of new weights and the FP rate correction in the class-balancing loss function, helps maintain a good balance between precision and recall during training. This turns out to be crucial for preventing over- and under-segmentation, which are common problems in vessel segmentation. Finally, we successfully demonstrated that transfer learning through pretraining DeepVesselNet on synthetically generated data improves segmentation and detection results, especially in situations where obtaining manually annotated data is a challenge.
As future work, we will generalize DeepVesselNet to multiclass segmentation, handling vessel segmentation, centerline prediction, and bifurcation detection simultaneously rather than in three subsequent binary tasks. We also expect that network architectures tailored to our three hierarchically nested classes will improve the performance of DeepVesselNet, for example in a single, hierarchical approach starting from a base network for vessel segmentation, with additional layers for centerline prediction and a final set of layers for bifurcation detection. The current implementation (crosshair layers, the networks, and the cost function), as well as future extensions of DeepVesselNet, will be made publicly available on GitHub.
References

 [1] R. Rigamonti, A. Sironi, V. Lepetit, and P. Fua, “Learning separable filters,” in The IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), June 2013.
 [2] H. R. Roth, L. Lu, A. Seff, K. M. Cherry, J. Hoffman, S. Wang, J. Liu, E. Turkbey, and R. M. Summers, “A new 2.5d representation for lymph node detection using random sets of deep convolutional neural network observations,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI 2014. Springer Int. Publishing, 2014, pp. 520–527.
 [3] S. Liu, D. Zhang, Y. Song, H. Peng, and W. Cai, “Triplecrossing 2.5d convolutional neural network for detecting neuronal arbours in 3d microscopic images,” in Mach. Learn. in Med. Imaging. Springer Int. Publishing, 2017, pp. 185–193.
 [4] J. W. GrzymalaBusse, L. K. Goodwin, W. J. GrzymalaBusse, and X. Zheng, “An approach to imbalanced data sets based on changing rule strength,” in RoughNeural Computing: Techniques for Computing with Words, S. K. Pal, L. Polkowski, and A. Skowron, Eds. Berlin, Heidelberg: Springer, 2004, pp. 543–553.
 [5] P. F. Christ, F. Ettlinger, F. Grün, M. E. A. Elshaera, J. Lipkova, S. Schlecht, F. Ahmaddy, S. Tatavarty, M. Bickel, P. Bilic, M. Rempfler, F. Hofmann, M. D. Anastasi, S.A. Ahmadi, G. Kaissis, J. Holch, W. Sommer, R. Braren, V. Heinemann, and B. Menze, “Automatic liver and tumor segmentation of ct and mri volumes using cascaded fully convolutional neural networks,” ArXiv eprints, Feb. 2017.
 [6] G. Haixiang, L. Yijing, J. Shang, G. Mingyun, H. Yuanyue, and G. Bing, “Learning from classimbalanced data: Review of methods and applications,” Expert Systems with Applications, vol. 73, pp. 220 – 239, 2017.
 [7] M. Schneider, J. Reichold, B. Weber, G. Székely, and S. Hirsch, “Tissue metabolism driven arterial tree generation,” Med. Image Anal., pp. 1397–1414, 2012.
 [8] D. Szczerba and G. Székely, “Simulating vascular systems in arbitrary anatomies,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI: 8th Int. Conf., Palm Springs, CA, USA, October 2629, 2005, Proc.,. Berlin, Heidelberg: Springer, 2005, pp. 641–648.
 [9] C. Kirbas and F. Quek, “A review of vessel extraction techniques and algorithms.” ACM Comput. Surv, vol. 36, pp. 81–121, 2004.
 [10] D. Lesage, E. Angelini, I. Bloch, and G. FunkaLea, “A review of 3d vessel lumen segmentation techniques: models, features and extraction schemes.” Med. Image Anal., vol. 13(6), p. 819–845, 2009.
 [11] M. E. MartínezPérez, A. D. Hughes, A. V. Stanton, S. A. Thom, A. A. Bharath, and K. H. Parker, “Retinal blood vessel segmentation by means of scalespace analysis and region growing,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI. Berlin, Heidelberg: Springer, 1999, pp. 90–97.
 [12] D. Nain, A. Yezzi, and G. Turk, “Vessel segmentation using a shape driven flow,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI. Berlin, Heidelberg: Springer, 2004, pp. 51–59.
 [13] A. C. S. Chung and J. A. Noble, “Statistical 3d vessel segmentation using a rician distribution,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI: Second Int. Conf., Cambridge, UK, Proc., C. Taylor and A. Colchester, Eds. Berlin, Heidelberg: Springer, 1999, pp. 82–89.
 [14] W. Liao, K. Rohr, and S. Wörz, “Globally optimal curvatureregularized fast marching for vessel segmentation,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI. Berlin, Heidelberg: Springer, 2013, pp. 550–557.
 [15] R. Moreno, C. Wang, and Ö. Smedby, “Vessel wall segmentation using implicit models and total curvature penalizers,” in Image Anal.: 18th Scandinavian Conf., Proc. Berlin, Heidelberg: Springer, 2013, pp. 299–308.
 [16] S. Young, V. Pekar, and J. Weese, “Vessel segmentation for visualization of mra with blood pool contrast agent,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI. Berlin, Heidelberg: Springer, 2001, pp. 491–498.
 [17] A. Dalca, G. Danagoulian, R. Kikinis, E. Schmidt, and P. Golland, “Segmentation of nerve bundles and ganglia in spine mri using particle filters,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI: 14th Int. Conf., Toronto, Canada, Proc. Berlin, Heidelberg: Springer, 2011, pp. 537–545.
 [18] C. Florin, N. Paragios, and J. Williams, “Globally optimal active contours, sequential monte carlo and online learning for vessel segmentation,” in Computer Vision – ECCV 2006: 9th European Conf. on Computer Vision, Graz, Austria, Proc. Berlin, Heidelberg: Springer, 2006, pp. 476–489.
 [19] S. Wörz, W. J. Godinez, and K. Rohr, “Probabilistic tracking and modelbased segmentation of 3d tubular structures,” in Bildverarbeitung für die Medizin 2009: Algorithmen — Systeme — Anwendungen Proc. des Workshops. Berlin, Heidelberg: Springer, 2009, pp. 41–45.
 [20] S. Wang, B. Peplinski, L. Lu, W. Zhang, J. Liu, Z. Wei, and R. M. Summers, “Sequential monte carlo tracking for marginal artery segmentation on ct angiography by multiple cue fusion,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI: 16th Int. Conf., Nagoya, Japan, Proc. Berlin, Heidelberg: Springer, 2013, pp. 518–525.
 [21] A. F. Frangi, W. J. Niessen, K. L. Vincken, and M. A. Viergever, “Multiscale vessel enhancement filtering,” in Med. Image Comput. and ComputerAssisted Intervention — MICCAI, W. M. Wells, A. Colchester, and S. Delp, Eds. Berlin, Heidelberg: Springer, 1998, pp. 130–137.
 [22] M. W. K. Law and A. C. S. Chung, “Three dimensional curvilinear structure detection using optimally oriented flux,” in Computer Vision – ECCV 2008, D. Forsyth, P. Torr, and A. Zisserman, Eds. Berlin, Heidelberg: Springer, 2008, pp. 368–382.
 [23] N. D. Forkert, A. SchmidtRichberg, J. Fiehler, T. Illies, D. Möller, D. Säring, H. Handels, and J. Ehrhardt, “3d cerebrovascular segmentation combining fuzzy vessel enhancement and levelsets with anisotropic energy weights,” Magn. Reson. Imaging, vol. 31, pp. 262–271, 2013.
 [24] N. D. Forkert, A. SchmidtRichberg, J. Fiehler, T. Illies, D. Möller, H. Handels, and D. Säring, “Fuzzybased vascular structure enhancement in timeofflight mra images for improved segmentation,” Methods of Information Medicine, vol. 50, pp. 74–83, 2011.
 [25] R. Phellan and N. D. Forkert, “Comparison of vessel enhancement algorithms applied to timeofflight mra images for cerebrovascular segmentation,” Med. Physics, vol. 44, no. 11, pp. 5901–5915, 2017.
 [26] M. Schneider, S. Hirsch, B. Weber, G. Székely, and B. H. Menze, “Joint 3d vessel segmentation and centerline extraction using oblique hough forests with steerable filters,” Med. Image Anal., vol. 19, no. 1, pp. 220–249, 2015.
 [27] D. C. Ciresan, A. Giusti, L. M. Gambardella, and J. Schmidhuber, “Deep neural networks segment neuronal membranes in electron microscopy images,” in NIPS, 2012, pp. 2852–2860.
 [28] R. Phellan, A. Peixinho, A. Falcão, and N. D. Forkert, “Vascular segmentation in tof mra images of the brain using a deep convolutional neural network,” in Intravascular Imaging and Computer Assisted Stenting, and LargeScale Annotation of Biomedical Data and Expert Label Synthesis. Springer Int. Publishing, 2017, pp. 39–46.
 [29] B. Shagufta, S. A. Khan, A. Hassan, and A. Rashid, “Blood vessel segmentation and centerline extraction based on multilayered thresholding in ct images,” Proc. of the 2nd Int. Conf. on Intelligent Systems and Image Processing, pp. 428–432, 2014.
 [30] M. Maddah, A. Afzalikhusha, and H. Soltanian, “Snake modeling and distance transform approach to vascular center line extraction and quantification,” Computerized Med. Imag. and Graphics, vol. 27 (6), pp. 503–512, 2003.
 [31] D. Chen and L. D. Cohen, “Piecewise geodesics for vessel centerline extraction and boundary delineation with application to retina segmentation,” in Scale Space and Variational Methods in Computer Vision: 5th Int. Conf., SSVM 2015, LègeCap Ferret, France, May 31  June 4, 2015, Proc. Springer Int. Publishing, 2015, pp. 270–281.
 [32] A. SantamaríaPang, C. M. Colbert, P. Saggau, and I. A. Kakadiaris, “Automatic centerline extraction of irregular tubular structures using probability volumes from multiphoton imaging,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI 2007: 10th Int. Conf., Brisbane, Australia, October 29  November 2, 2007, Proc., Part II. Berlin, Heidelberg: Springer, 2007, pp. 486–494.
 [33] Y. Zheng, J. Shen, H. Tek, and G. FunkaLea, “Modeldriven centerline extraction for severely occluded major coronary arteries,” in Mach. Learn. in Medical Imaging: Third Int. Workshop, MLMI 2012, Held in Conjunction with MICCAI: Nice, France, Revised Selected Papers. Berlin, Heidelberg: Springer, 2012, pp. 10–18.
 [34] M. M. G. Macedo, C. Mekkaoui, and M. P. Jackowski, “Vessel centerline tracking in cta and mra images using hough transform,” in Progress in Pattern Recognition, Image Anal., Computer Vision, and Applications: 15th Iberoamerican Congress on Pattern Recognition, CIARP 2010, Sao Paulo, Brazil, November 811, 2010. Proc. Berlin, Heidelberg: Springer, 2010, pp. 295–302.
 [35] M. Rempfler, M. Schneider, G. D. Ielacqua, X. Xiao, S. R. Stock, J. Klohs, G. Székely, B. Andres, and B. H. Menze, “Reconstructing cerebrovascular networks under local physiological constraints by integer programming,” Med. Image Anal., pp. 86–94, 2015.
 [36] T. Chaichana, Z. Sun, M. BarrettBaxendale, and A. Nagar, “Automatic location of blood vessel bifurcations in digital eye fundus images,” in Proc. of Sixth Int. Conf. on Soft Computing for Problem Solving: SocProS 2016, Volume 2. Singapore: Springer Singapore, 2017, pp. 332–342.
 [37] Y. Zheng, D. Liu, B. Georgescu, H. Nguyen, and D. Comaniciu, “3d deep learning for efficient and robust landmark detection in volumetric data,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI, N. Navab, J. Hornegger, W. M. Wells, and A. Frangi, Eds. Springer Int. Publishing, 2015, pp. 565–572.
 [38] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” CoRR, abs/1411.4038, 2014.
 [39] O. Ronneberger, P. Fischer, and T. Brox, “Unet: Convolutional networks for biomedical image segmentation.” Med. Image Comput. and ComputerAssisted Intervention (MICCAI), vol. 9351, pp. 234–241, 2015.
 [40] K.K. Maninis, J. PontTuset, P. Arbeláez, and L. Van Gool, “Deep retinal image understanding,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI 2016: 19th Int. Conf., Athens, Greece, October 1721, 2016, Proc., Part II. Springer Int. Publishing, 2016, pp. 140–148.
 [41] F. Milletari, N. Navab, and S. Ahmadi, “Vnet: Fully convolutional neural networks for volumetric med. image segmentation,” in Fourth Int. Conf. on 3D Vision (3DV), 2016, pp. 565–571.
 [42] I. Nogues, L. Lu, X. Wang, H. Roth, G. Bertasius, N. Lay, J. Shi, Y. Tsehay, and R. M. Summers, “Automatic lymph node cluster segmentation using holisticallynested neural networks and structured optimization in ct images,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI 2016: 19th Int. Conf., Athens, Greece, October 1721, 2016, Proc., Part II. Springer Int. Publishing, 2016, pp. 388–397.
 [43] H. R. Roth, L. Lu, A. Farag, A. Sohn, and R. M. Summers, “Spatial aggregation of holisticallynested networks for automated pancreas segmentation,” in Med. Image Comput. and ComputerAssisted Intervention – MICCAI 2016: 19th Int. Conf., Athens, Greece, October 1721, 2016, Proc., Part II. Springer Int. Publishing, 2016, pp. 451–459.
 [44] A. Sekuboyina, A. Valentinitsch, J. Kirschke, and B. H. Menze, “A localisationsegmentation approach for multilabel annotation of lumbar vertebrae using deep nets,” arXiv, vol. 1703.04347, 2017.

 [45] G. Tetteh, M. Rempfler, C. Zimmer, and B. H. Menze, “Deepfext: Deep feature extraction for vessel segmentation and centerline prediction,” in Mach. Learn. in Med. Imaging. Springer Int. Publishing, 2017, pp. 344–352.
 [46] J.J. Hwang and T.L. Liu, “Pixelwise deep learning for contour detection,” in ICLR, 2015.
 [47] S. Xie and Z. Tu, “Holisticallynested edge detection,” in Proc. of the IEEE Int. Conf. on computer vision, 2015, pp. 1395–1403.

 [48] Y. Bengio and X. Glorot, “Understanding the difficulty of training deep feedforward neural networks,” in Proc. of the 13th Int. Conf. on Artificial Intelligence and Statistics (AISTATS), vol. 9, 2010.
 [49] Theano Development Team, “Theano: A Python framework for fast computation of mathematical expressions,” arXiv eprints, vol. abs/1605.02688, May 2016.
 [50] J. Reichold, M. Stampanoni, A. L. Keller, A. Buck, P. Jenny, and B. Weber, “Vascular graph model to simulate the cerebral blood flow in realistic vascular networks,” Journal of Cerebral Blood Flow & Metabolism, vol. 29, no. 8, pp. 1429–1443, 2009.
Appendix A Effect of Preprocessing Strategies
Appendix B Evaluating the DeepVesselNet Components
Experiment 1: Testing Preprocessing Strategies
In a first experiment, we test the different preprocessing strategies proposed in Section IVB. We carry out these experiments treating the synthetic dataset as ideal: we train DeepVesselNetFCN solely on our synthetic data for vessel segmentation and then use this trained network to segment vessels in the clinical MRA dataset after applying each of the four preprocessing strategies discussed in Section IVB. This measures how similar our clinical MRA dataset is to the synthetic dataset after preprocessing. Table II shows the outcome of these experiments. We observe that the intensity projections (quadratic and cubic) improve the results substantially in terms of Dice score, with the quadratic projection performing slightly better than the cubic one. We therefore use the quadratic intensity projection in all following experiments.
Preprocessing method  Precision  Recall  Dice 

Original Intensities  0.0112  1.0  0.0221 
Clipped Intensities  0.3117  0.9379  0.4679 
Quadratic Projection  0.8276  0.8025  0.8149 
Cubic Projection  0.8995  0.7033  0.7894 
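One plausible reading of the quadratic/cubic intensity projection, sketched under the assumption that "projection" here means clipping outliers, normalizing to [0, 1], and raising intensities to a power (the exact definition is given in Section IVB; the percentile and function name are our own illustrative choices):

```python
import numpy as np

def intensity_projection(vol, power=2, clip_percentile=99.9):
    """Hypothetical preprocessing sketch: clip extreme intensities,
    normalize to [0, 1], then apply a power law ('quadratic' for power=2,
    'cubic' for power=3) to suppress low-intensity background."""
    hi = np.percentile(vol, clip_percentile)
    v = np.clip(vol, 0, hi) / hi
    return v ** power
```

Raising normalized intensities to a power pushes dim background voxels toward zero while leaving bright vessel voxels near one, which matches the behavior observed in Table II.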
Experiment 2: Fast Crosshair Filters
To investigate the usefulness of the 3D context information extracted by the crosshair filters of DeepVesselNet, we construct a similar network as in Fig. 6 (DeepVesselNetFCN) but replace the crosshair filters by 2D filters. We train this 2.5D network on 2D slices (as described in Section IIIA) and then fuse the final slicewise results to obtain the final volumetric predictions. We perform this experiment using the synthetic dataset for centerline prediction with the training and test splits discussed in Section IVA and measure the performance of the two networks in terms of execution time per volume (Ex. time) and in terms of accuracy by the Dice score. Results are reported in Table III. We observe a benefit of 2.5D network in terms of run time (6s vs. 13s) when compared to the 3D network but at a cost of accuracy in terms of Dice Score (0.70 vs. 0.80), which can be explained by lack of 3D context information.
Method  Precision  Recall  Dice  Ex. time 

3D (Crosshair filters)  0.7763  0.8235  0.7992  13 s 
2.5D (2D on slices)  0.6580  0.7472  0.7017  6 s 
Experiment 3: The FP Rate Correction Loss Function
To test the effect of the FP rate correction in the loss function discussed in Section IIIB, we train the DeepVesselNetFCN architecture from scratch on a subsample of four clinical MRA volumes, with and without the FP rate correction described in Eq. (7). We train for 5000 iterations and record the ratio of precision to recall every 5 iterations, using a threshold of 0.5 on the probability maps. A plot of the precision-recall ratio during training without and with the FP rate correction is presented in Fig. A.2. The results of this experiment suggest that training with both factors in the loss function, as proposed in Section IIIB, keeps a better balance between precision and recall (i.e., a ratio closer to 1.0) than training without the second factor. A balanced precision-recall ratio implies that the training process is biased towards neither the background nor the foreground. This helps prevent the oversegmentation that can otherwise be caused by the introduction of class balancing.
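Eq. (7) is not reproduced here, but the general idea of pairing inverse-frequency class weights with an extra penalty on confident false positives can be sketched as follows. This is a simplified stand-in for the paper's loss, with hypothetical names, intended only to show why the second factor pushes the precision-recall ratio back toward 1:

```python
import numpy as np

def balanced_ce_with_fp_correction(p, y, eps=1e-7):
    """Sketch (not the paper's exact Eq. (7)): class-balanced cross-entropy
    plus an extra penalty on background voxels confidently predicted as vessel.

    p: predicted foreground probabilities, y: binary ground truth (same shape).
    """
    p = np.clip(p, eps, 1 - eps)
    n_fg, n_bg = y.sum(), (1 - y).sum()
    w_fg, w_bg = n_bg / y.size, n_fg / y.size   # inverse-frequency weights
    l1 = -(w_fg * y * np.log(p)
           + w_bg * (1 - y) * np.log(1 - p)).mean()
    # FP rate correction: extra loss only on false-positive voxels
    fp = (1 - y) * (p > 0.5)
    l2 = -(fp * np.log(1 - p)).sum() / max(fp.sum(), 1)
    return l1 + l2
```

The class-balancing term `l1` upweights the rare foreground (which, alone, encourages oversegmentation); the second term `l2` counteracts this by charging extra for false positives, keeping precision and recall in balance.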
Experiment 4: Pretraining on Synthetic Data
We assess the usefulness of transfer learning with synthetic data by comparing the training convergence speed and other scores obtained when pretraining DeepVesselNetFCN on synthetic data and fine-tuning on the clinical MRA dataset, against training DeepVesselNetFCN from scratch on the clinical MRA data. For this experiment, we only consider the vessel segmentation task, as no annotated clinical data is available for the centerline and bifurcation tasks. Results of this experiment are reported in Table IV. We achieve a Dice score of 0.8639 when training from scratch without pretraining on synthetic data and 0.8668 when pretraining on synthetic data, showing that pretraining makes little difference to the accuracy of the results. However, training from scratch requires about 600 more iterations for the network to converge (i.e., 50% longer).
Method  Precision  Recall  Dice  Iterations 

With pretraining  0.8644  0.8693  0.8668  1200 
Without pretraining  0.8587  0.8692  0.8639  1800 
Appendix C Full Results and Discussion on Centerline and Bifurcation Detection Tasks
Experiment 2: Centerline Prediction
For centerline prediction, we train DeepVesselNet on the synthetic dataset and test it on the synthetic as well as clinical datasets. The network uses the probabilistic segmentation masks from Experiment 1 as input (with the same training and test splits as in Experiment 1) and uses the vessel predictions to restrict predicted centerlines to vessel regions. Qualitative results are presented in Figs. 8 and A.3, together with quantitative scores in Table V. We obtain a Dice score of 0.80 for DeepVesselNetFCN, outperforming all other methods by a margin of more than 5%. We note that DeepVesselNetFCN shows a much higher precision (0.78 vs. 0.48) than the morphological-operations-based method of Schneider et al. [26], one of the standard methods for centerline prediction. The results also show that DeepVesselNetFCN is better than the competing methods in terms of execution time. Again, DeepVesselNetVNet performs slightly worse than VNet in terms of Dice score (0.67 vs. 0.75), but has an advantage in execution speed (17 s vs. 23 s), as expected.
Method  Precision  Recall  Dice  Ex. Time 

DVNFCN  0.7763  0.8235  0.7992  13 s 
DVNVNet  0.6515  0.6887  0.6696  17 s 
VNet  0.7641  0.7330  0.7482  23 s 
Schneider et al.  0.4807  0.8603  0.6168  n/a 
Experiment 3: Bifurcation Detection
For a quantitative evaluation of DeepVesselNet in bifurcation detection, we use synthetically generated data and adopt a two-input-channel strategy: the vessel segmentations from Experiment 1 form one input channel and the centerline predictions from Experiment 2 form the second, relying on the same training and test splits as in the previous experiments. In our predictions, we aim at localizing a cubic region of fixed size around the bifurcation points, which are contained within the vessel segmentation. We evaluate the results based on a hit-or-miss criterion: a bifurcation point in the ground truth is counted as a hit if a cube of fixed size centered on this point overlaps with the prediction, and counted as a miss otherwise; a hit is considered a true positive (TP) and a miss a false negative (FN); a positive label in the prediction is counted as a false positive (FP) if a cube of the same size centered on this point contains no bifurcation point in the ground truth. Qualitative results on synthetic and clinical TOF MRA data are shown in Figs. 9 and A.4, respectively. Results for [26] are obtained by first extracting the vessel network and then treating all nodes with two or more splits as bifurcations, this being one of the standard methods for bifurcation extraction. In Fig. A.5, we present the precision-recall curve obtained by varying the threshold that converts the networks' probability maps into binary predictions. Results from Table VI and Fig. A.5 show that DeepVesselNetFCN performs better than the other architectures in 5 out of 6 metrics. In our experiments, it became evident that VNet tends to overfit, possibly suffering from its high number of parameters relative to the rather few bifurcations present in our training data. This may explain why VNet's results are worse than those of all other methods, and also suggests that in cases where little training data is available, the DeepVesselNetFCN architecture may be preferable.
Method  Precision  Recall  Det. %  Mean Err  Err Std  Ex. Time 

DVNFCN  0.7880  0.9297  86.87  0.2090  0.6671  13 s 
DVNVNet  0.4680  0.5670  84.21  1.6533  0.9645  17 s 
VNet  0.2550  0.6871  70.29  1.2434  1.3857  23 s 
Schneider et al.  0.7718  0.8508  84.30  0.1529  0.7074  n/a 
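The hit-or-miss counting over ground-truth bifurcations can be sketched as below; the cube half-size `r` is illustrative (the paper's exact cube size is not reproduced here), and FP counting over predicted positives works analogously in the other direction:

```python
import numpy as np

def hit_or_miss(gt_points, pred_mask, r=2):
    """Count hits (TP) and misses (FN) of ground-truth bifurcation points.

    A point is a hit if the cube of half-size `r` voxels centered on it
    overlaps any positive voxel of the binary prediction `pred_mask`."""
    tp = fn = 0
    for (x, y, z) in gt_points:
        cube = pred_mask[max(x - r, 0):x + r + 1,
                         max(y - r, 0):y + r + 1,
                         max(z - r, 0):z + r + 1]  # numpy clips the upper bound
        if cube.any():
            tp += 1
        else:
            fn += 1
    return tp, fn
```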