Segmentation of Defective Skulls from CT Data for Tissue Modelling

by Oldřich Kodym, et al.
Brno University of Technology

In this work we present a method for automatic segmentation of defective skulls for custom cranial implant design and 3D printing purposes. Since such tissue models are usually required in patient cases with complex anatomical defects and a variety of external objects present in the acquired data, most deep learning-based approaches fall short because it is not possible to create a training dataset large enough to encompass the spectrum of all possible structures. Because CNN segmentation experiments in this application domain have so far been limited to simple patch-based CNN architectures, we first show how the use of an encoder-decoder architecture can substantially improve the segmentation accuracy. Then, we show how the number of segmentation artifacts, which usually require manual corrections, can be further reduced by adding a boundary term to CNN training and by globally optimizing the segmentation with graph-cut. Finally, we show that, using the proposed method, 3D segmentation accurate enough for clinical application can be achieved with 2D CNN architectures as well as with their 3D counterparts.



1 Introduction

Computer-assisted pre-surgical planning using generated 3D tissue models is seeing increasing use in personalized medicine [14]. In the context of craniofacial surgery, the applications range from patient education, diagnosis and operative planning [1] to patient-specific implant design [3], mostly in the cranial area. The latter has been accelerated in recent years by the advent of additive manufacturing (AM), also known as 3D printing [2]. A typical workflow for producing a pre-surgical 3D tissue model consists of data acquisition, conversion of the data into a patient model and, optionally, printing of the model. Computed tomography (CT) is usually the modality of choice because of its unparalleled hard tissue contrast, which is required for precise extraction of the model shape. As the manufacturing process is usually able to produce the model with satisfactory precision, converting the raw CT data into an accurate patient model remains the most crucial step [4].

Precise segmentation of the patient skull is therefore critical. Although simple global thresholding followed by laborious post-processing and cleaning remains the most commonly used method in medical AM [5], numerous semi- or fully automatic methods have been proposed for skull segmentation. Cuadros et al. [12] used super-voxels followed by clustering, and Ghadimi et al. [13] applied the level-set method to newborn skull segmentation in CT. Following the success of convolutional neural networks (CNNs) in biomedical segmentation in both 2D [7] and 3D [8, 9] settings, Minnema et al. used a simple patch-wise CNN for segmentation of skulls with defects for AM [11]. However, none of these methods has so far been shown to be robust enough for use in medical practice.

In this work, we propose an improved segmentation method that extracts region and boundary potentials using CNN and then uses graph-cut for globally optimal segmentation. The method outperforms methods based on conventional deep learning and other state-of-the-art methods of skull segmentation, and it produces results acceptable for the targeted use of 3D tissue modelling in the clinical practice. Furthermore, we directly compare 2D and 3D CNNs for segmentation and demonstrate that the benefit of using the 3D approach is not unequivocal.

Figure 1: Example renders of segmented skulls with the distance to the ground-truth surface in mm coded in color. Multi-view CNN segmentation outputs (top) and multi-view CutCNN segmentation outputs (bottom) are shown. To better display the differences, voxels with a surface error below a small threshold are left dark blue.

2 Proposed Method

We use the well-known U-net model [7] as a baseline method for our segmentation experiments. We experimented with both a multi-view (MV) ensemble of 3 orthogonal 2D U-nets, as used in [10], and a fully 3D U-net [8], since, to the authors' best knowledge, the current literature lacks a direct comparison between the two approaches. The applied U-net differs slightly from the original architecture: it uses batch normalization and padding during convolutions, replaces the up-conv layers with bilinear up-sampling and reduces the initial number of convolutions to 16. The architecture of the 3D model is identical except that each convolution, max-pooling, and up-sampling operation is replaced by its 3D equivalent. The networks are trained until convergence with the Dice loss function, using mini-batches whose shape differs between the 2D and the 3D model.
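The multi-view ensemble can be pictured with a minimal sketch. The text does not state how the three orthogonal predictions are fused, so simple averaging is assumed here; `predict_multiview` and the `slice_models` callables are hypothetical names standing in for the trained 2D U-nets.

```python
import numpy as np

def predict_multiview(volume, slice_models):
    """Average slice-wise predictions taken along the three volume axes.

    volume       : 3D array of CT intensities
    slice_models : three callables (hypothetical stand-ins for trained 2D
                   networks), each mapping a 2D slice to a 2D probability map
    """
    prob_sum = np.zeros(volume.shape, dtype=np.float64)
    for axis, model in enumerate(slice_models):
        # Bring the slicing axis to the front so we can iterate over slices.
        slices = np.moveaxis(volume, axis, 0)
        pred = np.stack([model(s) for s in slices], axis=0)
        # Restore the original axis order before accumulating.
        prob_sum += np.moveaxis(pred, 0, axis)
    # Averaging is an assumption; the source does not specify the fusion rule.
    return prob_sum / len(slice_models)

# Toy usage: a "model" that thresholds intensities instead of a real network.
toy_model = lambda s: (s > 0.5).astype(np.float64)
volume = np.random.default_rng(0).random((4, 5, 6))
probs = predict_multiview(volume, [toy_model] * 3)
```

In a real setting each of the three callables would be a separately trained network, one per slicing orientation.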


To improve segmentation performance on slightly out-of-distribution data (such as previously unseen medical material or defect shapes), we opted to apply 3D graph-cut segmentation on the CNN output. While this approach has been taken by other authors before [16], we also modify our CNN model to output an edge probability for each voxel in addition to the object probability. Thus, the final layer of the CNN has 3 channels instead of the standard 2. Figure 2 illustrates how this step can provide additional boundary information to the graph-cut in comparison to simply using the conventional intensity or probability gradient. Another advantage of this approach is that, since the region and boundary terms have a similar dynamic range, finding the optimal parameter of the graph-cut algorithm is simplified. We leave this parameter fixed throughout our experiments.

We train the network using a Dice loss function that combines three pairs of per-voxel quantities: the predicted probabilities of a voxel belonging to the background class and to the object class, each paired with its corresponding ground-truth label, and, analogously, the predicted probability and the ground-truth label of a voxel belonging to the object edge. The edge map ground truth is obtained by subtracting the binary object from its morphologically dilated version, leaving a surface with single-voxel thickness. Note that edge voxels overlap with the background voxels; the edge probability channel is therefore not included in the final softmax activation layer of the CNN.
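The edge ground truth and the three-channel training objective described above can be sketched in NumPy. Several details are assumptions, not taken from the source: the dilation uses a 6-connected structuring element, the edge channel uses a sigmoid activation, and the loss is an equally weighted sum of three soft Dice terms. All function names are hypothetical.

```python
import numpy as np

def dilate6(mask):
    """Binary dilation with a 6-connected (face-neighbour) structuring element."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    core = (slice(1, -1),) * 3
    out = mask.copy()
    for axis in range(3):
        for shift in (-1, 1):
            out |= np.roll(padded, shift, axis=axis)[core]
    return out

def edge_ground_truth(obj):
    """Dilated object minus object: a one-voxel shell lying in the background."""
    obj = np.asarray(obj, dtype=bool)
    return dilate6(obj) & ~obj

def head_activations(logits):
    """Split a (..., 3) output into background, object and edge probabilities.

    Softmax covers only the first two channels; the edge channel gets its own
    sigmoid (the sigmoid is an assumption).
    """
    region = np.exp(logits[..., :2] - logits[..., :2].max(axis=-1, keepdims=True))
    region /= region.sum(axis=-1, keepdims=True)
    edge = 1.0 / (1.0 + np.exp(-logits[..., 2]))
    return region[..., 0], region[..., 1], edge

def soft_dice(p, g, eps=1e-7):
    """Soft Dice overlap between a probability map p and a label map g."""
    return 2.0 * np.sum(p * g) / (np.sum(p) + np.sum(g) + eps)

def three_term_dice_loss(p_bg, p_obj, p_edge, g_obj, g_edge):
    """Sum of (1 - Dice) over background, object and edge (equal weights assumed)."""
    g_obj = np.asarray(g_obj, dtype=np.float64)
    g_edge = np.asarray(g_edge, dtype=np.float64)
    g_bg = 1.0 - g_obj
    return 3.0 - (soft_dice(p_bg, g_bg) + soft_dice(p_obj, g_obj)
                  + soft_dice(p_edge, g_edge))

# Toy usage: a single object voxel in the middle of a 5x5x5 volume.
obj = np.zeros((5, 5, 5), dtype=bool)
obj[2, 2, 2] = True
edge = edge_ground_truth(obj)  # the face neighbours of the centre voxel
```

Because the edge shell is made of background voxels, the edge labels cannot form a third mutually exclusive class, which is why the edge channel is kept outside the softmax.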

Figure 2: Example CNN output slice, from left to right: Data, object probability map, edge probability map. Notice the segmentation error caused by an external object with density similar to that of the skull in upper left. The error is correctly separated by its detected edge.

Next, the output maps are converted into a 6-connected graph structure in which the region terms of each voxel are derived from the object probability map and the boundary term between each pair of neighbouring voxels is derived from the edge probability map.

Finally, a globally optimal 3D segmentation can be obtained by finding the minimum cut through this graph [6]. This method will be referred to as CutCNN in the remaining parts of the paper. Note that while the CNN can be either MV (multi-view) or 3D, the graph-cut segmentation is always 3D. The method is summarized in Figure 3.
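A minimal sketch of this graph-cut step with the PyMaxflow library follows. The exact region and boundary terms are not recoverable from the text, so the definitions below are assumptions chosen only to keep all terms in the same [0, 1] range: the object probability and its complement serve as region terms, and one minus the edge probability serves as the boundary term. The weighting parameter `lam` and both function names are likewise hypothetical.

```python
import numpy as np

def graph_cut_terms(p_obj, p_edge, lam=1.0):
    """Build terminal and neighbour capacities from the two CNN output maps.

    Assumed terms (not from the source): region terms p_obj and 1 - p_obj,
    boundary term 1 - p_edge, region weight lam.
    """
    p_obj = np.clip(np.asarray(p_obj, dtype=np.float64), 0.0, 1.0)
    p_edge = np.clip(np.asarray(p_edge, dtype=np.float64), 0.0, 1.0)
    source_caps = lam * p_obj        # cost paid if the voxel is labelled background
    sink_caps = lam * (1.0 - p_obj)  # cost paid if the voxel is labelled object
    boundary = 1.0 - p_edge          # cutting is cheap where an edge is detected
    return source_caps, sink_caps, boundary

def cut_cnn_segment(p_obj, p_edge, lam=1.0):
    """Binary 3D segmentation by minimum cut (requires PyMaxflow)."""
    import maxflow  # imported lazily so graph_cut_terms works without it

    source_caps, sink_caps, boundary = graph_cut_terms(p_obj, p_edge, lam)
    g = maxflow.Graph[float]()
    nodeids = g.add_grid_nodes(source_caps.shape)
    # The default structure is the von Neumann neighbourhood, which is
    # 6-connected in 3D. Each voxel's weight is applied to the edges that
    # start at it, so the pairwise term is taken from one endpoint.
    g.add_grid_edges(nodeids, weights=boundary, symmetric=True)
    g.add_grid_tedges(nodeids, source_caps, sink_caps)
    g.maxflow()
    # get_grid_segments marks sink-side nodes True; object voxels are on the
    # source side, hence the negation.
    return ~g.get_grid_segments(nodeids)

# Toy usage of the term construction on a two-voxel volume.
s_caps, t_caps, b_caps = graph_cut_terms(np.array([[[0.9, 0.1]]]),
                                         np.array([[[0.0, 1.0]]]))
```

With all three terms bounded by 1, a single weighting parameter is enough to balance region against boundary evidence, which matches the remark about similar dynamic range.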

Figure 3: Scheme of the proposed segmentation framework. Input data (a) are processed by a CNN model (b) to produce a probability map (c) and an edge strength map (d). These provide the boundary and region term for the graph-cut optimization step (e) which produces the binary output segmentation (f).

3 Experiments

In this section, we present the skull tissue dataset on which the segmentation methods were evaluated. Then, we present the results of different segmentation methods on the dataset.

3.1 Dataset

Head CT scans of 199 different patients were available for this study. The scans were acquired for the purpose of patient skull modelling and its additive manufacturing or further patient-specific implant design. Therefore, pixel-wise ground-truth segmentations done by an experienced radiologist were also available for model training. The scans were acquired on multiple CT scanners using a variety of acquisition protocols, and the voxel size varied between scans.

As the majority of these scans were acquired prior to a surgery, the skulls often contained various defects, fixation materials and other external objects. This makes fully automatic segmentation of these scans a challenging task, because many of these structures were only present in a single patient scan, making generalization difficult.

3.2 Metrics

Multiple metrics were used to quantitatively compare the outputs of the different segmentation methods used in the study. Inspired by the MICCAI 2018 Medical Segmentation Decathlon challenge [17], the volumetric Dice coefficient and the surface Dice coefficient were chosen. Furthermore, the mean surface distance is also included, as it is the recommended measure in the area of medical tissue model preparation [5]. Implementations of the metrics used in this work are publicly available.

The Dice coefficient (DC) is a well-known metric in the medical segmentation domain. Given the number of true positive (TP), false positive (FP) and false negative (FN) samples, the coefficient is given by

$$\mathrm{DC} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}$$

In the case of the volumetric Dice coefficient, TP is the number of voxels assigned the object label in both the output and the ground-truth segmentation, while FP + FN is the number of voxels whose labels differ between the two.
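With TP, FP and FN counted over voxels, the volumetric Dice coefficient is commonly computed as 2TP / (2TP + FP + FN). The following NumPy sketch illustrates this; the function name is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not taken from the source.

```python
import numpy as np

def volumetric_dice(pred, truth):
    """Dice coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)    # object in both masks
    fp = np.count_nonzero(pred & ~truth)   # object only in the prediction
    fn = np.count_nonzero(~pred & truth)   # object only in the ground truth
    denom = 2 * tp + fp + fn
    # Two empty masks agree perfectly (assumed convention).
    return 1.0 if denom == 0 else 2.0 * tp / denom

# One voxel agrees, one is a false positive, one is a false negative.
score = volumetric_dice([1, 1, 0, 0], [1, 0, 1, 0])  # 2*1 / (2*1 + 1 + 1) = 0.5
```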

To compute the surface Dice coefficient, the output and the ground-truth binary segmentation volumes are converted to polygon meshes. Each surface element in the output segmentation mesh is considered a TP sample if its distance to the closest point on the ground-truth surface is lower than a threshold, and vice versa. The surface elements of the output and ground-truth meshes that exceed this threshold are counted as FP and FN, respectively. We chose the threshold to correspond to the voxel size in our experiment.
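The surface-based metrics can be approximated without meshing. This is only a sketch under stated assumptions: it treats boundary voxels as surface elements instead of mesh polygons, uses brute-force nearest-neighbour search (suitable for small volumes only), counts distances up to and including the tolerance as matches, and defines the mean surface distance as the symmetric average over both surfaces. It also assumes both masks are non-empty. All names are hypothetical.

```python
import numpy as np

def surface_points(mask, spacing=(1.0, 1.0, 1.0)):
    """Coordinates (in mm) of object voxels that touch the background."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    core = (slice(1, -1),) * 3
    interior = mask.copy()
    for axis in range(3):
        for shift in (-1, 1):
            # A voxel is interior only if all six face neighbours are object.
            interior &= np.roll(padded, shift, axis=axis)[core]
    return np.argwhere(mask & ~interior) * np.asarray(spacing)

def nearest_distances(a, b):
    """Distance from every point in a to its closest point in b (brute force)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)

def surface_metrics(pred, truth, tol=1.0, spacing=(1.0, 1.0, 1.0)):
    """Return (surface Dice, mean surface distance) for two binary volumes."""
    sp = surface_points(pred, spacing)
    st = surface_points(truth, spacing)
    d_pred_to_truth = nearest_distances(sp, st)
    d_truth_to_pred = nearest_distances(st, sp)
    total = len(sp) + len(st)
    matched = np.sum(d_pred_to_truth <= tol) + np.sum(d_truth_to_pred <= tol)
    msd = (d_pred_to_truth.sum() + d_truth_to_pred.sum()) / total
    return matched / total, msd

# Toy usage: a 3x3x3 cube compared with itself and with a shifted copy.
cube = np.zeros((7, 7, 7), dtype=bool)
cube[2:5, 2:5, 2:5] = True
shifted = np.roll(cube, 1, axis=0)
sdc_same, msd_same = surface_metrics(cube, cube)
sdc_shift, msd_shift = surface_metrics(cube, shifted, tol=0.5)
```

Output surface points without a ground-truth match play the role of false positives, and unmatched ground-truth points the role of false negatives.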

3.3 Experimental Design and Results

The performance of four different models was evaluated in this study. Both the 3D and MV CNN models and their CutCNN counterparts were implemented in the TensorFlow framework. The PyMaxflow library was used to implement the graph-cut optimization. All experiments were run on a desktop system equipped with an Nvidia Titan Xp GPU, an Intel Core i5 processor and 16 GB of RAM.

Of the 199 scans, 22 were randomly selected as test subjects for the experiment, leaving 177 skulls for model training. Using convolutional kernels of size 3 in all the CNN models results in the 3D model having the same number of trainable parameters as the sum of the three orthogonal 2D models. The comparison between the MV ensemble and the 3D approach can therefore be considered an ablation study to an extent. The CutCNN models also have a similar number of parameters, the only difference being the final edge probability output layer. A quantitative comparison of the results of each method is presented in Figure 4 and Table 1. Further qualitative results are shown in Figures 5 and 1.
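The parameter-count remark can be checked with simple arithmetic on the kernel weights: a 3x3 kernel has 9 weights per channel pair and a 3x3x3 kernel has 27, so three 2D layers match one 3D layer. The sketch below counts kernel weights only; biases and batch-normalization parameters are per-channel and would make the match approximate. The channel counts are illustrative assumptions.

```python
def conv_weights(kernel_size, ndim, c_in, c_out):
    """Number of kernel weights in one convolution layer (biases excluded)."""
    return (kernel_size ** ndim) * c_in * c_out

c_in, c_out = 16, 32  # illustrative channel counts, not taken from the paper

three_2d = 3 * conv_weights(3, 2, c_in, c_out)  # three orthogonal 2D models
one_3d = conv_weights(3, 3, c_in, c_out)        # a single 3D model
```

Both counts come out to 13824 for these channel sizes, and the equality holds for any channel sizes because 3 x 9 = 27.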

Figure 4: Accuracy of standard multi-view (MV) and 3D CNN and their CutCNN counterparts. Results are shown in terms of mean surface distance (MSD), volumetric Dice coefficient (VDC) and surface Dice coefficient (SDC).

Method                   MSD [mm]   VDC [%]   SDC [%]
MV CNN                   0.37       96.7      97.1
3D CNN                   0.35       96.7      97.0
MV CutCNN                0.31       97.7      98.3
3D CutCNN                0.32       98.0      98.1
Minnema et al. [11] *    0.44       92.0      -
Linares et al. [12] *    -          91.5      -
* Results obtained on different datasets

Table 1: Comparison of segmentation methods using mean surface distance (MSD) [mm], volumetric Dice coefficient (VDC) and surface Dice coefficient (SDC).

Figure 5: Qualitative results shown for several chosen axial slices. From top to bottom: Multi-view CNN output (red), ground-truth (magenta), multi-view CutCNN output (blue).

4 Discussion

The CutCNN segmentation framework yielded a performance gain over the standard CNN approaches in all cases and in terms of every metric used in the experiment. The object probability map produced by the CNN often contains errors near external objects or smaller tissue defects, as these are scarce in the training data distribution. However, the graph-cut optimization guides the resulting binary segmentation towards a spatially consistent and compact shape, often eliminating these artifacts if the detected edge corresponds mostly to the correct object boundary. This effect is further illustrated in Figure 1.

Our second observation is that using 3D convolutional kernels has a rather small effect on the final segmentation precision quantitatively compared to the MV approach. However, although the quantitative difference is small, for applications in medical additive manufacturing, it is important to avoid ragged segmentation output which may result from MV CNN in areas of lower model certainty. These include for example teeth, which are challenging to detect, especially when the lower and upper teeth are in contact (see Figure 5 a), or maxillary sinus, which is often enclosed in order to improve mechanical stability of the manufactured model (see Figure 5 b). Therefore, 3D U-nets are often considered necessary to avoid these discontinuities caused by slice-by-slice processing.

However, this artifact can also be addressed by employing the CutCNN framework since ragged segmentation boundary introduces a high boundary-term penalization during optimization and it is therefore avoided in the final binary segmentation. Thus, employing CutCNN allows the decision between 3D or multi-view approach to be merely a technical choice. Using 2D models can offer some advantages, such as faster training of deeper models with less overfitting [10].

We also evaluate the performance of the proposed method in the context of existing related work in skull segmentation. In terms of volumetric Dice coefficient, the proposed method achieved 97.7% in the multi-view scenario and 98.0% in the 3D scenario (Table 1). This result is considerably higher than the 92.0% reported by Minnema et al. [11]. This is probably caused by several limiting factors in the latter, including the small training set, which only allowed for a smaller CNN architecture, and the use of a patch-based approach. To our best knowledge, the presented work is the first to apply a fully automatic segmentation approach to a pathological skull dataset of this size. Furthermore, we also achieve a low mean surface distance with the proposed method, namely 0.31 mm in the multi-view scenario and 0.32 mm in the 3D scenario. Preliminary testing of the proposed method by experts in medical tissue modelling practice showed that the results are accurate enough to substantially reduce the time spent creating the model in practice when compared to currently used semi-automatic segmentation methods.

5 Conclusions

In this work, we presented CutCNN, an improved hard tissue segmentation method which integrates the CNN output with graph-cut segmentation. The results of this method surpassed those of commonly used CNN architectures such as 3D and multi-view U-nets as well as other competitive methods in the skull segmentation domain. The object and edge probability maps in combination with graph-cut provide a compact and smooth final tissue segmentation while adding very little computational cost. This method could therefore be used to improve the performance of any semantic segmentation task, provided that the edges are well defined in the data. In the future, to deal with any remaining segmentation errors, user interaction can be introduced to the method on both the CNN and the graph-cut level, as the output of both steps can be improved through user scribbles in an iterative fashion. This will further reduce the time spent producing an accurate tissue model.

Acknowledgements. This work was supported in part by the company TESCAN 3DIM. We also gratefully acknowledge the support of the NVIDIA Corporation with the donation of one NVIDIA TITAN Xp GPU for this research.


  • [1] D’Urso, P. et al.: Stereolithographic biomodelling in cranio-maxillofacial surgery: a prospective trial. Journal of Cranio-Maxillofacial Surgery, 27(1), pp.30-37. (1999)
  • [2] Mitsouras, D. et al.: Medical 3D Printing for the Radiologist. RadioGraphics. 35, 7, 1965-1988 (2015).
  • [3] Jardini, A. et al.: Cranial reconstruction: 3D biomodel and custom-built implant created using additive manufacturing. Journal of Cranio-Maxillofacial Surgery. 42, 8, 1877-1884 (2014).
  • [4] Martelli, N. et al.: Advantages and disadvantages of 3-dimensional printing in surgery: A systematic review. Surgery. 159, 6, 1485-1500 (2016).
  • [5] van Eijnatten, M. et al.: CT image segmentation methods for bone used in medical additive manufacturing. Medical Engineering & Physics. 51, 6-16 (2018).
  • [6] Boykov, Y., Jolly, M.: Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. Proceedings of Eighth IEEE International Conference on Computer Vision (ICCV). (2001).
  • [7] Ronneberger, O. et al.: U-Net: Convolutional Networks for Biomedical Image Segmentation. Lecture Notes in Computer Science. 234-241 (2015).
  • [8] Çiçek, Ö. et al.: 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016. 424-432 (2016).
  • [9] Milletari, F. et al.: V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. 2016 Fourth International Conference on 3D Vision (3DV). (2016).
  • [10] Chen, Y. et al.: Hippocampus segmentation through multi-view ensemble ConvNets. Proceedings of IEEE 14th International Symposium on Biomedical Imaging (ISBI). (2017).
  • [11] Minnema, J. et al.: CT image segmentation of bone for medical additive manufacturing using a convolutional neural network. Computers in Biology and Medicine. 103, 130-139 (2018).
  • [12] Cuadros Linares, O. et al.: Mandible and skull segmentation in cone beam computed tomography using super-voxels and graph clustering. The Visual Computer. (2018).
  • [13] Ghadimi, S. et al.: Skull Segmentation and Reconstruction From Newborn CT Images Using Coupled Level Sets. IEEE Journal of Biomedical and Health Informatics. 20, 2, 563-573 (2016).
  • [14] Zille, D., et al.: The Evolution of Surgical Planning in Orthognathic Surgery:. EC Dental Science 17.11: 1914-1919 (2018)
  • [15] Chen, L. et al.: Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016).
  • [16] Lu, F., Wu, F. et al.: Automatic 3D liver location and segmentation via convolutional neural network and graph cut. International Journal of Computer Assisted Radiology and Surgery (2017).
  • [17] Cardoso, M., J. et al.: Medical Segmentation Decathlon Workshop, Medical Image Computing and Computer-Assisted Intervention (MICCAI) (2018).