1 Introduction
Segmentation of anatomical structures is an important step in many computer-aided procedures, such as medical image navigation and detection algorithms. Many of these methods rely on manual segmentations performed by clinical experts. However, this is a time-consuming task due to the large amount of information (generally volumes) that is generated. Organ segmentation in CT or MRI slices has been a topic of research for many years. Recently, with the growth of deep learning, many architectures have been proposed for this problem. Some of the challenges for these models are related to the similarity between organs and background. This leads to misclassifications, mainly in the boundary regions of the organs, which generate false positive (FP) and false negative (FN) regions in the final results. Because of this, such models might not be accurate enough for clinical integration, where higher precision is required.

One way to improve model performance is to introduce a postprocessing refinement step in the pipeline. Even though dense graph representations of three-dimensional data have been applied for refinement [1], the use of recent graph convolutional networks (GCN) with sparse graph representations of 3D data has not been fully investigated. In this paper, we propose a two-step approach for the refinement of volumetric segmentations coming from a convolutional neural network (CNN). First, we perform an uncertainty analysis by applying Monte Carlo dropout (MCDO) [9] to the network to obtain the model’s uncertainty. This is used to divide the CNN output into high-confidence background, high-confidence foreground, and low-confidence points (FP and FN candidates). The uncertainty is also used to define a 3D shape-adapted region of interest (ROI) around the organ. With this information, we define a semi-labeled graph inside the ROI, which is then used to train a GCN in a semi-supervised way using the high-confidence nodes. The refined segmentation is obtained by evaluating the full graph with the trained GCN. Additionally, we compare our framework with the refinement resulting from fully connected conditional random field (CRF) inference [2].

Our main contributions can be summarized as follows: we present a methodology to define a semi-labeled graph representation for 3D medical images, and a refinement strategy that can be added to any CNN model (through MCDO). To the best of our knowledge, this is one of the first works employing a GCN-based refinement for CNN segmentation in medical data, and the first work combining uncertainty-based misclassified point proposal with graph-like representations and GCNs for the refinement of organ segmentation in volumetric data. We validate our method on the segmentation of the pancreas. The results are compared with CRF.
Related Work
Recent segmentation models for medical structures are based on fully convolutional neural networks (FCN). These models can be composed of aggregations of multiple 2D FCNs [3, 4] or of 3D FCNs [5, 6]. Refinement strategies are typically added at the end of the process to improve the results. Refinement can also be used as an intermediate processing step, where more complex strategies use the refined results to improve the segmentation. For example, in [7], a set of scribbles is generated by defining a CRF problem that is solved with a graph cuts method. These results can be combined with user-defined scribbles to perform an image-specific fine-tuning of a CNN segmentor. In another context, given the limited availability of labeled medical data, semi-supervised learning methods define strategies to include the (more commonly available) unlabeled medical data. Such strategies include the generation of pseudo-labels for unlabeled data. Here, refinement methods such as densely connected CRFs [11] are included in the semi-supervised steps to refine the pseudo-labels. Uncertainty has also proved to be useful as an attention mechanism in semi-supervised learning [12], and recent works in computer vision have started to explore the capabilities of uncertainty for finding potentially misclassified regions for segmentation refinement purposes [10]. In the medical context, uncertainty has been employed as a measure of quality for the segmented output [13], and its ability to reflect incorrect predictions has been recently studied [14]. This motivates the development and research of new refinement strategies based on uncertainty-driven misclassification proposal.
2 Methods
In this section we describe the process employed to refine the segmentation. First, we perform an uncertainty analysis on the CNN. Then, we use this information to define a semi-supervised GCN learning problem for segmentation refinement.
2.1 Uncertainty Analysis.
In order to define FP/FN candidates, we estimate the uncertainty of the CNN using the MCDO strategy presented in [9]. For this, we keep the dropout layers of the network active at inference time and perform $T$ stochastic passes through the network. Then, the model’s expectation is obtained using the following equation:

$$\mathbb{E}(x) \approx \frac{1}{T} \sum_{t=1}^{T} g(x, \theta_t) \tag{1}$$

with $T$ the number of MCDO passes, $g$ a CNN model (slice-wise or volumetric), $x$ the input data, and $\theta_t$ the model parameters after applying dropout in pass $t$. The misclassified candidates are based on the entropy level of the CNN, computed as:

$$\mathbb{H}(x) = -\sum_{c=1}^{M} P_c(x) \log P_c(x) \tag{2}$$

with $P_c(x)$ the probability map of $x$ for class $c$, and $M = 2$ in our binary segmentation scenario. We use $\mathbb{E}(x)$ as an approximation of the probability for computing the entropy. Since MCDO can be computationally expensive due to the multiple evaluations, we reduce the volume to the smallest cube containing the biggest 3D connected component in the CNN prediction $Y$. Finding this connected component can, by itself, increase the overall Dice score of $g$. We perform all the uncertainty analysis on this smaller area.
2.2 Graph Definition.
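As a starting point for the graph, the uncertainty analysis of Section 2.1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `toy_forward` is a hypothetical stand-in for a CNN evaluated with dropout kept active, and the names `mcdo_expectation` and `binary_entropy` are chosen here for readability.

```python
import numpy as np

def mcdo_expectation(stochastic_forward, x, n_passes):
    """Eq. (1): average the outputs of n_passes stochastic forward passes."""
    passes = [stochastic_forward(x) for _ in range(n_passes)]
    return np.mean(passes, axis=0)

def binary_entropy(expectation, eps=1e-8):
    """Eq. (2) for two classes, using the expectation as foreground probability."""
    p = np.clip(expectation, eps, 1.0 - eps)  # avoid log(0)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

# Hypothetical stand-in for a CNN with active dropout (assumption):
# a random mask drops about half of the inputs before a sigmoid.
rng = np.random.default_rng(0)

def toy_forward(x):
    keep = rng.random(x.shape) > 0.5
    logits = 4.0 * (2.0 * x * keep - 0.5)
    return 1.0 / (1.0 + np.exp(-logits))

x = np.array([0.0, 0.5, 1.0])
E = mcdo_expectation(toy_forward, x, n_passes=20)  # expectation per element
H = binary_entropy(E)                              # entropy per element
```

Elements whose prediction changes between passes end up with an expectation away from 0 and 1, and therefore with a higher entropy; these are the FP/FN candidates.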
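The graph is constructed from the set of volumes $\{x, \mathbb{E}, \mathbb{H}, Y\}$: the input intensity, the expectation, the entropy, and the CNN prediction (see Fig. 1). We use the notation $V(p)$ for the value of a particular volume $V$ at voxel $p$. We aim to obtain a refined segmentation $Y^{*}$ using a graph-based approach:

$$Y^{*} = \Gamma(G; \phi) \tag{3}$$

with $\Gamma$ a GCN, $G$ a semi-labeled graph, and $\phi$ a set of model parameters.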
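To make the evaluation in Eq. (3) concrete, the following sketch runs a forward pass of a two-layer GCN, assuming the propagation rule of [16] with symmetric normalization and self-loops. The layer sizes (32 hidden features, two output logits) follow Section 3.1. The weights are random and untrained, the graph is a toy chain, and dense matrices are used only for readability; a practical implementation would use sparse matrices and train the weights on the labeled nodes.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(adj, features, w_hidden, w_out):
    """Two-layer GCN: softmax(A_hat ReLU(A_hat X W0) W1)."""
    a_hat = normalize_adjacency(adj)
    hidden = np.maximum(a_hat @ features @ w_hidden, 0.0)  # ReLU
    logits = a_hat @ hidden @ w_out
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_nodes, n_features = 5, 3  # 3 features: intensity, expectation, prediction

# Toy chain graph with unit edge weights (illustrative only).
adj = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

features = rng.random((n_nodes, n_features))
w_hidden = 0.1 * rng.normal(size=(n_features, 32))  # hidden layer, 32 features
w_out = 0.1 * rng.normal(size=(32, 2))              # output layer, two logits

probs = gcn_forward(adj, features, w_hidden, w_out)
refined = probs.argmax(axis=1)  # per-node class after refinement
```

In the semi-supervised setting, the loss would be computed only on the labeled nodes, while the forward pass above is evaluated on the full graph.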
Since most of the voxels in the volume are irrelevant for the refinement process, we restrict the refinement to a shape-adapted ROI surrounding the uncertainty region. Given that graphs are not restricted to rectangular, structured representations of data, we can use arbitrarily shaped ROIs adjusted to our working area. The ROI is defined from $\mathbb{E}_b$ and $\mathbb{H}_b$, the binarized expectation and binarized entropy, respectively. Note that the latter gives us the FP/FN voxel candidates. The expectation is thresholded at 0.5 and the entropy at a parameter $\tau$. The voxels inside the ROI are used to define the nodes of the graph $G$, and each node is represented by a feature vector containing the intensity, expectation, and prediction at that voxel.

Edges are generated as follows: for a particular voxel (the black square in Fig. 1b), we create a connection to its six perpendicular neighbors and also to 16 randomly selected voxels inside the ROI (the blue squares in Fig. 1b). This allows the definition of a sparse graph representation, in which efficient filtering operations are implemented as products of sparse matrices [16]. To define the weights for the edges, we tested a weighting function, Eq. (4), based on Gaussian kernels considering the intensity and the 3D position associated with the nodes. In this function, $\lambda$ is a balancing factor and div is given by the diversity between the nodes [17], evaluated for the two classes of our binary case. We opt for an additive weighting instead of a multiplicative one, because the GCN can take advantage of connections with both similar and dissimilar nodes in the learning process. A multiplicative weighting could cut the connections between dissimilar nodes, whereas an additive weighting just assigns them a lower weight. Finally, we label each node in the graph according to its uncertainty level using the following rule:
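$$l(p) = \begin{cases} 1 & \text{if the CNN assigns } p \text{ to the foreground and } \mathbb{H}_b(p) = 0 \\ 0 & \text{if the CNN assigns } p \text{ to the background and } \mathbb{H}_b(p) = 0 \\ \text{unlabeled} & \text{if } \mathbb{H}_b(p) = 1 \end{cases} \tag{5}$$

where $p$ is a voxel (node) and $\mathbb{H}_b$ is the binarized entropy. In this way, we have defined a semi-supervised node classification problem on the graph, which is solved with the methods presented in [16].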
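The graph definition above can be sketched as follows. This is an illustrative reading of the text, under stated assumptions: the ROI is a hypothetical box rather than the shape-adapted ROI of the paper, the CNN label is taken from the binarized expectation, unlabeled nodes are marked with -1, and edge weights are omitted. The names `build_edges` and `node_labels` are introduced here for illustration.

```python
import numpy as np

# Offsets to the six axis-aligned (perpendicular) neighbours of a voxel.
OFFSETS = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                    [0, -1, 0], [0, 0, 1], [0, 0, -1]])

def build_edges(roi, n_random=16, seed=0):
    """Nodes are the ROI voxels; edges link grid neighbours and random voxels."""
    rng = np.random.default_rng(seed)
    coords = np.argwhere(roi)                      # (n_nodes, 3) voxel positions
    index = np.full(roi.shape, -1, dtype=int)      # voxel -> node id (-1 outside)
    index[tuple(coords.T)] = np.arange(len(coords))
    edges = set()
    for i, p in enumerate(coords):
        for q in p + OFFSETS:
            inside = np.all(q >= 0) and np.all(q < np.array(roi.shape))
            if inside and index[tuple(q)] >= 0:
                j = int(index[tuple(q)])
                edges.add((min(i, j), max(i, j)))
        # Up to n_random long-range links to other voxels inside the ROI.
        k = min(n_random, len(coords))
        for j in rng.choice(len(coords), size=k, replace=False):
            if j != i:
                edges.add((min(i, int(j)), max(i, int(j))))
    return coords, np.array(sorted(edges))

def node_labels(expectation, entropy, coords, tau):
    """1 = confident foreground, 0 = confident background, -1 = unlabeled."""
    labels = (expectation[tuple(coords.T)] >= 0.5).astype(int)
    labels[entropy[tuple(coords.T)] > tau] = -1
    return labels

# Toy volumes: a confident 2x2x2 foreground cube with one uncertain voxel.
expectation = np.zeros((4, 4, 4))
expectation[1:3, 1:3, 1:3] = 0.9
expectation[1, 1, 1] = 0.55
p = np.clip(expectation, 1e-8, 1.0 - 1e-8)
entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
intensity = np.random.default_rng(1).random((4, 4, 4))
prediction = (expectation >= 0.5).astype(float)

roi = np.zeros((4, 4, 4), dtype=bool)
roi[0:3, 0:3, 0:3] = True  # hypothetical box ROI (assumption)

coords, edges = build_edges(roi)
labels = node_labels(expectation, entropy, coords, tau=0.5)
features = np.stack([v[tuple(coords.T)]
                     for v in (intensity, expectation, prediction)], axis=1)
```

The random links are what allow information to travel between distant parts of the ROI without resorting to a fully connected graph; the edge list stays linear in the number of nodes.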
3 Experiments and Results
3.1 Implementation Details
We evaluate our framework on a 2D U-Net [15] trained for pancreas segmentation with the Dice loss. In order to compute the uncertainty, we include a dropout layer after every convolutional layer, and train for 100 epochs with the Adam optimizer and a learning rate of . We also include batch normalization after every convolutional layer to bring stability to the network. We used the publicly available NIH pancreas dataset [18, 19, 20] (https://wiki.cancerimagingarchive.net/display/Public/Pancreas-CT) for training and testing. We use 53 volumes for training and 20 volumes for testing. Nine volumes were not included in the experiments, since they appear to come from a different distribution. The GCN consists of two layers: a hidden layer with 32 feature maps and an output layer with two logits. The GCN is trained with Adam for 200 epochs with a learning rate of , at evaluation time and independently for each volume. The kernel parameters for the weighting were set to the variance of their respective arguments, and we use . For the CRF refinement we use an implementation of [2], with the CNN expectation as unary potential and the pairwise potential defined in terms of position and intensity (similar to the smoothness and appearance kernels defined in [2]). The number of MCDO samples was set to . We tried different uncertainty thresholds, ranging from to .
3.2 Results
| Model | Average DSC | Std. Dev. | Max. DSC | Min. DSC |
|---|---|---|---|---|
| U-Net | 76.00 % | 6.35 % | 85.56 % | 61.91 % |
| U-Net connected comp. | 77.04 % | 7.86 % | 88.21 % | 59.15 % |
| GCN () | 78.04 % | 7.29 % | 88.05 % | 61.88 % |
| GCN () | 77.92 % | 7.43 % | 88.10 % | 61.36 % |
| GCN () | 77.93 % | 7.41 % | 87.90 % | 61.32 % |
| GCN () | 77.96 % | 7.34 % | 87.90 % | 61.21 % |
| GCN () | 77.82 % | 7.38 % | 88.04 % | 60.85 % |
| CRF | 77.84 % | 8.33 % | 88.21 % | 53.74 % |
Table 1 shows the results for the U-Net segmentation before and after finding the largest connected component (U-Net connected comp.), together with the Dice score for the GCN strategy using different uncertainty thresholds, and the CRF performance. The results show a larger improvement in the Dice score when using the GCN-based refinement, which at its best threshold setting reaches an average of 78.04 %, compared with 77.84 % for the CRF and 77.04 % for the connected-component baseline. These results were achieved with a sparse graph representation, showing that more efficient connectivity strategies can be applied instead of fully connected representations, and that GCNs can make use of these sparse representations. Visual results are presented in Fig. 2. Rows 1 and 2 show how the graph-based methods can use connectivity relationships to recover missing regions. However, in row 3 we can see a loss of connectivity. This happens because both models include the CNN expectation in their definition, which causes disconnections when the expectation differs strongly from the real segmentation. In these cases, however, the GCN is more robust to this problem and keeps part of the voxels, compared with the CRF method. In this context, different weighting methodologies can be investigated in order to avoid disconnection.
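For reference, the Dice similarity coefficient (DSC) reported in Table 1 measures the overlap between a predicted mask and the reference mask as twice the size of their intersection divided by the sum of their sizes. A minimal sketch is given below; returning 1.0 for two empty masks is a convention assumed here, not something stated in the paper.

```python
import numpy as np

def dice_score(prediction, reference):
    """Dice similarity coefficient between two binary masks."""
    prediction = np.asarray(prediction, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    denom = prediction.sum() + reference.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(prediction, reference).sum() / denom

pred = np.array([1, 1, 0, 0])
ref = np.array([1, 0, 0, 0])
score = dice_score(pred, ref)  # 2 * 1 / (2 + 1)
```

Because the denominator counts both masks, a single spurious component far from the organ lowers the score, which is consistent with the gain obtained by keeping only the largest connected component.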
4 Conclusion
In this work, we have presented a method to construct a sparse semi-labeled graph representation of volumetric medical data, based on the output and uncertainty analysis of a CNN segmentation. We have also shown that GCN learning strategies can be used on this graph to obtain a refined segmentation. Future research can be directed toward alternative definitions of connectivity, weighting, and node representation.
References
[1] Kamnitsas, K., Ledig, C., et al.: Efficient Multi-Scale 3D CNN with Fully Connected CRF for Accurate Brain Lesion Segmentation. Medical Image Analysis. (2016)
[2] Krähenbühl, P., and Koltun, V.: Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. In: Proceedings of the 24th International Conference on Neural Information Processing Systems (NIPS). (2011) 109-117
[3] Zhou, Y., Xie, L., et al.: A Fixed-Point Model for Pancreas Segmentation in Abdominal CT Scans. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2017) 693-701
[4] Roth, H., Lu, L., Lay, N., et al.: Spatial Aggregation of Holistically-Nested Convolutional Neural Networks for Automated Pancreas Localization and Segmentation. Medical Image Analysis. (2018) 94-107
[5] Zhu, Z., Xia, Y., et al.: A 3D Coarse-to-Fine Framework for Volumetric Medical Image Segmentation. In: 2018 International Conference on 3D Vision (3DV). (2018) 682-690
[6] Roth, H., Oda, M., et al.: Towards Dense Volumetric Pancreas Segmentation in CT Using 3D Fully Convolutional Networks. Proc. SPIE 10574, Medical Imaging 2018: Image Processing. (2018)
[7] Wang, G., Li, W., Zuluaga, M. A., et al.: Interactive Medical Image Segmentation Using Deep Learning With Image-Specific Fine Tuning. IEEE Transactions on Medical Imaging. (2018)
[8] Yu, L., Cheng, J.-Z., et al.: Automatic 3D Cardiovascular MR Segmentation with Densely-Connected Volumetric ConvNets. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2017) 287-295
[9] Kendall, A., and Gal, Y.: What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS). (2017)
[10] Dias, P. A., and Medeiros, H.: Semantic Segmentation Refinement by Monte Carlo Region Growing of High Confidence Detections. In: Asian Conference on Computer Vision (ACCV). (2019)
[11] Bai, W., Oktay, O., et al.: Semi-supervised Learning for Network-Based Cardiac MR Image Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2017) 253-260
[12] Xia, Y., Liu, F., Yang, D., et al.: 3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training. (2018) arXiv:1811.12506
[13] Guha Roy, A., Conjeti, S., Navab, N., and Wachinger, C.: Inherent Brain Segmentation Quality Control from Fully ConvNet Monte Carlo Sampling. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2018)
[14] Nair, T., Precup, D., Arnold, D. L., and Arbel, T.: Exploring Uncertainty Measures in Deep Networks for Multiple Sclerosis Lesion Detection and Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2018)
[15] Ronneberger, O., Fischer, P., and Brox, T.: U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2015) 234-241
[16] Kipf, T. N., and Welling, M.: Semi-Supervised Classification with Graph Convolutional Networks. In: International Conference on Learning Representations (ICLR). (2017)
[17] Zhou, Z., Shin, J., et al.: Fine-Tuning Convolutional Neural Networks for Biomedical Image Analysis: Actively and Incrementally. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (2017)
[18] Roth, H. R., Farag, A., et al.: Data From Pancreas-CT. The Cancer Imaging Archive. (2016) http://doi.org/10.7937/K9/TCIA.2016.tNB1kqBU
[19] Roth, H. R., Lu, L., et al.: DeepOrgan: Multi-level Deep Convolutional Networks for Automated Pancreas Segmentation. In: Medical Image Computing and Computer Assisted Intervention (MICCAI). (2015)
[20] Clark, K., Vendt, B., et al.: The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging 26(6). (2013) 1045-1057