1 Introduction
Biological structures are segmented to support medical diagnosis, surgical planning and treatment. Based on fully convolutional networks (FCN) and UNet [31, 26], deep convolutional networks (DCNNs) have made significant improvements in biomedical image segmentation. Owing to their high efficiency and their ability to capture information automatically without hand-designed features, deep learning methods have come to dominate biomedical image analysis. Because of segmentation abnormalities and histological variations, biomedical images demand a higher level of pixel-wise prediction than natural images; in particular, a marginal bias in biomedical segmentation can result in a false clinical treatment. The improvement of segmentation accuracy therefore continues to attract attention. Recent works such as UNet apply skip connections to combine feature maps from the current layer with higher-layer feature maps, and have proved competitive at maintaining fine-grained information; meanwhile, segmentation masks are generated with contextual details even when the background composition is rather complicated. We divide existing dense-connection designs into two categories.
1) Intra-block dense connections, which embed a dense block into the traditional convolutional block, such as FDUNet [15]. In addition, cascades of stacked UNets have also gained attention: CUNet [9] performs dense connections at the same level among multiple UNets. However, these works do not consider transforming the size of the feature maps, so they are substantially different from our work. 2) Inter-block dense connections, in which the current layer can fuse feature maps of different scales from previous layers. For instance, MIMONet [30] takes input images of different scales in the encoder unit, but the feature maps are not actually reused. UNet++ [45] fuses higher-resolution feature maps in the decoder unit, but it involves massive computational costs due to the large number of intermediate convolutions, and its current layer can only fuse feature maps from higher layers. Inspired by DenseNet [18], in order to improve segmentation accuracy, we directly downsample features from lower layers and upsample features from higher layers to the resolution of the current layer, and fuse them with the feature maps of the current layer. We apply a 1×1 convolution twice to keep the number of channels the same as before, so the whole operation adds only a small constant number of extra parameters. To the best of our knowledge, we are the first to explore directly fusing deep, semantic, coarse-grained feature maps from higher layers with low-level, fine-grained feature maps from lower layers. The modified fusing operation carries more object information and pixel information, and therefore improves segmentation in the UNet architecture. We also systematically analyze the impact of different kinds of densely connected structures; the experiments show that fusing feature maps from higher and lower layers simultaneously is more effective and achieves a higher precision.
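To make the channel bookkeeping of this fusion concrete, here is a minimal sketch; the function name and channel numbers are illustrative assumptions, not taken from the paper's implementation. Neighboring feature maps are resized to the current resolution and concatenated with the current map, and a 1×1 convolution then restores the original channel count.

```python
def fused_channels(current_c, neighbor_cs):
    """Channel count after concatenating the current feature map with
    neighbor feature maps resized to the current resolution.
    Concatenation itself adds no parameters; only the following
    1x1 convolution (restoring current_c channels) does."""
    return current_c + sum(neighbor_cs)

# Hypothetical example: a 64-channel layer fusing a 32-channel lower
# layer and a 128-channel higher layer.
concat_c = fused_channels(64, [32, 128])  # 224 channels before the 1x1 conv
```

Because only the 1×1 convolution carries weights, the overhead of the fusion is independent of spatial resolution.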
The contributions of our work are: 1) systematically conducting complete experiments and analysis of the influence of multiscale dense connections on UNet; 2) adopting the optimal model based on these experiments and proposing a novel Multiscale Dense UNet (MDUNet) architecture with quantization. The proposed model outperforms UNet by up to 3% on Test A and 4.1% on Test B.
2 Related Work
In this section, we review recent approaches to the UNet architecture, dense connections, multiscale representation, network quantization and biomedical image segmentation.
2.1 UNet architecture
Models are designed as encoder-decoder architectures to recover high-resolution representations of the image from low-resolution ones. [31] initially proposed the U-shape network architecture with direct skip connections between the encoder and decoder, and later work systematically analyzed and proved the importance of long skip connections in UNet for biomedical image segmentation. Beyond image segmentation, a variety of tasks build on UNet-based architectures. Stacked UNets [34] iteratively fuse multiscale features without changing the resolutions. To deal with human pose estimation tasks, [27, 38, 41] stacked modified UNets which capture both top-down and bottom-up features as a whole. [33, 12] follow a grid pattern in the U-shape structure. In a more general manner, [24] additionally employed multi-path refinement and global convolutional blocks between the encoder and decoder. The classification and localization problems are solved simultaneously during the successive downsampling and upsampling operations in UNet. Furthermore, we conduct detailed experiments on the impact of various dense connections on the UNet architecture.

2.2 Dense connections
Recently, both the depth and the width of network architectures have been actively explored. Approaches toward wider networks began with [36, 37], which introduced the 'Inception Module', concatenating feature maps to approximate a sparse structure. Moreover, residual networks [17, 19] alleviated the vanishing gradient problem by summing a shortcut connection with the residual function. Recent methods such as PSPNet [43] and RefineNet [24] frequently apply residual architectures as feature extractors in dense prediction tasks. [11] combined UNet with a residual network and proved skip connections effective in biomedical image segmentation. Additionally, to improve representational power without increasing the depth or width of the network, [18] proposed a typical structure of dense connections: within a dense block, the output of each convolution unit contributes to all subsequent units as input through concatenation. With substantially fewer parameters, the network enables feature reuse and better gradient flow, and therefore yields extremely competitive results. FC-DenseNet [21] extended DenseNet [18] by replacing the convolutional blocks of FCN with dense blocks and introducing transition-up modules in the upsampling path to deal with semantic segmentation. [40] further improved dense decoder blocks with feature-level long-range skip connections; with a cascaded single-pass architecture, the network obtained surprising results with fewer computational costs on multiscale tasks. The compact structure of dense connections integrates shortcut connections, feature reuse and implicit deep supervision while introducing no extra optimization difficulties. Apart from directly adding dense connections in convolutional blocks, [1] composed denser scale sampling and denser pixel sampling in an atrous spatial pyramid pooling module [4]. Dense connections have proved extraordinarily effective in biomedical image processing due to the limited amount of data: [15] incorporated dense connectivity [22] within the encoder and decoder paths, and to address the spatial information of 3D input data, [23] used a 2D DenseUNet as an intra-slice feature extractor along with a hybrid feature fusion module to formulate end-to-end learning.
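As a reminder of how DenseNet-style concatenation grows the channel count, the following sketch computes the input width seen by each layer of a dense block; the growth rate and layer count below are assumed for illustration, not values from any cited paper.

```python
def dense_block_channels(c_in, growth_rate, num_layers):
    """Channel counts along a DenseNet-style dense block: each layer
    appends growth_rate new feature maps to the running concatenation,
    so layer i sees c_in + i * growth_rate input channels."""
    return [c_in + i * growth_rate for i in range(num_layers + 1)]

# Example: a 4-layer block with 64 input channels and growth rate 12.
widths = dense_block_channels(64, 12, 4)  # [64, 76, 88, 100, 112]
```

The linear (rather than multiplicative) growth is why dense blocks stay parameter-efficient despite the heavy feature reuse.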
Inspired by the previous literature, we generalize dense connections to extend feature fusion and contextual information across scales between the encoder and decoder.

2.3 Multiscale Representation
Approaches to encoding multiscale context information are widely explored. Besides the encoder-decoder structures discussed above, image pyramids [3, 5, 25] are frequently constructed so that the network observes objects at various scales. Dilated or atrous convolutions [3, 4, 42], deployed in parallel or in cascade, expand the receptive field while introducing no extra parameters. Further, ASPP [4] arranges atrous convolutions in parallel within spatial pyramid pooling to efficiently capture features at arbitrary scales; in particular, DenseASPP [1] stacks ASPP modules in a denser manner. Beyond atrous convolution, deformable convolution [7] generalizes it by learning offsets for the spatial sampling locations.
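The parameter-free growth of the receptive field under dilation follows a simple formula; this sketch assumes the standard definition of atrous convolution.

```python
def dilated_kernel_extent(kernel_size, dilation):
    """Effective spatial extent of a dilated (atrous) kernel:
    k + (k - 1) * (d - 1). The number of weights stays k * k."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

# A 3x3 kernel covers 5 positions per axis at dilation 2 and 9 at
# dilation 4, with no increase in parameters.
```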
2.4 Network Quantization
Usually, the increasing scale of a network results in high consumption of computational resources and relatively difficult optimization. Quantization techniques for training deep neural networks are gaining growing attention, and recent approaches [16, 6, 29] have succeeded in reducing the scale of the network by cropping the precision of operations and operands. Incremental network quantization [44] compresses the parameters to powers of two or zero through iterative weight partition, group-wise quantization and retraining. This pruning-inspired strategy treats the quantized parameters as a weak model and compensates the loss of precision by retraining the remaining parameters. Quantization improves the generalization of the network and its robustness to potential overfitting, at the cost of a subtle loss of precision.

2.5 Biomedical Image Segmentation
Previously, hand-crafted features containing morphological information were designed, and traditional graph-based models were frequently used [20, 28, 35, 13]. However, malignant subjects vary seriously in appearance and are beyond the capacity of traditional methods. Therefore, deep learning methods have dominated biomedical image processing in recent years [8, 10, 32], especially in histological section analysis [2, 32]. To relieve the effort of manual annotation, Suggestive Annotation [esa] combined a fully convolutional network with active learning to select hard examples for further annotation. [2, 14] modified the loss functions and achieved promising results for gland segmentation. In addition, MIMONet [30] deals with the variation of intense cell boundaries and sizes by exploiting multiple inputs and multiple outputs in the network. To this end, we propose a simple yet effective multiscale connectivity pattern for biomedical image segmentation.

3 Method
In this section, we first introduce three multiscale densely connected blocks: in the encoder, in the decoder, and across the encoder and decoder. The overall architecture of our network, which combines the three multiscale densely connected blocks, is illustrated in Figure 2; we also compare the proposed blocks with UNet in detail. Secondly, we describe the implementation of quantization in the proposed model, which reduces overfitting.
3.1 Dense Encoder and Decoder Blocks
Our improvements are based on UNet, so let us briefly review its basic structure. A traditional encoder unit is defined as on the left of Figure 3, which relates the input and output of the current layer and the downsampled output; Eq. 1 and Eq. 2 describe the process.
Our method replaces the standard encoder input with the fused input defined in Eq. 4. We encode the feature maps from the previous layers I−n to I−2, adjusted to the same size as the current layer, and fuse them with the feature maps of the current layer through a concatenation operation followed by a convolution. Here n is the number of ordered previous layers whose feature maps the current layer fuses; the influence of the dense connection number n is discussed in Section 4.1.
Specifically, each convolutional block is composed of two repeated cascaded convolution structures, each followed by batch normalization and a ReLU activation function. Figure 1 shows an example of a densely connected decoder unit. The dense decoder block is similar to the dense encoder block, so we do not repeat its description. There are also meaningful multiscale densely connected variants of the above in the encoder or decoder, namely the multi-input (Min) and multi-output (Mout) variants, as shown in Eq. 5 and Eq. 6. In a Min densely connected unit, each layer fuses only the feature maps from the input, downsampled to the corresponding size; in a Mout densely connected unit, only the last layer fuses all the feature maps from previous layers, upsampled to the corresponding size.
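The difference between the dense variant with window n and the Min variant can be sketched by listing which earlier layers each encoder layer fuses; the zero-based indexing below is our own convention, not the paper's.

```python
def dense_fusion_sources(layer, n):
    """Indices of the previous layers whose resized feature maps
    layer `layer` fuses under a dense window of size n."""
    return list(range(max(0, layer - n), layer))

def min_fusion_sources(layer):
    """Multi-input (Min) variant: every layer fuses only the network
    input (layer 0), downsampled to the corresponding size."""
    return [0] if layer > 0 else []

# Example: with a window of 2, layer 4 fuses layers 2 and 3, while
# under the Min variant it fuses only the input.
```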
3.2 Dense Cross connections Block
In this section, we also start from the traditional UNet cross connections. As shown in Figure 4, a traditional cross-connection unit is defined by Eq. 7, Eq. 8 and Eq. 9, which relate the input and output of the current decoder layer, the corresponding encoder feature map, and the upsampled output of the previous decoder layer. The unit encodes the feature maps from layer I−1 of the encoder together with the upsampled output of the previous layer of the decoder.
Our method replaces this input with the fused input defined in Eq. 11. We encode two groups of encoder feature maps, from the higher-ordered layers I+1 to I+n and the lower-ordered layers I−n to I−1, and fuse them with the feature maps of the current layer through the same concatenation operation, which adjusts the number of channels to be the same as before.
There are meaningful densely connected variants here as well, namely Upper and Lower, as shown in Eq. 12 and Eq. 13. In an Upper densely connected unit, each layer in the decoder fuses only the features from the upper layers of the encoder, while in a Lower densely connected unit it fuses only the features from the lower layers of the encoder.
3.3 Fully Multiscale Dense connected Ushape architecture
In this section, we introduce the fully densely connected U-shape architecture based on UNet. As illustrated in Figure 2, the improved encoder structure is identical to Section 3.1. The decoding structure is the combination of the multiscale dense cross connections and the multiscale dense decoder; the detailed formulation follows Eq. 7, 8 and 9 in Section 3.2, and the variants and operations share the same description as Section 3.2.
FMDUNet encodes, in each decoding block, the dense cross connections and the densely connected decoder, using the corresponding encoder feature maps of different scales and the feature maps from previous decoding blocks, respectively. We re-encode the information obtained from the first encoding operation, and the encoded feature maps share the same number of channels as the original ones.
3.4 Network Quantization
As the increasing scale of the network results in high consumption of computational resources and relatively difficult optimization, we adopt Incremental Network Quantization (INQ) to compress the parameters, acting as a regularization against potential overfitting. We integrate the results of multiple networks as the final result; the number of parallel models follows [39]. INQ quantizes the parameters to powers of two or zero, which makes shift operations possible, as shown in Eq. 18, where the original weights are mapped to their quantized values and u and l refer to the upper and lower bounds of the quantization range. Iteratively, half of the weights are quantized and fixed, and the network is then fine-tuned end-to-end until all the parameters are quantized. We experiment with 3, 5 and 7 bits to reduce the overfitting of dense connections in Section 4.4.
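A simplified sketch of the power-of-two quantization step follows; the exact partition thresholds of INQ [44] differ in detail, and rounding in log space is our own assumption here.

```python
import math

def inq_quantize(w, lower, upper):
    """Map a weight to zero or +/- 2^e with exponent e in
    [lower, upper]; a simplified stand-in for the INQ rule."""
    if w == 0.0:
        return 0.0
    e = round(math.log2(abs(w)))
    if e < lower:              # magnitude too small: prune to zero
        return 0.0
    e = min(e, upper)          # clip large magnitudes to the top bin
    return math.copysign(2.0 ** e, w)

# Examples with exponent bounds [-4, -1]:
#   0.3  -> 0.25 (2^-2), -0.9 -> -0.5 (clipped to 2^-1),
#   0.01 -> 0.0  (below the lower bound).
```

Because every surviving weight is a signed power of two, the multiplications in inference can be replaced by bit shifts, which is the property the paragraph above relies on.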
4 Experiments
To evaluate the proposed model thoroughly, we use the Gland Segmentation (GlaS) dataset, a biomedical image dataset from the Histology Image Challenge held at MICCAI 2015. It contains 165 images derived from 16 H&E stained histological sections of colon cancer. 85 images (37 benign and 48 malignant) are selected as the training set, while 80 images (37 benign and 43 malignant) are used for testing; specifically, the test images are separated into two parts (60 in Test Part A and 20 in Test Part B). We train our proposed end-to-end network with backpropagation on two NVIDIA GeForce GTX TITAN X GPUs, each with 12 GB of memory. We set the learning rate to 0.005 at the beginning and divide it by 10 every time the iteration reaches a threshold. The SGD optimization algorithm and a batch size of 4 are used during training, and the optimal model is selected based on the performance on the training set. Additionally, we conduct experiments on dense connections of various sizes and shapes. For the dense encoder and dense decoder blocks, we compare the number of connections from 1 to 4 and the two special cases (Min and Mout) mentioned before; for the dense cross block, due to the limited depth of the network, we examine the effectiveness on a smaller range of connection settings. Besides, the performance of quantization is evaluated independently.
As illustrated in Figure 5, we compare the training loss of the original UNet, the single densely connected models and the combined densely connected model based on UNet over the first four hundred epochs. After the first one hundred epochs, our proposed models are more stable than the original UNet, which supports our conclusion that dense connections improve the information flow within the encoder, within the decoder and across them, and that multiscale dense connections achieve a higher precision. We also compare the outputs of our various proposed densely connected models with those of the original UNet; the performance of our proposed models is better. In the next subsections, we discuss the effect of the number of dense connections in the single and combined models based on UNet, from which we find that too many dense connections may lead to overfitting even while improving accuracy. We also discuss the impact of our method of model quantization for reducing this overfitting.
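The step decay of the learning rate described in the training setup can be sketched as follows; the threshold iterations below are hypothetical, since the paper does not list them.

```python
def step_lr(base_lr, iteration, thresholds):
    """Start from base_lr and divide by 10 each time the iteration
    passes one of the given thresholds."""
    drops = sum(1 for t in thresholds if iteration >= t)
    return base_lr / (10 ** drops)

# With assumed thresholds at 10k and 20k iterations, the rate is
# 0.005 at first, then 0.0005, then 0.00005.
```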
4.1 Discussion on the number of dense connections
In this section, we explore in detail the influence of each dense structure (dense encoder block, dense decoder block, dense cross connections) as the number of connections varies. As shown in Tables 1, 2 and 3, each structure is listed with the corresponding number of connections. The experiments show that accuracy generally increases with the number of dense connections. This indicates that dense connections, which carry the encoded object information from higher layers and the pixel information from lower layers, improve feature reuse and thus yield a promising segmentation accuracy. On the MICCAI 2015 Gland dataset, the modified structure obtains an accuracy of 91.8% on Test A and 87.1% on Test B, a superiority of 2% on average over UNet.
Method  mean IoU  Dice Coefficient  

A  B  A  B  
0.797  0.738  0.886  0.853  
0.841  0.753  0.906  0.862  
0.852  0.771  0.915  0.871  
0.856  0.772  0.918  0.869  
0.859  0.779  0.919  0.877  
0.861  0.778  0.919  0.872 
Method  mean IoU  Dice Coefficient  

A  B  A  B  
0.797  0.738  0.886  0.853  
0.841  0.759  0.908  0.861  
0.852  0.768  0.915  0.866  
0.857  0.770  0.917  0.870  
0.860  0.784  0.919  0.877  
0.861  0.784  0.920  0.870 
Method  mean IoU  Dice Coefficient  

A  B  A  B  
0.797  0.738  0.886  0.853  
0.852  0.762  0.917  0.866  
0.855  0.766  0.918  0.870  
0.857  0.770  0.916  0.868  
0.861  0.778  0.920  0.872 
Method  mean IoU  Dice Coefficient  

Part A  Part B  Part A  Part B  
Unet  0.797  0.738  0.902  0.842 
0.853  0.764  0.916  0.864  
0.859  0.770  0.918  0.870  
0.863  0.768  0.920  0.871  
0.866  0.764  0.925  0.857 
4.2 Discussion on the Combination of three Dense connections
In this section, we investigate the impact of combining the three different densely connected blocks. We concluded before that an increasing number of dense connections results in a better-performing model, so we select the four-connection variant as the basic encoder component, indicating that the feature maps in each encoding block contribute to four subsequent blocks, and the decoder component is chosen in the same manner. Note that the cross connections consist of two upper connections from subsequent layers, two lower connections from previous layers and the direct skip connection as in UNet. We systematically conduct experiments combining two or three basic components; the results are shown in Table 4. In Test A, either combination of two or three achieves a reasonable improvement. However, in Test B, the performance drops compared with the single model. We believe the decreased accuracy is caused by potential overfitting, as the distributions of the training set and Test A are closer. In the next section, we explore quantization methods to reduce this overfitting.
4.3 Discussion on network efficiency
Apart from assessing segmentation accuracy, we evaluate the number of parameters of the network. Recent methods based on UNet tend to be wider, deeper and more complicated to optimize, and deep supervision turns out to be an efficient trick for auxiliary training. In contrast, even the extremely dense structure we propose adds only a tiny number of parameters compared with UNet: owing to the reuse of feature maps and the concatenation operation, no extra computations or parameters are involved except for the 1×1 convolutions. Table 5 compares the parameter counts of several strong methods. We achieve state-of-the-art accuracy with a negligible increase in parameters, which reveals the high efficiency of our proposed model. Moreover, our model shows valuable extensibility and can serve as a backbone in place of UNet for U-shape based networks.
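The small overhead claimed above follows from the fact that a 1×1 convolution contributes parameters proportional only to its channel counts; the channel numbers below are illustrative assumptions, not the paper's exact configuration.

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a k x k convolution, ignoring bias."""
    return c_in * c_out * k * k

# A fusion concatenating 224 channels and reducing back to 64 adds:
fusion_extra = conv_params(224, 64, 1)   # 14,336 parameters
# compared with a single 3x3 layer at width 64:
plain_block = conv_params(64, 64, 3)     # 36,864 parameters
```

Summed over a handful of fusion points, overheads of this size are consistent with the roughly 0.005M-per-block increments reported in the table below.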
Method  parameter number 

UNet  8M 
U + dense encoder block  8M + 0.005M 
U + dense decoder block  8M + 0.005M 
U + dense cross connections  8M + 0.005M 
MDUnet^{*}  8M + 0.015M 
Unet++  8M + 1M 
MILDnet  8M + 68M 
MIMOnet  8M + 166M 

MDUNet^{*} means that the framework contains the three dense connections based on UNet
Method  mean IoU  Dice Coefficient  

Part A  Part B  Part A  Part B  
MDUNet  0.866  0.764  0.925  0.857 
^{*}  0.871  0.784  0.925  0.873 
0.866  0.790  0.923  0.876  
0.859  0.791  0.918  0.865  
0.872  0.772  0.928  0.878  
0.865  0.786  0.922  0.876  
0.857  0.750  0.916  0.881  
0.867  0.776  0.919  0.871  
0.862  0.772  0.925  0.870  
0.859  0.768  0.922  0.878  

The subscript 1/2 means that half of the parameters of the model are quantized
4.4 Discussion on network quantization
In this section, we explore quantization methods to improve the performance of our proposed network. In particular, Incremental Network Quantization is applied to quantize the parameters in order to reduce the overfitting problem, rather than for model compression. We analyze quantized models of different degrees, because completely quantizing all the parameters leads to a reduction in segmentation accuracy. As stated in Table 6, the overfitting problem is largely reduced after the first quantization operation, in which half of the parameters are quantized: the performance on Test B improves as expected while the prediction accuracy on Test A is maintained, and the generalization ability of the model is enhanced compared with the fully quantized model. We obtain a surprisingly competitive accuracy of 0.88 on Test B. On balance, we adopt the half-quantized architecture as our final model.
5 Conclusion
In this paper, we propose three different multiscale dense connection schemes for the encoder, the decoder and the cross connections of U-shaped architectures. Our architecture directly fuses neighboring feature maps of different scales from both higher and lower layers to strengthen feature propagation in the current layer, which largely improves the information flow within the encoder, within the decoder and across them. We then explore their effects in detail based on UNet. The experiments show that accuracy generally increases with the number of dense connections. We adopt the optimal model based on the experiments and propose a novel MDUNet combining the three densely connected architectures with quantization, which reduces the overfitting caused by dense connections. Finally, our model improves the Dice coefficient over UNet by up to 3% on Test A and 4.1% on Test B.
References

[1] P. Bilinski and V. Prisacariu. Dense decoder shortcut connections for single-pass semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6596–6605, 2018.
[2] H. Chen, X. Qi, L. Yu, and P.-A. Heng. DCAN: deep contour-aware networks for accurate gland segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2487–2496, 2016.
 [3] L.C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE transactions on pattern analysis and machine intelligence, 40(4):834–848, 2018.
 [4] L. C. Chen, G. Papandreou, F. Schroff, and H. Adam. Rethinking atrous convolution for semantic image segmentation. 2017.
 [5] L. C. Chen, Y. Yang, J. Wang, W. Xu, and A. L. Yuille. Attention to scale: Scaleaware semantic image segmentation. In Computer Vision and Pattern Recognition, pages 3640–3649, 2016.
 [6] M. Courbariaux, I. Hubara, D. Soudry, R. ElYaniv, and Y. Bengio. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or 1. 2016.
 [7] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. Deformable convolutional networks. pages 764–773, 2017.
 [8] N. Dhungel, G. Carneiro, and A. P. Bradley. Deep learning and structured prediction for the segmentation of mass in mammograms. In International Conference on Medical Image Computing and ComputerAssisted Intervention, pages 605–612. Springer, 2015.
 [9] L. Dong, L. He, M. Mao, G. Kong, X. Wu, Q. Zhang, X. Cao, and E. Izquierdo. Cunet: a compact unsupervised network for image classification. IEEE Transactions on Multimedia, 20(8):2012–2021, 2018.

[10] Q. Dou, H. Chen, L. Yu, L. Zhao, J. Qin, D. Wang, V. C. Mok, L. Shi, and P.-A. Heng. Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks. IEEE Transactions on Medical Imaging, 35(5):1182–1195, 2016.
[11] M. Drozdzal, E. Vorontsov, G. Chartrand, S. Kadoury, and C. Pal. The importance of skip connections in biomedical image segmentation. pages 179–187, 2016.
 [12] D. Fourure, R. Emonet, E. Fromont, D. Muselet, A. Tremeau, and C. Wolf. Residual convdeconv grid network for semantic segmentation. arXiv preprint arXiv:1707.07958, 2017.
 [13] H. Fu, G. Qiu, J. Shu, and M. Ilyas. A novel polar space random field model for the detection of glandular structures. IEEE transactions on medical imaging, 33(3):764–776, 2014.
 [14] S. Graham, H. Chen, Q. Dou, P.A. Heng, and N. Rajpoot. Mildnet: Minimal information loss dilated network for gland instance segmentation in colon histology images. arXiv preprint arXiv:1806.01963, 2018.
 [15] S. Guan, A. Khan, S. Sikdar, and P. V. Chitnis. Fully dense unet for 2d sparse photoacoustic tomography artifact removal, 2018.
 [16] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. Fiber, 56(4):3–7, 2015.
 [17] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. pages 770–778, 2015.
 [18] G. Huang, Z. Liu, L. V. D. Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2261–2269, 2017.
 [19] G. Huang, Y. Sun, Z. Liu, D. Sedra, and K. Q. Weinberger. Deep networks with stochastic depth. pages 646–661, 2016.

[20] J. G. Jacobs, E. Panagiotaki, and D. C. Alexander. Gleason grading of prostate tumours with max-margin conditional random fields. In International Workshop on Machine Learning in Medical Imaging, pages 85–92. Springer, 2014.
[21] S. Jegou, M. Drozdzal, D. Vazquez, A. Romero, and Y. Bengio. The one hundred layers tiramisu: Fully convolutional DenseNets for semantic segmentation. pages 1175–1183, 2016.
 [22] K. H. Jin, M. T. Mccann, E. Froustey, and M. Unser. Deep convolutional neural network for inverse problems in imaging. IEEE Transactions on Image Processing A Publication of the IEEE Signal Processing Society, 26(9):4509–4522, 2016.
 [23] X. Li, H. Chen, X. Qi, Q. Dou, C. W. Fu, and P. A. Heng. Hdenseunet: Hybrid densely connected unet for liver and tumor segmentation from ct volumes. IEEE Transactions on Medical Imaging, PP(99):1–1, 2017.
 [24] G. Lin, A. Milan, C. Shen, and I. D. Reid. Refinenet: Multipath refinement networks for highresolution semantic segmentation. In Cvpr, volume 1, page 5, 2017.
 [25] G. Lin, C. Shen, A. V. D. Hengel, and I. Reid. Efficient piecewise training of deep structured models for semantic segmentation. pages 3194–3203, 2015.
 [26] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3431–3440, 2015.
 [27] A. Newell, K. Yang, and J. Deng. Stacked hourglass networks for human pose estimation. In European Conference on Computer Vision, pages 483–499. Springer, 2016.
 [28] K. Nguyen, A. Sarkar, and A. K. Jain. Structure and context in prostatic gland segmentation and classification. In International Conference on Medical Image Computing and ComputerAssisted Intervention, pages 115–123. Springer, 2012.
 [29] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi. Xnornet: Imagenet classification using binary convolutional neural networks. In European Conference on Computer Vision, pages 525–542, 2016.
 [30] S. E. A. Raza, L. Cheung, D. Epstein, S. Pelengaris, M. Khan, and N. M. Rajpoot. Mimonet: A multiinput multioutput convolutional neural network for cell segmentation in fluorescence microscopy images. In 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), pages 337–340, April 2017.
 [31] O. Ronneberger, P. Fischer, and T. Brox. UNet: Convolutional Networks for Biomedical Image Segmentation. Springer International Publishing, 2015.
 [32] H. R. Roth, L. Lu, A. Farag, H.C. Shin, J. Liu, E. B. Turkbey, and R. M. Summers. Deeporgan: Multilevel deep convolutional networks for automated pancreas segmentation. In International conference on medical image computing and computerassisted intervention, pages 556–564. Springer, 2015.
 [33] S. Saxena and J. Verbeek. Convolutional neural fabrics. In Advances in Neural Information Processing Systems, pages 4053–4061, 2016.
 [34] S. Shah, P. Ghosh, L. S. Davis, and T. Goldstein. Stacked unets: A nofrills approach to natural image segmentation. 2018.
 [35] K. Sirinukunwattana, D. R. Snead, and N. M. Rajpoot. A novel texture descriptor for detection of glandular structures in colon histology images. In Medical Imaging 2015: Digital Pathology, volume 9420, page 94200S. International Society for Optics and Photonics, 2015.
 [36] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1–9, 2015.
 [37] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2818–2826, 2016.
 [38] Z. Tang, X. Peng, S. Geng, L. Wu, S. Zhang, and D. Metaxas. Quantized densely connected unets for efficient landmark localization. In European Conference on Computer Vision (ECCV), 2018.
 [39] X. Xu, Q. Lu, L. Yang, S. Hu, D. Chen, Y. Hu, and Y. Shi. Quantization of fully convolutional networks for accurate biomedical image segmentation. Preprint at https://arxiv. org/abs/1803.04907, 2018.
 [40] M. Yang, K. Yu, C. Zhang, Z. Li, and K. Yang. Denseaspp for semantic segmentation in street scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3684–3692, 2018.
 [41] W. Yang, S. Li, W. Ouyang, H. Li, and X. Wang. Learning feature pyramids for human pose estimation. In The IEEE International Conference on Computer Vision (ICCV), volume 2, 2017.
 [42] F. Yu and V. Koltun. Multiscale context aggregation by dilated convolutions. CoRR, abs/1511.07122, 2015.
 [43] H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia. Pyramid scene parsing network. In IEEE Conference on Computer Vision and Pattern Recognition, pages 6230–6239, 2017.
 [44] A. Zhou, A. Yao, Y. Guo, L. Xu, and Y. Chen. Incremental network quantization: Towards lossless cnns with lowprecision weights. 2016.
 [45] Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, and J. Liang. Unet++: A nested unet architecture for medical image segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pages 3–11. Springer, 2018.