1 Introduction
According to the International Agency for Research on Cancer [1], breast cancer is the most frequently diagnosed cancer and the second most fatal disease among women worldwide. Breast masses are the most challenging breast abnormality to diagnose and the most feared, since about 90% of breast masses are cancerous. Although there are no effective methods of prevention, early intervention is critical for improving survival rates. Screening mammography is therefore employed to detect invasive malignant tumours while they are still too small to be palpable or to cause symptoms. However, the manual inspection of screening mammograms by radiologists is tedious, subjective and prone to error, which may lead to high false positive rates and overdiagnosis [2, 3, 4]. For these reasons, automatic and robust diagnosis tools for screening mammography are in high demand. Traditional mammographic computer-aided diagnosis (CAD) systems rely heavily on elaborate hand-crafted features [2].
Recently, leveraging insights from the successes of deep neural networks (deep learning) in computer vision tasks [5, 6, 7, 8, 9], deep learning based algorithms have been applied to mammograms and have achieved state-of-the-art results for mass detection, segmentation and classification. Mass detection aims to find the regions of interest (ROIs) where abnormalities may be located, and mass segmentation provides detailed morphological features with precise outlines within ROIs. Mass classification categorizes mammogram patches or full mammograms as benign or malignant, assisting radiologists in interpreting screenings locally (region-based mass classification) or globally (full mammogram classification). There are several recent examples in which deep learning has been applied to full mammogram classification [5, 7, 8]. However, these works ignore the importance of local ROIs, since in clinical practice radiologists are more likely to examine local regions than to label full mammograms. In terms of region-based mass classification, Convolutional Neural Networks (CNNs) have been applied as classifiers, showing that deep learning models significantly outperform traditional methods
[3, 10]. The work in [11] utilized a CNN as a regressor to extract hand-crafted features from both mammogram ROIs and radiologists' annotation masks, which were then fed to a Random Forest classifier. Although
[11] achieves better performance than direct CNN based algorithms, hand-crafted features are still required in the final design. More recently, [12] proposed a powerful CNN based image segmentation model referred to as the U-Net, which interlaces multi-resolution information by adding skips between encoding and decoding layers of the same spatial size and has been shown to perform well on medical images. However, when applied to mammography, the U-Net is limited by low signal-to-noise data, which causes labelling inconsistency and incompleteness. To address this, a novel architecture is designed in this paper. Building on the accurate pixel-level labelling algorithm presented in [4] and related breast mass classifiers [5, 7, 8, 3, 10, 11, 2], a Dual-path Conditional Residual Network (DualCoreNet) for mammography analysis is proposed, as shown in Figure 1. Firstly, a mass and context texture learner, called the Locality Preserving Learner (LPL), is built with stacks of convolutional blocks, achieving a mapping from ROIs to class labels. Secondly, a graph inference layer, called the Conditional Graph Learner (CGL), is employed to learn the input-mask correlation; the extracted segmentation features are further used to improve the final mass classification performance. By integrating these two learning paths, DualCoreNet achieves the best mammography segmentation and classification simultaneously, outperforming recent state-of-the-art models. The main contributions of this paper are the following: (i) to our knowledge, DualCoreNet is the first dual-path CNN-based mammogram analysis model that takes advantage of the segmented mask for mass classification; (ii) our model achieves state-of-the-art results on both mass segmentation and mass classification tasks on publicly available mammography datasets.
2 Methodology
The DualCoreNet takes mammogram ROIs as input and simultaneously outputs mass segmentation masks and ROI class labels. The pipeline of DualCoreNet is shown in Fig. 2. We first define the notation used throughout the paper and then introduce the details of the two proposed learning paths and the full DualCoreNet.
2.1 Notations
Given a mammogram lattice $\Omega$, let $x_i$ denote one of the training mammogram ROIs, $m_i$ the corresponding pixel-level annotation from radiologists and $y_i$ the ROI class label; the training set is then represented by $S = \{(x_i, m_i, y_i)\}_{i=1}^{N}$.
2.2 Locality Preserving Learner
Mass and context tissue textures are exploited as the major classification features in traditional mammographic CADs [2]. The LPL learns hierarchical and locally intrinsic texture features by nonlinearly mapping linear combinations of local information with stacks of depthwise separable convolutions, max-pooling and ReLU activations.
In order to prevent exploding and vanishing gradients, residual learning [13] is employed throughout the DualCoreNet, which maps the convolutional layer's output with residuals and a linear dimension-matching kernel as:
$$\mathbf{y} = \mathcal{F}(\mathbf{x}, \{W_i\}) + W_s\,\mathbf{x} \tag{1}$$
Depthwise separable convolutions, referred to as "SConv" in Fig. 2, are widely applied to map the network efficiently by factoring filters into a series of operations, as in [14]. Specifically, a depthwise separable convolution comprises a spatial convolution that operates independently on each feature map and a pointwise convolution that computes a linear projection of the spatial convolution's output. Inspired by this, residual learning paths with depthwise separable convolutions are built in the DualCoreNet to independently learn cross-channel and spatial correlations of hierarchical but local intrinsic features from both mammogram ROIs and segmentation masks. The loss associated with the LPL path is defined with categorical cross-entropy as:
$$L_{\mathrm{LPL}}(\theta_p) = -\sum_{i=1}^{N}\sum_{c} \mathbb{1}[y_i = c]\,\log p(c \mid x_i; \theta_p) \tag{2}$$
where $\mathbb{1}[\cdot]$ is the class indicator and $\theta_p$ is the parameter set of the LPL path.
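To make the SConv factorisation concrete, the following is a minimal NumPy sketch of a depthwise separable convolution and the residual unit of Eq. (1); the function names, shapes and the 1x1 dimension-matching kernel are illustrative, not the paper's implementation.

```python
import numpy as np

def depthwise_separable_conv(x, depthwise_k, pointwise_k):
    """Depthwise 3x3 convolution followed by a 1x1 pointwise projection
    ('SConv'), with 'same' zero padding.

    x:            (H, W, C_in) input feature map
    depthwise_k:  (3, 3, C_in) one spatial filter per input channel
    pointwise_k:  (C_in, C_out) linear projection across channels
    """
    H, W, C = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    spatial = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            patch = pad[i:i + 3, j:j + 3, :]           # (3, 3, C)
            spatial[i, j, :] = np.sum(patch * depthwise_k, axis=(0, 1))
    # pointwise: linear projection of the spatial convolution's output
    return spatial @ pointwise_k                        # (H, W, C_out)

def residual_sconv_block(x, depthwise_k, pointwise_k, match_k):
    """Residual unit in the spirit of Eq. (1): F(x) + W_s x, where
    match_k plays the role of the 1x1 dimension-matching kernel W_s."""
    return depthwise_separable_conv(x, depthwise_k, pointwise_k) + x @ match_k
```

Note the parameter saving that motivates the factorisation: a depthwise separable layer uses $3 \cdot 3 \cdot C_{in} + C_{in} C_{out}$ weights versus $3 \cdot 3 \cdot C_{in} C_{out}$ for a standard convolution.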
2.3 Conditional Graph Learner
In practice, radiologists usually analyse mass shape and boundary information to improve cancer inspection: the more irregular the shape, the more likely the mass is to be cancerous [15]. This observation suggests that extracting segmentation-friendly features should improve mass classification performance, and the CGL is proposed accordingly. Specifically, the CGL first employs a revised U-Net to map ROIs to geometric and spatial features, then a graphical inference layer to preserve pixel consistency under conditional restrictions, and finally stacks of separable convolutions to efficiently learn the mapping from hierarchical features to class labels.
Firstly, the U-Net makes use of standard convolutional layers and employs a combination of multi-resolution filters. Incorporating residual learning in the CGL, the $k$th layer output at pixel $j$ is formulated as:
$$x_j^{(k)} = \sigma_k\!\left(\mathcal{F}\big(x_j^{(k-1)}; w_k, s_k\big) + \mathcal{G}\big(x_j^{(k-1)}; \hat{w}_k, s_k\big)\right) \tag{3}$$
where $\mathcal{F}$ and $\mathcal{G}$ denote the convolution functions of the residuals and the dimension mapping respectively, $w_k$ and $\hat{w}_k$ represent the corresponding kernel sizes of the convolutional filters, $s_k$ stands for the kernel strides or max-pooling factors in the downsampling layers, and $\sigma_k$ for the activation functions.
The graphical inference layer applies conditional random fields (CRFs) as an additional CNN layer [4, 16]. The segmentation loss is defined as the weighted categorical cross-entropy of the residual learning U-Net and the graphical inference layer as follows:
$$L_{\mathrm{seg}}(\theta_g) = -\sum_{j \in \Omega} \log p_j(m_j \mid x; \theta_g) + \lambda \left(\sum_{(j,l) \in \mathcal{E}} \psi(m_j, m_l) + \log Z(x)\right) \tag{4}$$
where $p_j(\cdot \mid x; \theta_g)$ is the residual U-Net output probability distribution at position $j$ given the parameters $\theta_g$, $Z(x)$ is the partition function of the CRF, $\psi$ is the pairwise potential function, defined with the label compatibility, Gaussian kernels and corresponding weights for pixels $j$ and $l$ belonging to the CRF graph edges $\mathcal{E}$, and $\lambda$ is the trade-off factor, optimally chosen as 0.67 with a grid search in our experiments. The loss of the CGL path is again defined by the classification categorical cross-entropy as:
$$L_{\mathrm{CGL}}(\theta_g) = -\sum_{i=1}^{N}\sum_{c} \mathbb{1}[y_i = c]\,\log p(c \mid \hat{m}_i; \theta_g) \tag{5}$$
where $\hat{m}_i$ is the resulting soft mass mask and $\theta_g$ is the parameter set of the CGL path.
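As a rough numerical illustration of the trade-off in Eq. (4), the NumPy sketch below combines pixel-wise cross-entropy with $\lambda$ times a simplified Potts-style pairwise energy. The full model uses Gaussian-kernel CRF potentials; here they are replaced by a plain neighbour-disagreement count, so this is a simplified stand-in, not the paper's loss.

```python
import numpy as np

def weighted_seg_loss(probs, mask, lam=0.67):
    """Simplified sketch of a weighted segmentation loss: pixel-wise
    cross-entropy on the network output plus lam times a Potts-style
    pairwise energy penalising label disagreement between 4-connected
    neighbours (standing in for the full CRF term).

    probs: (H, W) predicted foreground probabilities in (0, 1)
    mask:  (H, W) binary ground-truth mask
    lam:   trade-off factor (0.67 in the paper's grid search)
    """
    eps = 1e-12
    ce = -np.mean(mask * np.log(probs + eps)
                  + (1 - mask) * np.log(1 - probs + eps))
    hard = (probs > 0.5).astype(float)
    # count label flips along the two 4-connectivity directions
    pair = (np.abs(np.diff(hard, axis=0)).sum()
            + np.abs(np.diff(hard, axis=1)).sum())
    return ce + lam * pair / hard.size
```

A confident, correct prediction yields a much smaller loss than its inverted counterpart, driven by the cross-entropy term; the pairwise term additionally discourages speckled masks.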
Finally, by integrating the CGL and LPL into a dual-path network, the overall categorical cross-entropy based loss of the DualCoreNet is defined as:
$$L(\theta_p, \theta_g) = -\sum_{i=1}^{N}\sum_{c} \mathbb{1}[y_i = c]\,\log p\big(c \mid x_i, \hat{m}_i; \theta_p, \theta_g\big) \tag{6}$$
3 Experiments
3.1 Datasets and ROIs selection
The DualCoreNet is evaluated on two public datasets: INbreast [17] and CBIS-DDSM [18]. INbreast is a full-field digital mammography (FFDM) dataset containing 116 pixel-level annotated masses in 107 mammograms, with a contrast resolution of 14 bits. CBIS-DDSM is a modernized subset of the Digital Database for Screening Mammography (DDSM) [19] that includes 1594 mass-containing digitized film screening mammograms.
Two manual ROI selection procedures are adopted for the training stage in this work. The first locates and extracts rectangular mass-containing bounding boxes, which are utilized by the CGL to explore boundary and shape features. The second selects mass-centred ROIs with proportional padding, covering a region several times the size of the mass bounding box; these are utilized by the LPL to learn mass and context tissue texture features. The selected ROIs are then augmented with horizontal and vertical flips. In terms of data division, the INbreast dataset is divided by patient into a training set and a test set with an 80%:20% split. For CBIS-DDSM, we adopted the pre-divided 1253 training and 354 test ROIs within the data.

3.2 Implementation Details
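The proportional-padding ROI selection described in Sec. 3.1 can be sketched as follows. The helper name and the padding factor of 2 are illustrative assumptions, since the exact ratio is not restated here.

```python
import numpy as np

def padded_roi(image, box, factor=2.0):
    """Crop a mass-centred ROI whose side lengths are `factor` times
    those of the tight mass bounding box, clipped to the image border.

    image: (H, W) mammogram array
    box:   (y0, x0, y1, x1) tight mass bounding box
    """
    y0, x0, y1, x1 = box
    cy, cx = (y0 + y1) / 2.0, (x0 + x1) / 2.0          # box centre
    hh, hw = factor * (y1 - y0) / 2.0, factor * (x1 - x0) / 2.0
    ys = max(0, int(round(cy - hh)))
    xs = max(0, int(round(cx - hw)))
    ye = min(image.shape[0], int(round(cy + hh)))
    xe = min(image.shape[1], int(round(cx + hw)))
    return image[ys:ye, xs:xe]
```

With `factor=1.0` this reduces to the tight bounding-box crop used by the CGL; larger factors include the surrounding tissue context for the LPL.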
As shown in Fig. 2, the LPL path is designed with stacks of separable convolutional layers with small kernels for fine texture feature learning. The first three blocks are equipped with increasing numbers of feature maps (128, 256, 728) and decreasing spatial sizes, while the subsequent eight blocks keep the same number of feature maps (728) and the same spatial shape.
The CGL path takes downsampled bounding-box ROIs as input, followed by four consecutive max-pooling and convolution blocks with 16, 32, 64 and 128 feature maps and correspondingly decreasing spatial sizes. The bottom latent features are then upsampled in reverse with convolution operations and skips that connect the channels of earlier layers. The multi-resolution mapping is followed by the CRF graphical inference layer, resulting in a sigmoid-activated upsampled segmentation mask. A relatively large kernel is chosen for the appended CNN blocks, aiming to better learn shape and boundary features.
After global average pooling on both paths and two fully connected layers of 2048 neurons each, the concatenated features of the dual paths are classified into one of the two mass categories (benign or malignant). Except for the separable convolutional blocks in the LPL path, which are initialized with ImageNet [20] pretrained weights for accelerated convergence and better network generalization, all other network layers were randomly initialized. To avoid overfitting, dropout layers were used. The DualCoreNet is optimized by stochastic gradient descent with the Adam update rule. To fully train the
DualCoreNet, the two component paths were first trained separately with their corresponding loss functions (2) and (5), and then trained further jointly with the loss (6).

Table 1. Mass segmentation performance comparison (Dice Index, %).

Methodology          Dataset     DI, %

Dhungel et al. [15]  INbreast    –
Zhu et al. [21]      INbreast    –
DualCoreNet          INbreast    93.66
Dhungel et al. [15]  CBIS-DDSM   –
Zhu et al. [21]      CBIS-DDSM   –
DualCoreNet          CBIS-DDSM   91.43
3.3 Results
In terms of mass segmentation, Dice coefficient evaluations are compared with related works in Table 1. DualCoreNet achieves the best performance, with 93.66% on INbreast and 91.43% on CBIS-DDSM. Overlapping contours of the radiologist's annotation (red lines) and DualCoreNet's segmentation results (green lines) are visualized in Figure 3. DualCoreNet produces contours that are close to the radiologist's annotations and smooth, even for relatively hard cases.
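For reference, the Dice Index (DI) reported in Table 1 is the standard overlap measure between a predicted and a ground-truth binary mask, which can be computed as:

```python
import numpy as np

def dice_index(pred, target):
    """Dice Index between two binary masks:
    DI = 2 |P intersect G| / (|P| + |G|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    # convention: two empty masks agree perfectly
    return 2.0 * inter / denom if denom else 1.0
```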
As for mass classification performance, the ROC curves and AUCs of the best models of the LPL path, the CGL path and the joined paths are shown in Figure 4. For INbreast, the dual-path model achieved the best performance with 0.93 AUC, 0.01 higher than the LPL path alone. With classification performance guaranteed, the dual-path model additionally contributes precise mass segmentation. For CBIS-DDSM, the dual-path model outperforms either the CGL or the LPL alone, achieving 0.85 AUC. Moreover, as shown in Table 2, DualCoreNet outperforms other state-of-the-art algorithms. The experimental results show that the representation learned by DualCoreNet is more robust for both breast mass segmentation and classification in mammography, which again supports our motivation for dual-path learning.
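The AUC values above can be reproduced from per-ROI malignancy scores with a rank-based estimator, equivalent to the area under the ROC curve; this is a generic sketch, not the paper's evaluation code.

```python
def roc_auc(scores, labels):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen malignant ROI (label 1) scores higher than a
    randomly chosen benign one (label 0), counting ties as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```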
4 Conclusions
In this paper, we proposed a novel DualCoreNet for improved mammogram image analysis. By integrating the conditional graph learner path and the locality preserving learner path, DualCoreNet jointly learns segmentation and classification in a simple but effective way. Thanks to the departure from the hand-crafted features of classical CADs, our model is more flexible in learning mammography-friendly representations. Furthermore, DualCoreNet performs better on the higher quality dataset (INbreast). Extensive experiments show that our method outperforms the state of the art on both breast mass segmentation and classification tasks in mammography.
References
 [1] P. Boyle, B. Levin et al., World cancer report 2008. IARC Press, International Agency for Research on Cancer, 2008.
 [2] C. Varela, S. Timp, and N. Karssemeijer, “Use of border information in the classification of mammographic masses,” Physics in medicine & biology, vol. 51, no. 2, p. 425, 2006.
 [3] J. Arevalo, F. A. González, R. Ramos-Pollán, J. L. Oliveira, and M. A. G. Lopez, "Representation learning for mammography mass lesion classification with convolutional neural networks," Computer Methods and Programs in Biomedicine, vol. 127, pp. 248–257, 2016.
 [4] H. Li, D. Chen, W. H. Nailon, M. E. Davies, and D. Laurenson, “Improved breast mass segmentation in mammograms with conditional residual unet,” in Image Analysis for Moving Organ, Breast, and Thoracic Images. Springer, 2018, pp. 81–89.
 [5] W. Zhu, Q. Lou, Y. S. Vang, and X. Xie, "Deep multi-instance networks with sparse label assignment for whole mammogram classification," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2017, pp. 603–611.

 [6] D. Chen, J. Lv, and Z. Yi, "Graph regularized restricted boltzmann machine," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 6, pp. 2651–2659, 2018.
 [7] G. Carneiro, J. Nascimento, and A. P. Bradley, "Automated analysis of unregistered multi-view mammograms with deep learning," IEEE Transactions on Medical Imaging, vol. 36, no. 11, pp. 2355–2365, 2017.
 [8] S. Shams, R. Platania, J. Zhang, J. Kim, and S.J. Park, “Deep generative breast cancer screening and diagnosis,” in International Conference on Medical Image Computing and ComputerAssisted Intervention. Springer, 2018, pp. 859–867.

 [9] D. Chen, J. Lv, and Z. Yi, "Unsupervised multi-manifold clustering by learning deep representation," in Workshops at the 31st AAAI Conference on Artificial Intelligence (AAAI), 2017, pp. 385–391.
 [10] T. Kooi, B. van Ginneken, N. Karssemeijer, and A. den Heeten, "Discriminating solitary cysts from soft tissue lesions in mammography using a pretrained deep convolutional neural network," Medical Physics, vol. 44, no. 3, pp. 1017–1027, 2017.

 [11] N. Dhungel, G. Carneiro, and A. P. Bradley, "The automated learning of deep features for breast mass classification from mammograms," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2016, pp. 106–114.
 [12] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.

 [13] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
 [14] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," arXiv preprint arXiv:1610.02357, 2017.
 [15] N. Dhungel, G. Carneiro, and A. P. Bradley, “Deep learning and structured prediction for the segmentation of mass in mammograms,” in International Conference on Medical Image Computing and ComputerAssisted Intervention. Springer, 2015, pp. 605–612.

 [16] S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. H. Torr, "Conditional random fields as recurrent neural networks," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1529–1537.
 [17] I. C. Moreira, I. Amaral, I. Domingues, A. Cardoso, M. J. Cardoso, and J. S. Cardoso, "Inbreast: toward a full-field digital mammographic database," Academic Radiology, vol. 19, no. 2, pp. 236–248, 2012.
 [18] R. S. Lee, F. Gimenez, A. Hoogi, and D. Rubin, “Curated breast imaging subset of ddsm,” The Cancer Imaging Archive, 2016.
 [19] M. Heath, K. Bowyer, D. Kopans, R. Moore, and W. P. Kegelmeyer, “The digital database for screening mammography,” in Proceedings of the 5th international workshop on digital mammography. Medical Physics Publishing, 2000, pp. 212–218.
 [20] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein et al., “Imagenet large scale visual recognition challenge,” International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015.
 [21] W. Zhu, X. Xiang, T. D. Tran, G. D. Hager, and X. Xie, “Adversarial deep structured nets for mass segmentation from mammograms,” in Biomedical Imaging (ISBI 2018), 2018 IEEE 15th International Symposium on. IEEE, 2018, pp. 847–850.