Classifying Mammographic Breast Density by Residual Learning

09/21/2018 ∙ by Jingxu Xu, et al. ∙ 0

Mammographic breast density, a parameter used to describe the proportion of breast tissue fibrosis, is widely adopted as an evaluation characteristic of the likelihood of breast cancer incidence. In this study, we present a radiomics approach based on residual learning for the classification of mammographic breast densities. Our method possesses several encouraging properties, such as being almost fully automatic and having large model capacity and flexibility. It can obtain outstanding classification results without the necessity of result compensation using mammographs taken from different views. The proposed method was instantiated with the INbreast dataset, and classification accuracies of 92.6% and 96.8% were obtained for the four BI-RADS (Breast Imaging and Reporting Data System) category task and the two BI-RADS category task, respectively. The superior performance achieved compared to the existing state-of-the-art methods, along with its encouraging properties, indicates that our method has great potential to be applied as a computer-aided diagnosis tool.




1 Introduction

Breast cancer is a major health threat[30, 5]. Its incidence has increased while its death rates have declined in all age groups over the past decades[5, 29]. This favorable trend of mortality reduction could be related to improvements in the treatment of breast cancer and the widespread adoption of breast cancer screening techniques, especially mammography[29], for early diagnosis.

Mammography is the most common and efficient method for breast cancer screening. Clinical studies have reported that, in addition to mammographic abnormalities (e.g., masses, calcifications, architectural distortion, and asymmetries), the change of breast density is another important indicator of early breast cancer development[22, 23, 25]. However, inspecting the large quantities of generated mammographs is tedious and subjective for radiologists, and it suffers from intra- and inter-radiologist reproducibility problems[11, 2].

The very first research on the importance of breast density began with Wolfe et al., who demonstrated the relationship between mammographic parenchymal patterns and the risk of developing breast cancer[33]. Following this, Boyd et al. showed a similar correlation between mammographic densities and breast cancer risks[4]. Inspired by these discoveries, a number of studies on breast density classification emerged. The American College of Radiology's (ACR) Breast Imaging Reporting and Data System (BI-RADS) groups breasts into four categories according to density, with BI-RADS I referring to the lowest densities and BI-RADS IV referring to the highest (BI-RADS I: fat breast (0-25%), BI-RADS II: fat with some fibroglandular tissue (26-50%), BI-RADS III: heterogeneously dense breast (51-75%), and BI-RADS IV: extremely dense breast (76-100%)). Women with extremely dense breasts (BI-RADS IV) have a 2-6 times higher risk of developing breast cancer than women with fatty breasts (BI-RADS I)[11, 14]. Therefore, breast density plays an important role in the early detection of breast cancer, and there is an urgent need for an automatic system that can accurately classify mammographic breast densities.

Initially, many studies measured breast density by quantifying the gray-level histograms of mammographs[15, 19, 36]. Subsequent studies found that it might be insufficient to classify breasts into the corresponding BI-RADS categories only based on the information of histograms. For example, the study by Oliver et al. illustrated that the four different categories are quite similar with regard to both the mean gray-level values and the shapes of the histogram.

To address this issue, researchers turned to traditional feature engineering methods to handle the breast density classification task. Bovis et al. achieved a 71.4% accuracy by using a classifier paradigm where a combination of the Fourier and discrete wavelet transforms was investigated on the first and second-order statistical features[3]. Oliver et al. extracted morphological and texture features from breast tissue regions which were segmented using a fuzzy c-means clustering technique, and these features were then treated as inputs for the breast density classifier[22]. Jensen et al. adopted the same breast tissue segmentation method but extracted the first and second-order statistical features as well as morphological features for the Mammographic Image Analysis Society (MIAS) dataset[12]. These two studies achieved 86.0% and 91.4% breast density classification accuracies, respectively. Chen et al. evaluated different local features using texture representation algorithms. After that, they modelled mammographic tissue patterns based on the local tissue appearances in mammographs[7]. The work of Indrajeet et al. was based on ROIs manually extracted from the images. Multi-resolution texture descriptors were then extracted from 16 sub-band images obtained from second-level decomposition through the wavelet packet transform[16]. It can be concluded from these studies that the general procedure of breast density classification includes segmenting the breast area, designing and extracting breast density-related features, and inputting these features into different classifiers to predict the density categories. One major drawback of this procedure is that prior expert knowledge of the data and a hand-crafting process are necessary to calculate the quantitative features.

On the other hand, the development of deep learning offers a promising solution: using artificial neural networks to automatically extract features for medical image analysis[18, 27, 28, 35]. The Convolutional Neural Network (CNN) is one type of these networks that has shown excellent performance in image classification. CNNs can learn highly nonlinear relationships between the inputs and outputs without human intervention. A number of studies have applied deep learning to mammography-related tasks, such as lesion detection, differentiation of benign and malignant masses, microcalcification recognition, and their combinations[13, 31, 32, 34, 6]. With respect to breast density classification, Mohamed et al. designed an eight-layer CNN to group mammographs into two categories (scattered density and heterogeneously dense) as a simplification of the complicated four BI-RADS category task[20]. Similarly, Ahn et al. designed a CNN architecture to learn the image characteristics from mammographs and classify the corresponding breasts into dense and fatty tissues[1]. From these pioneering studies, we find that few directly classified mammographs into the four BI-RADS categories. One possible reason is that the capacity of the CNN models applied was limited: their shallow network structures prevented them from obtaining enough meaningful and abstract features to accomplish this difficult task.

Radiomics is an emerging method in recent years that works by extracting large amounts of advanced quantitative features from medical images and quantifying the predictive or prognosis relationships between images and medical outcomes according to the features[17, 26]. Nevertheless, the advantages of CNNs have not been fully integrated with the radiomics approach to solve the problems encountered during classifying mammographic breast densities into the four BI-RADS categories. Therefore, in this paper, we propose a CNN-based (residual learning)[10] radiomics method for the automatic extraction of high-throughput features from mammographs and the subsequent classification of the breast densities. Specifically, our contributions are threefold.

1. Our work represents the first attempt to apply a deep CNN as a radiomics approach to automatically extract high-throughput, high-level, and highly abstract features from mammographs, which serve as the basis of an accurate classification model of mammographic breast densities.

2. Beyond the two-category classification studied in existing work, our proposed method can accurately classify mammographic breast densities strictly following the four BI-RADS categories. Moreover, our network possesses the capacity to learn deep features for accurate BI-RADS category classification from a single mammographic image. Result compensation from different views (such as the craniocaudal view and the mediolateral oblique view) is not required.

3. Our method could be treated as a baseline of mammographic breast density classification for clinical applications. Due to the large capacity of residual CNNs, our method could be easily adapted to new datasets with new experimental parameters through parameter fine-tuning.

The rest of this paper is organized as follows. Section 2 gives a detailed description of the dataset and an overview of the CNN methods based on residual learning and radiomics. In Section 3, the proposed CNN architecture and the training details, including parameter settings and implementation details, are presented. The experimental results are introduced in Section 4, followed by a discussion and conclusion in Sections 5 and 6.

2 Materials And Methods

2.1 Dataset

In this study, we evaluated our methods on a publicly available dataset, the INbreast dataset[21], which contains 115 cases (410 images). Among the 115 cases, 90 cases are from women with both breasts affected (4 images per case) and 25 cases are from mastectomy patients (2 images per case). Two views of each breast were recorded: a craniocaudal (CC) view, which is a top-to-bottom view, and a mediolateral oblique (MLO) view, which is a side view. The dataset provides a breast density assessment of each mammograph with the corresponding BI-RADS category label, which makes it suitable for our study. The mammographs were acquired on x-ray films and saved in the standard Digital Imaging and Communications in Medicine (DICOM) format. The image matrix comes in one of two sizes. Among the 409 images (1 is missing the label), 136 are classified as BI-RADS I, 146 as BI-RADS II, 99 as BI-RADS III, and 28 as BI-RADS IV (example images of the four categories are shown in Fig. 1).

Figure 1: Example of mammographs of the different BI-RADS categories. (a) BI-RADS I, (b) BI-RADS II, (c) BI-RADS III, and (d) BI-RADS IV.

2.2 Data Preprocessing

As introduced in the dataset section, we have a total of 409 images. Training CNN models requires a large amount of data, so data augmentation is a critical step. We also observed a data imbalance between the four BI-RADS categories that needs to be dealt with. We first performed a four-fold rotation augmentation for the BI-RADS IV images. After that, we randomly separated all the images into three groups: 349 for training, 77 for validation, and 95 for independent testing. Finally, we further processed the training and validation sets through rotation by eight random angles, horizontal flip, and vertical flip. Therefore, we have a training dataset of 11168 ($349 \times 32$) images and a validation dataset of 2464 ($77 \times 32$) images.
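The counts quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the "four-fold rotation" adds four rotated copies of each BI-RADS IV image to the originals (the reading under which the split sizes add up), and the helper names `hflip` and `vflip` and the toy image are hypothetical. Rotation by arbitrary angles is omitted.

```python
# Sanity-check the dataset sizes reported in the text.
labelled = 136 + 146 + 99 + 28      # labelled images per BI-RADS category
extra_iv = 28 * 4                   # ASSUMPTION: four rotated copies per BI-RADS IV image
total = labelled + extra_iv
train, val, test = 349, 77, 95
copies_per_image = 11168 // train   # augmentation factor implied by the reported training size


def hflip(image):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def vflip(image):
    """Mirror a 2-D image (list of rows) top to bottom."""
    return image[::-1]


toy = [[1, 2],
       [3, 4]]
print(total, train + val + test)                # the two totals should agree
print(copies_per_image, val * copies_per_image)
print(hflip(toy), vflip(toy))
```

Under this reading, every training and validation image contributes the same number of augmented copies, which is why both reported set sizes are exact multiples of the split sizes.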

Another problem that needs to be resolved before network training is the large size of each image. To reduce the computational load and memory usage, we downsampled the original mammography images, i.e., resized them to a smaller fixed size.

2.3 CNN-based residual learning for mammograph classification

CNNs are a class of deep learning methods that learn high-level features to tackle computer vision problems such as classification, detection, and segmentation. Vanishing gradients are a major problem for CNNs with many layers. Thanks to the invention of the residual network, CNNs can now go substantially deeper than before. A detailed description of the residual learning block is presented here. Residual learning was introduced to solve the degradation problem that appears after stacking many convolution layers. We use $H(x)$ to denote the desired nonlinear output feature map of the input feature map $x$ after applying the stacked layers. Now, we let the stacked nonlinear layers fit another mapping:

$$F(x) := H(x) - x,$$

and $H(x)$ is recast to

$$H(x) = F(x) + x.$$

The formulation of $F(x) + x$ can be realized by a feedforward CNN with shortcut connections or skip connections (Fig. 2). In this case, no extra parameters or computational burden are added to the training process. Due to the propagation of gradients through the shortcut connections, it is easier to optimize the residual mapping $F(x)$ than to optimize the original mapping $H(x)$. Therefore, by adding residual learning blocks, deeper networks can be designed to extract richer information from images for our classification tasks.

Figure 2: Residual learning block.
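The residual block described above adds the input back to the output of the stacked layers. A minimal sketch of that idea follows. It is an assumption-laden toy: it uses small fully connected layers on plain Python lists instead of convolutions, and the names `dense` and `residual_block` are hypothetical rather than taken from the study.

```python
def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, a) for a in values]


def dense(values, weights, bias):
    """Affine map: one output per weight row."""
    return [sum(w * a for w, a in zip(row, values)) + b
            for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """Return F(x) + x, where F is two stacked layers with a ReLU in between."""
    f = dense(relu(dense(x, w1, b1)), w2, b2)   # residual branch F(x)
    return [fi + xi for fi, xi in zip(f, x)]    # identity shortcut adds x back


# With all-zero weights the residual branch outputs zero,
# so the block simply passes its input through.
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.5, 2.0]
y = residual_block(x, zeros, [0.0, 0.0], zeros, [0.0, 0.0])
print(y)
```

The zero-weight case illustrates why deep residual stacks are easier to optimize: a block can fall back to the identity mapping without having to learn it.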

Next, we will describe in detail the CNN method used for image classification. After preprocessing, the training, validation, and test datasets went through the training and the test stages respectively, as shown in Fig. 3. CNNs are trained by feedforward and backpropagation processes. The feedforward process extracts and selects the features and calculates the loss, whereas the backpropagation process optimizes the network parameters by gradient descent of the loss function.

Figure 3: Schematic diagram of residual learning for classification.

The feedforward process of CNNs can be described by the following steps. First, the images pass through the convolution layers:

$$x^{l} = \sigma\left(W^{l} * x^{l-1} + b^{l}\right),$$

where $l$ denotes the layer number, $\sigma$ denotes the nonlinear activation (the rectified linear unit (ReLU) was used in this study), $W^{l}$ and $b^{l}$ are the weights and bias, $*$ denotes the convolution operation, and $x^{l}$ denotes the feature maps, with $x^{0}$ denoting the input. Some convolution layers are followed by a downsampling procedure (average pooling layers):

$$x^{l} = \mathrm{avgpool}\left(x^{l-1}\right).$$

For our classification task, a softmax activation was included after three fully connected layers:

$$p_{i} = \frac{e^{z_{i}}}{\sum_{j=1}^{I} e^{z_{j}}},$$

where $i = 1, \dots, I$ and $z$ is the output of the last layer. The final prediction from the network can therefore be summarized as

$$P = C(X; \Theta),$$

where $\Theta$ consists of all the network parameters to be estimated and optimized, $C$ denotes the overall forward-pass network, and $X$ refers to the input.
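The softmax step described above turns the last layer's outputs into class probabilities. A minimal sketch, assuming four outputs (one per BI-RADS category) and made-up example values:

```python
import math


def softmax(logits):
    """Convert last-layer outputs into class probabilities."""
    shift = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.5, -1.0])             # hypothetical outputs, one per category
print(probs, sum(probs))
```

Subtracting the maximum before exponentiating leaves the result unchanged mathematically but avoids overflow for large outputs.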

On the other hand, the CNN backward process is the backward propagation of loss gradients, which optimizes the network parameters $\Theta$ by addressing the following cross-entropy loss minimization problem:

$$\hat{\Theta} = \arg\min_{\Theta} \; -\frac{1}{K} \sum_{k=1}^{K} \sum_{i=1}^{I} Y_{k,i} \log C(X_{k}; \Theta)_{i},$$

where $I$ and $K$ are the total numbers of classification categories and training samples, respectively, and $Y$ is the manually labelled ground truth provided by the INbreast dataset.
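To make the cross-entropy objective concrete, the sketch below evaluates it for two made-up samples with one-hot labels. It assumes the loss is averaged over samples, a constant factor that does not change the minimizer; the function name and example values are illustrative only.

```python
import math


def cross_entropy(probs, labels):
    """Mean cross-entropy over K samples with I classes each.

    probs[k][i] is the predicted probability of class i for sample k,
    labels[k][i] is the one-hot ground truth.
    """
    total = 0.0
    for p_row, y_row in zip(probs, labels):
        # Only the true class contributes, which also avoids log(0) elsewhere.
        total -= sum(y * math.log(p) for p, y in zip(p_row, y_row) if y > 0)
    return total / len(probs)


predictions = [[0.70, 0.10, 0.10, 0.10],
               [0.25, 0.25, 0.25, 0.25]]
truth = [[1, 0, 0, 0],
         [0, 0, 1, 0]]
loss = cross_entropy(predictions, truth)
print(loss)
```

A confident correct prediction contributes little to the loss, while a uniform guess over four classes contributes log(4).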

After the training phase, a classification model is obtained with the trained parameters $\hat{\Theta}$. For new independent samples $X_{\mathrm{new}}$, we can generate the probability distribution of each case by calculating

$$P_{\mathrm{new}} = C(X_{\mathrm{new}}; \hat{\Theta}).$$

Then the BI-RADS categories of the mammograph images can be determined accordingly.
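One common way to turn a predicted probability distribution into a category is to take the most probable class. The text does not spell out the decision rule, so the argmax below is an assumption, and the probabilities are made-up values.

```python
BI_RADS = ["I", "II", "III", "IV"]


def predict_category(probs):
    """Pick the BI-RADS category with the highest predicted probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return BI_RADS[best]


category = predict_category([0.05, 0.15, 0.70, 0.10])
print(category)
```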

3 Experiments

3.1 CNN architecture

Figure 4: The CNN architecture.

For our classification task, we applied a 70 weight-layer CNN model. As shown in Fig. 4, the model can be divided into 3 stages, and each stage has 7 residual learning blocks (only 3 residual learning blocks are shown in Fig. 4). In total, the model has 70 weight layers (67 convolution layers and 3 fully connected layers), without counting the average pooling and batch normalization layers. All convolution kernels share the same size, and the numbers of convolution kernels for the three residual stages were set to 64, 128, and 256. Moreover, the ReLU activation function was applied after each convolution. The average pooling layers share the same pooling size and stride. For the last three fully connected layers, the numbers of neurons were set to 1024, 512, and 4, respectively. The first two fully connected layers used the ReLU activation function, while the last one used softmax instead.

Different layers of the network mean we are extracting different levels of abstract information from the input samples. To test the sensitivity of our classification model to the CNN depth, another two CNN configurations were also evaluated, the 36 and the 48 weight-layer CNN models. These two models also have 3 stages but have fewer convolutional layers in each stage. The different configurations were compared to demonstrate the importance of high-level features on the final classification performance.

3.2 Parameter settings and implementation details

We used Keras (a deep learning framework with TensorFlow as the backend) to implement our CNN networks for the breast density classification task. TensorBoard was adopted to monitor the entire training process, including the evolution of the accuracy and loss. The network training was implemented on a Dell-7910 workstation equipped with two E5-2640v4 Intel Haswells, an NVIDIA TITAN XP GPU, and 64 GB of memory. Adam was used for training, with a batch size of 16, a maximal number of iterations of 3200, and an initial learning rate of 0.0001. Random values drawn from the uniform distribution were used for the weight initialization, and zero was used for the bias initialization.
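For readers unfamiliar with the optimizer named above, the following sketch shows a single Adam update for one scalar parameter using the stated learning rate of 0.0001. The decay rates and epsilon are the commonly used defaults and are assumptions here, since the text does not report them; the function is a hand-written illustration, not the Keras implementation.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter; returns (theta, m, v)."""
    m = beta1 * m + (1.0 - beta1) * grad            # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad     # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                  # bias correction, step count t >= 1
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


theta, m, v = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(theta)   # the first step moves the parameter by roughly one learning rate
```

Because the update is normalized by the gradient magnitude, the step size is governed mainly by the learning rate rather than by the raw gradient scale.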

4 Results

4.1 Training convergence property

Minimizing the cross-entropy loss is the target of the network parameter optimization, and increasing classification accuracy reflects the improved capability of a classification model to differentiate the categories. Therefore, to monitor the convergence of our network during the training stage, we plotted the loss and accuracy curves of both the training dataset and the validation dataset with respect to the iterations (Fig. 5). These curves show the detailed learning procedure of the network. Our loss fluctuated stably around zero after 80 epochs (Fig. 5), which indicates that the residual network training converged gradually. The small fluctuations might be caused by the differences between the samples. A similar pattern can be observed in the accuracy curves, with both the training and validation curves showing small fluctuations around an accuracy of 1 after 80 epochs (Fig. 5). Once the network was trained, it could be used to obtain the classification predictions of new independent samples.

Figure 5: The plot of accuracy and loss in the training stage.

4.2 Classification performance of different network configurations

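Table I: Classification accuracies of the three network configurations for the four BI-RADS category task.

| Models | 36L | 48L | 70L |
| --- | --- | --- | --- |
| BI-RADS I | 92.00% | 88.00% | 96.00% |
| BI-RADS II | 88.46% | 100.00% | 96.15% |
| BI-RADS III | 90.48% | 71.20% | 95.24% |
| BI-RADS IV | 73.91% | 73.91% | 82.61% |
| ALL(accuracy) | 86.32% | 85.26% | 92.63% |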

As explained in Section 3.1, in order to test whether our network is sensitive to the depth of the residual network, we compared the classification results of our 70 weight-layer CNN model to those of the 36 and 48 weight-layer CNN models. Table I summarizes the classification accuracies of the three network configurations. The 36 and 48 weight-layer networks have similar overall classification accuracies, whereas the 70 weight-layer network has a markedly higher accuracy. CNN models with different depths can learn features of different hierarchies. We believe that the 70 weight-layer CNN model learned higher-level features, which led to the improved performance compared to the 36 and 48 weight-layer models. One phenomenon worth noting is that all three networks showed much lower classification accuracies for the BI-RADS IV category, which might be caused by the data imbalance, as only 28/410 original images are classified as BI-RADS IV in the INbreast dataset.
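The relationship between the per-category and overall accuracies in Table I can be illustrated with a short calculation. Note that the per-category test counts are not reported in the text: the values below are assumptions inferred from the 70 weight-layer percentages and the 95-image test set, chosen because they reproduce the reported figures.

```python
# ASSUMPTION: per-category counts inferred from the reported percentages, not given in the text.
correct = {"I": 24, "II": 25, "III": 20, "IV": 19}
total = {"I": 25, "II": 26, "III": 21, "IV": 23}

per_class = {c: 100.0 * correct[c] / total[c] for c in total}
overall = 100.0 * sum(correct.values()) / sum(total.values())

for c, acc in per_class.items():
    print(f"BI-RADS {c}: {acc:.2f}%")
print(f"ALL: {overall:.2f}%")
```

The overall accuracy is the sample-weighted mean of the per-category accuracies, so a weak category with many test images pulls it down more than a weak category with few.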

4.3 Classification according to two categories vs four categories

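Table II: Classification accuracies of the three network configurations for the two-category task.

| Models | 36L | 48L | 70L |
| --- | --- | --- | --- |
| Scattered density | 94.12% | 98.04% | 100.00% |
| Heterogeneously dense | 97.73% | 86.36% | 95.35% |
| ALL(accuracy) | 95.79% | 92.63% | 96.84% |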

Many studies simplified the problem from the original four-category classification to a two-category classification. In clinical applications, it is more challenging for radiologists to classify breasts into the four BI-RADS categories correctly because the visual features of breast tissue are difficult to discern between the four categories. Therefore, some studies treated BI-RADS I and BI-RADS II as one scattered density category and BI-RADS III and BI-RADS IV as one heterogeneously dense category, in compliance with the clinical requirements. To handle this dichotomous classification, we made small changes to the original residual network. The results are shown in Table II. All three of our residual network configurations showed good dichotomous classification performance, especially the 70 weight-layer residual network, which reached the highest overall classification accuracy of 96.84%.
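The grouping of the four BI-RADS categories into two classes described above amounts to a simple label mapping. The sketch below uses hypothetical names (`TWO_CLASS`, `to_two_class`) and a made-up label list.

```python
# Mapping from the text: I and II -> scattered density, III and IV -> heterogeneously dense.
TWO_CLASS = {
    "I": "scattered density",
    "II": "scattered density",
    "III": "heterogeneously dense",
    "IV": "heterogeneously dense",
}


def to_two_class(labels):
    """Convert four-category BI-RADS labels to the two-category scheme."""
    return [TWO_CLASS[label] for label in labels]


merged = to_two_class(["I", "IV", "II", "III"])
print(merged)
```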

4.4 Comparison with state-of-the-art methods

In order to further evaluate the proposed method, we compared it to two reported neural network based methods, the eight-layer convolutional neural network[20] and the high-throughput-derived multilayer visual representations (V1-like, HT-L2 and HT-L3)[9]. For the first method, the authors explored an eight-layer CNN to classify breasts between scattered density and heterogeneously dense. We applied some non-technical changes to make it possible to compare with our four-class task. For the second method, the feature extractors (V1-like, HT-L2 and HT-L3) were first described by Cox et al. and Pinto et al. for face recognition[8, 24]. Fonseca et al. applied and evaluated the performance of these feature extractors on classifying mammographs into the four ACR composition categories[9]. We made a comprehensive comparison of the different methods considering both the two-category and four-category classification problems.

From Table III and Table IV, it can be concluded that for both the two-category problem and the four-category problem, our proposed method showed the highest overall classification accuracy. One important reason could be that our network is deeper and can extract more abstract features, which is very important for the accurate classification of the different BI-RADS categories.

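Table III: Comparison of the different methods on the two-category task.

| Models/Data | Scattered density | Heterogeneously dense | ALL(accuracy) |
| --- | --- | --- | --- |
| V1-like | 94.13% | 81.82% | 88.42% |
| HT-L2 | 96.08% | 75.00% | 86.32% |
| HT-L3 | 94.13% | 65.10% | 81.05% |
| 8-CNN | 96.08% | 79.55% | 88.42% |
| OUR | 100.00% | 95.35% | 96.84% |

Table IV: Comparison of the different methods on the four BI-RADS category task.

| Models/Data | BI-RADS I | BI-RADS II | BI-RADS III | BI-RADS IV | ALL(accuracy) |
| --- | --- | --- | --- | --- | --- |
| V1-like | 72.00% | 76.92% | 90.48% | 56.62% | 73.68% |
| HT-L2 | 68.00% | 76.92% | 76.20% | 60.87% | 70.53% |
| HT-L3 | 64.00% | 76.92% | 95.24% | 52.17% | 71.58% |
| 8-CNN | 76.00% | 88.46% | 85.71% | 65.23% | 78.95% |
| OUR | 96.00% | 96.15% | 95.24% | 82.61% | 92.63% |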

5 Discussion

Traditional radiomics methods extract features based on manual observation and operation, including manual design, extraction and selection. Compared to the traditional feature engineering approach, deep convolutional networks with residual learning can automatically extract high-order, high-abstraction, and subtle features from mammograms that are not easily observable to human eyes, which enables accurate discrimination of the four BI-RADS categories. Moreover, by working with the whole original images, the classification model has access to all the image-relevant information and elevated performance could be expected. With the proposed method, an overall accuracy of 92.63% for the four BI-RADS category classification task and an accuracy of 96.84% for the two BI-RADS category classification task were obtained. Both were higher than the literature reported values, where only relatively shallow networks were applied.

An exam of breast cancer screening by mammography generally comes with CC and MLO views for a single breast. Multi-view models which make a classification decision by considering the different views have been reported. However, to accommodate the information from different views into the final prediction, different model parameter sets need to be trained accordingly, which leads to significantly increased computational burden and decreased testing speed. On the other hand, our proposed model has already shown excellent performance for the mammographic density classification task without considering the relationships between the different views. Therefore, we could conclude that the large capacity of our model enables the extraction of deep enough features for accurate BI-RADS category classifications of breasts which avoids the necessity of multi-view compensation.

Different imaging systems or experimental settings generate images of different standards. A trained CNN can only properly handle the domain-specific images. Though including different types of images into the training process can help build a more robust CNN model, it is not realistic to collect a dataset which considers all the different possibilities. Thanks to the large capacity of CNNs, our classification model could be easily extended depending on the application situations. If the dataset to be processed is in a similar domain as the original dataset, the trained CNN model could be used directly. However, if the new dataset is in a very different domain from the original dataset, fine-tuning of the trained CNN is required before it could be successfully applied. Compared with training from scratch, fine-tuning of CNNs requires much fewer samples and the training process is significantly faster.

Our residual learning-based CNN model could serve as a baseline for mammographic breast density classification. In the future, we expect to collect more data, especially for the BI-RADS IV category, to train a more powerful CNN model. We also plan to test the fine-tuning performance of the baseline model by using datasets that come from different systems or different experimental settings. Finally, we have already collected a number of clinical samples, and we will apply our method to these samples to investigate the domain transfer behavior of the established model. We will make our code and trained models publicly available once our manuscript is accepted, to foster research in the field.

6 Conclusion

In this study, we have investigated the use of a radiomics-based method through residual learning for mammographic breast density classification. To the best of our knowledge, it is the first attempt that applies residual learning as a radiomics approach to extract high-throughput features from mammographs and classify the breasts accordingly. The superior classification accuracies achieved demonstrate its feasibility. Another important advantage of the proposed method is that the classification model is trained end-to-end. Sophisticated pre-processing of the mammographic images such as segmentation of the breast tissues is not required. This allows the proposed method to be automatic and almost no human intervention is needed. In addition, our method has an appealing attribute that it is readily extendable to different experimental settings and application situations. All of these encouraging properties make it a good candidate algorithm for CAD systems.


  • [1] C. K. Ahn, C. Heo, H. Jin, and J. H. Kim. A novel deep learning-based approach to high-accuracy breast density estimation in digital mammography. In Society of Photo-Optical Instrumentation Engineers, page 101342O, 2017.
  • [2] W. A. Berg, C. Campassi, P. Langenberg, and M. J. Sexton. Breast imaging reporting and data system: inter- and intraobserver variability in feature analysis and final assessment. Ajr American Journal of Roentgenology, 174(6):1769–77, 2000.
  • [3] K. Bovis and S. Singh. Classification of mammographic breast density using a combined classifier paradigm. International Workshop on Digital Mammography, pages 177–180, 2002.
  • [4] N. F. Boyd, J. W. Byng, R. A. Jong, E. K. Fishell, L. E. Little, A. B. Miller, G. A. Lockwood, D. L. Tritchler, and M. J. Yaffe. Quantitative classification of mammographic densities and breast cancer risk: results from the canadian national breast screening study. Journal of the National Cancer Institute, 87(9):670–675, 1995.
  • [5] F. Bray, P. Mccarron, and D. M. Parkin. The changing global patterns of female breast cancer incidence and mortality. Breast Cancer Research, 6(6):229–39, 2004.
  • [6] G. Carneiro, J. Nascimento, and A. P. Bradley. Automated analysis of unregistered multi-view mammograms with deep learning. IEEE Transactions on Medical Imaging, PP(99):1–1, 2017.
  • [7] Z. Chen, E. Denton, and R. Zwiggelaar. Local feature based mammographic tissue pattern modelling and breast density classification. In 2011 4th international conference on biomedical engineering and informatics, pages 351–355, 2011.
  • [8] D. Cox and N. Pinto. Beyond simple features: A large-scale feature search approach to unconstrained face recognition. In IEEE International Conference on Automatic Face & Gesture Recognition and Workshops, pages 8–15, 2011.
  • [9] P. Fonseca, J. Ferrer, J. Pinto, and B. Castaneda. Automatic breast density classification using a convolutional neural network architecture search procedure. In Medical Imaging 2015: Computer-Aided Diagnosis, 2015.
  • [10] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
  • [11] C. W. Huo, G. L. Chew, K. L. Britt, W. V. Ingman, M. A. Henderson, J. L. Hopper, and E. W. Thompson. Mammographic density a review on the current understanding of its association with breast cancer. Breast Cancer Research & Treatment, 144(3):479–502, 2014.
  • [12] R. Jensen, Q. Shen, and R. Zwiggelaar. Fuzzy-rough approaches for mammographic risk analysis. Intelligent Data Analysis, 14(2):225–244, 2010.
  • [13] C. Jin, H. Cai, J. Wang, L. Li, W. Tan, and Y. Xi. Discrimination of breast cancer with microcalcifications on mammography by deep learning. Scientific Reports, 6:27327, 2016.
  • [14] M. Kallenberg, K. Petersen, M. Nielsen, A. Ng, P. Diao, C. Igel, C. Vachon, K. Holland, N. Karssemeijer, and M. Lillholm. Unsupervised deep learning applied to breast density segmentation and mammographic risk scoring. IEEE Trans Med Imaging, 35(5):1322–1331, 2016.
  • [15] N. Karssemeijer. Automated classification of parenchymal patterns in mammograms. Physics in Medicine & Biology, 43(2):365, 1998.
  • [16] I. Kumar, H. S. Bhadauria, and J. Virmani. Wavelet packet texture descriptors based four-class BIRADS breast tissue density classification. Procedia Computer Science, 70:76–84, 2015.
  • [17] P. Lambin et al. Radiomics: Extracting more information from medical images using advanced feature analysis. European Journal of Cancer, 48(4):441–6, 2012.
  • [18] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. W. M. van der Laak, B. van Ginneken, and C. I. Sánchez. A survey on deep learning in medical image analysis. Medical Image Analysis, 42(9):60–88, 2017.
  • [19] K. E. Martin, M. A. Helvie, C. Zhou, M. A. Roubidoux, J. E. Bailey, C. Paramagul, C. E. Blane, K. A. Klein, S. S. Sonnad, and H. P. Chan. Mammographic density measured with quantitative computer-aided method: comparison with radiologists’ estimates and BI-RADS categories. Radiology, 240(3):656–65, 2006.
  • [20] A. A. Mohamed, W. A. Berg, H. Peng, Y. Luo, R. C. Jankowitz, and S. Wu. A deep learning method for classifying mammographic breast density categories. Medical Physics, 45(1), 2017.
  • [21] I. C. Moreira, I. Amaral, I. Domingues, A. Cardoso, M. J. Cardoso, and J. S. Cardoso. INbreast: toward a full-field digital mammographic database. Academic Radiology, 19(2):236–248, 2012.
  • [22] A. Oliver, J. Freixenet, R. Marti, J. Pont, E. Perez, E. R. E. Denton, and R. Zwiggelaar. A novel breast tissue density classification methodology. IEEE Transactions on Information Technology in Biomedicine A Publication of the IEEE Engineering in Medicine & Biology Society, 12(1):55, 2008.
  • [23] A. Oliver, M. Tortajada, X. Lladó, J. Freixenet, S. Ganau, L. Tortajada, M. Vilagran, M. Sentís, and R. Martí. Breast density analysis using an automatic density segmentation algorithm. Journal of Digital Imaging, 28(5):1–9, 2015.
  • [24] N. Pinto, D. Doukhan, J. J. DiCarlo, and D. D. Cox. A high-throughput screening approach to discovering good forms of biologically inspired visual representation. PLOS Computational Biology, 5:1–12, 11 2009.
  • [25] A. Rampun, B. Scotney, P. Morrow, H. Wang, and J. Winder. Breast density classification using local quinary patterns with various neighbourhood topologies. 4(1):14, 2018.
  • [26] R. J. Gillies, P. E. Kinahan, and H. Hricak. Radiomics: Images are more than pictures, they are data. Radiology, 278(2):563, 2016.
  • [27] D. Shen, G. Wu, and H. I. Suk. Deep learning in medical image analysis. Annual Review of Biomedical Engineering, 19(1):221–248, 2017.
  • [28] H. C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Transactions on Medical Imaging, 35(5):1285, 2016.
  • [29] E. A. Sickles. Breast cancer screening outcomes in women ages 40-49: clinical experience with service screening using modern mammography. Jnci Monographs, 22(22):99–104, 1997.
  • [30] R. Siegel et al. Cancer facts&figures 2017. American Cancer Society Availables:
  • [31] S. Suzuki, X. Zhang, N. Homma, K. Ichiji, N. Sugita, Y. Kawasumi, T. Ishibashi, and M. Yoshizawa. Mass detection using deep convolutional neural network for mammographic computer-aided diagnosis. In Society of Instrument and Control Engineers of Japan, pages 1382–1386, 2016.
  • [32] D. Wang, A. Khosla, R. Gargeya, H. Irshad, and A. H. Beck. Deep learning for identifying metastatic breast cancer. CoRR, abs/1606.05718, 2016.
  • [33] J. N. Wolfe. Risk for breast cancer development determined by mammographic parenchymal pattern. Cancer, 37(5):2486–2492, 1976.
  • [34] L. Zhang, S. Jiang, Y. Zhao, J. Feng, B. W. Pogue, and K. D. Paulsen. Direct regularization from co-registered contrast MRI improves image quality of MRI-guided near-infrared spectral tomography of breast lesions. IEEE Transactions on Medical Imaging, PP(99):1–1, 2018.
  • [35] L. Zhang, L. Lu, R. M. Summers, E. Kebebew, and J. Yao. Convolutional invasion and expansion networks for tumor growth prediction. IEEE Transactions on Medical Imaging, 37(2):638, 2018.
  • [36] C. Zhou, H. P. Chan, N. Petrick, B. Sahiner, M. A. Helvie, M. A. Roubidoux, L. M. Hadjiiski, and M. M. Goodsitt. Computerized image analysis: estimation of breast density on mammograms. In Medical Imaging 2000: Image Processing, pages 1056–1069, 2000.