Breast density – the ratio of fibroglandular tissue (FGT) to total breast tissue – is an important risk factor for breast cancer [1, 2, 3]. This density can be assessed on, for example, mammography and magnetic resonance imaging (MRI), and is typically scored by a radiologist as one of four incremental categories [4].
Computer algorithms have enabled quantitative assessment of breast density on MRI. These algorithms often rely on identifying the breast volume by removing the pectoral muscle and air, followed by segmentation of the FGT [5, 6, 7]. More recently, deep learning has been proposed for FGT segmentation [8, 9].
Deep learning could also be used to assess breast density directly, without segmentation steps. In that case, however, it would be desirable to know on what basis the algorithm reached its result. In this study, we propose a method to directly assess density and to provide interpretations of these assessments.
2 Materials and methods
We used a regression convolutional neural network (CNN) to assess breast density, and examined how the CNN reached its result using Deep SHapley Additive exPlanations (SHAP) [10]. The following paragraphs describe this in more detail.
2.1 Patient data
We consecutively included 506 patients with early-stage unilateral invasive breast cancer. These patients received a preoperative T1-weighted MRI with imaging parameters: repetition time 8.1 ms, echo time 4.0 ms, flip angle 20°, isotropic voxel size 1.35 × 1.35 × 1.35 mm³.
2.2 Data preparation and ground truth creation
For each patient, we removed field inhomogeneities using N4 bias field correction [11]. We normalized the MR image between zero and one based on the 2.5th and 97.5th intensity percentiles and clipped intensities outside that range. We extracted a slab of 20 sagittal slices around the center of each breast – yielding 20 240 breast slices in total – and resized them to 128 × 128 voxels.
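The percentile-based normalization described above can be sketched in a few lines of NumPy; this is a minimal illustration under our own naming (function name and synthetic data are not from the paper):

```python
import numpy as np

def normalize_slice(img, low_pct=2.5, high_pct=97.5):
    """Scale intensities to [0, 1] based on the 2.5th and 97.5th
    percentiles, clipping intensities outside that range."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    img = np.clip(img, lo, hi)
    return (img - lo) / (hi - lo)

# Example on a synthetic 128 x 128 slice
rng = np.random.default_rng(0)
slice_ = rng.normal(500.0, 100.0, size=(128, 128))
norm = normalize_slice(slice_)
print(norm.min(), norm.max())  # 0.0 1.0 (extremes are clipped to the bounds)
```

Clipping before rescaling guarantees that the output lies exactly in [0, 1], so outlier voxels (e.g. residual bias-field artifacts) cannot compress the dynamic range of the breast tissue.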
2.3 Regression CNN
We used a regression convolutional neural network (CNN) to estimate the density per slice. This CNN consisted of five convolution layers, each with a 3 × 3 kernel size, a 2 × 2 stride, a rectified linear unit activation, 50% dropout, and batch normalization. These five layers were followed by two densely connected layers and an output node with a linear activation. We used the mean absolute percentage error as loss function and an Adam optimizer with a learning rate of 0.001.
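The mean absolute percentage error used as the loss can be written as MAPE = (100/n) Σ |y − ŷ| / y. A NumPy sketch of this formula (not the authors' implementation; the epsilon guard is our addition for near-zero densities):

```python
import numpy as np

def mape(y_true, y_pred, eps=1e-7):
    """Mean absolute percentage error, the training loss of the
    regression CNN. eps guards against division by zero for slices
    whose ground truth density is (near) zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / (np.abs(y_true) + eps))

# A prediction that is 10% off on every slice yields a MAPE of ~10
y_true = np.array([0.2, 0.4, 0.6])
y_pred = y_true * 1.1
print(round(mape(y_true, y_pred), 3))  # 10.0
```

Because the error is relative, the loss weighs a fixed absolute error more heavily in fatty (low-density) slices than in dense ones.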
We split the slices at patient level: 14 000 slices corresponding to 350 patients were used for training the CNN (100 epochs, mini-batches of 100 slices), 3 000 slices (75 different patients) for validation, and 3 240 slices (81 different patients) for independent testing. For each slice in the testing set, the CNN returned a density value between 0 (fatty breast) and 1 (dense breast). The correlation of these density values with the ground truth density was assessed using Spearman's correlation coefficient (ρ).
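A patient-level split assigns whole patients – not individual slices – to a subset, so that the 40 slices of one patient never leak across training, validation, and testing. A sketch under our own naming (only the patient counts follow the paper):

```python
import random

def patient_level_split(patient_ids, n_train=350, n_val=75, seed=42):
    """Assign whole patients to train/validation/test subsets, so that
    no patient contributes slices to more than one subset."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    train = set(ids[:n_train])
    val = set(ids[n_train:n_train + n_val])
    test = set(ids[n_train + n_val:])
    return train, val, test

train, val, test = patient_level_split(range(506))
print(len(train), len(val), len(test))  # 350 75 81
```

With 40 slices per patient (20 per breast), these counts reproduce the 14 000 / 3 000 / 3 240 slice split.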
2.4 Interpretation of CNN results
We used Deep SHAP for interpretation of the CNN results [10]. Deep SHAP is a combination of DeepLIFT [19] and SHapley Additive exPlanations [10]. To obtain the background signal needed for the Deep SHAP analysis, we randomly sampled 100 training slices.
For each slice, Deep SHAP yields a map of SHAP-values. Each pixel in this map represents the contribution of that pixel to the final prediction; hence, a higher absolute SHAP-value corresponds to a more important pixel.
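The attribution idea behind SHAP-values can be illustrated with a linear model, for which the exact Shapley value of feature i relative to a background dataset is wᵢ(xᵢ − E[xᵢ]). A toy sketch (not the Deep SHAP implementation; weights and data are invented):

```python
import numpy as np

# Toy linear "model": prediction as a weighted sum of two pixel features
w = np.array([0.8, -0.3])            # invented weights
background = np.array([[0.2, 0.6],   # background samples, analogous to the
                       [0.4, 0.8]])  # 100 randomly sampled training slices
x = np.array([0.9, 0.1])             # the input being explained

baseline = background.mean(axis=0) @ w           # expected model output
shap_values = w * (x - background.mean(axis=0))  # exact Shapley values for a linear model

# Key property: the attributions sum to (prediction - baseline)
assert np.isclose(shap_values.sum(), x @ w - baseline)
print(shap_values)
```

The same additivity property holds for the per-pixel SHAP-value maps: summing a map recovers the difference between the slice's predicted density and the background expectation.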
3 Results
3.1 Regression CNN
The density values predicted by the CNN on the testing set were significantly correlated with the ground truth densities (N = 81 patients, Spearman's ρ).
3.2 Interpretation of CNN results
Inspection of the SHAP-value maps shows that in slices where the density predicted by the CNN matched the ground truth density, positive SHAP-values commonly occur in the glandular tissue, while negative SHAP-values occur in the fatty tissue (Figure 3). Voxels in the air, heart, or pectoral muscle are mostly ignored (Figure 3).
In slices where the predicted density deviated from the ground truth density, the SHAP-value maps visualize where this deviation originated. For example, in Figure 4B, the SHAP-value map shows where the CNN overestimated the density of a patient with extremely dense breasts.
4 Discussion
We presented a combination of deep learning regression and an interpretation method for this regression for density assessment on breast MRI. The predicted densities were significantly correlated with the ground truth density.
The interpretation method supported the predictions of the CNN regression by identifying which regions of the image were important. Positive SHAP-values commonly occurred in the fibroglandular tissue, while negative SHAP-values occurred in the fatty tissue. This is as expected: more FGT in a breast means a higher density, while the same amount of FGT in a larger breast means a lower density.
Our regression method could be used as a stand-alone solution for density assessment: it does not need intermediate steps such as segmentation. If radiologists want to know why the method came to its result, they can consult the SHAP-value interpretation. The method could in principle also be used to confirm other density assessment methods.
Our method did not always coincide with the ground truth. This mainly occurred in patients with anatomical variations that were uncommon in the training set. Future work could mitigate this by training on more data.
5 New or Breakthrough Work
We presented an interpretable deep learning regression method for breast density estimation on MRI with promising results.
This work was funded by the Dutch Cancer Society (KWF), grant number 10755.
References
-  Wolfe, J. N., “Breast patterns as an index of risk for developing breast cancer,” American Journal of Roentgenology 126(6), 1130–1137 (1976).
-  McCormack, V. A. and dos Santos Silva, I., “Breast density and parenchymal patterns as markers of breast cancer risk: a meta-analysis,” Cancer Epidemiology and Prevention Biomarkers 15(6), 1159–1169 (2006).
-  Boyd, N. F., Guo, H., Martin, L. J., Sun, L., Stone, J., Fishell, E., Jong, R. A., Hislop, G., Chiarelli, A., Minkin, S., et al., “Mammographic density and the risk and detection of breast cancer,” New England Journal of Medicine 356(3), 227–236 (2007).
-  Morris, E., Comstock, C., Lee, C., Lehman, C., Ikeda, D., Newstead, G., et al., “ACR BI-RADS® atlas, breast imaging reporting and data system,” Reston, VA: American College of Radiology, 56–71 (2013).
-  Wei, J., Chan, H.-P., Helvie, M. A., Roubidoux, M. A., Sahiner, B., Hadjiiski, L. M., Zhou, C., Paquerault, S., Chenevert, T., and Goodsitt, M. M., “Correlation between mammographic density and volumetric fibroglandular tissue estimated on breast MR images,” Medical Physics 31(4), 933–942 (2004).
-  Nie, K., Chen, J.-H., Chan, S., Chau, M.-K. I., Yu, H. J., Bahri, S., Tseng, T., Nalcioglu, O., and Su, M.-Y., “Development of a quantitative method for analysis of breast density based on three-dimensional breast MRI,” Medical Physics 35(12), 5253–5262 (2008).
-  Wu, S., Weinstein, S. P., Conant, E. F., and Kontos, D., “Automated fibroglandular tissue segmentation and volumetric density estimation in breast MRI using an atlas-aided fuzzy C-means method,” Medical Physics 40(12), 122302 (2013).
-  Dalmış, M. U., Litjens, G., Holland, K., Setio, A., Mann, R., Karssemeijer, N., and Gubern-Mérida, A., “Using deep learning to segment breast and fibroglandular tissue in MRI volumes,” Medical Physics 44(2), 533–546 (2017).
-  Ivanovska, T., Jentschke, T. G., Daboul, A., Hegenscheid, K., Völzke, H., and Wörgötter, F., “A deep learning framework for efficient analysis of breast volume and fibroglandular tissue using MR data with strong artifacts,” International Journal of Computer Assisted Radiology and Surgery, 1–7 (2019).
-  Lundberg, S. M. and Lee, S.-I., “A unified approach to interpreting model predictions,” in [Advances in Neural Information Processing Systems], 4765–4774 (2017).
-  Tustison, N. J., Avants, B. B., Cook, P. A., Zheng, Y., Egan, A., Yushkevich, P. A., and Gee, J. C., “N4ITK: improved N3 bias correction,” IEEE Transactions on Medical Imaging 29(6), 1310 (2010).
-  Van der Velden, B. H., Dmitriev, I., Loo, C. E., Pijnappel, R. M., and Gilhuijs, K. G., “Association between parenchymal enhancement of the contralateral breast in dynamic contrast-enhanced MR imaging and outcome of patients with unilateral invasive breast cancer,” Radiology 276(3), 675–685 (2015).
-  Knuttel, F. M., Van der Velden, B. H., Loo, C. E., Elias, S. G., Wesseling, J., van den Bosch, M. A., and Gilhuijs, K. G., “Prediction model for extensive ductal carcinoma in situ around early-stage invasive breast cancer,” Investigative Radiology 51(7), 462–468 (2016).
-  Van der Velden, B. H., Elias, S. G., Bismeijer, T., Loo, C. E., Viergever, M. A., Wessels, L. F., and Gilhuijs, K. G., “Complementary value of contralateral parenchymal enhancement on DCE-MRI to prognostic models and molecular assays in high-risk ER+/HER2− breast cancer,” Clinical Cancer Research 23(21), 6505–6515 (2017).
-  Van der Velden, B. H., Sutton, E. J., Carbonaro, L. A., Pijnappel, R. M., Morris, E. A., and Gilhuijs, K. G., “Contralateral parenchymal enhancement on dynamic contrast-enhanced MRI reproduces as a biomarker of survival in ER-positive/HER2-negative breast cancer patients,” European Radiology 28(11), 4705–4716 (2018).
-  Van der Velden, B. H., Bismeijer, T., Canisius, S., Loo, C. E., Lips, E. H., Wesseling, J., Viergever, M. A., Wessels, L. F., and Gilhuijs, K. G., “Are contralateral parenchymal enhancement on dynamic contrast-enhanced MRI and genomic ER-pathway activity in ER-positive/HER2-negative breast cancer related?,” European Journal of Radiology 121, 108705 (2019).
-  De Vos, B. D., Viergever, M. A., De Jong, P. A., and Išgum, I., “Automatic slice identification in 3D medical images with a ConvNet regressor,” in [Deep Learning and Data Labeling for Medical Applications], 161–169, Springer (2016).
-  Kingma, D. P. and Ba, J., “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014).
-  Shrikumar, A., Greenside, P., and Kundaje, A., “Learning important features through propagating activation differences,” in [Proceedings of the 34th International Conference on Machine Learning - Volume 70], 3145–3153, JMLR.org (2017).