Early localization of prostate cancer on MRI is crucial for successful diagnosis and local therapy. However, subtle differences between benign conditions and cancer on MRI often make human interpretation challenging, leading to missed diagnoses and alarming inter-reader variability: reported agreement is low (0.46-0.78) [barentsz2016synopsis], and sensitivity (58-98%) and specificity (23-87%) vary widely [ahmed2017diagnostic].
Predictive models can help standardize radiologist interpretation, but current models [viswanath2012central, sumathipala2018prostate, litjens2014computer, armato2018prostatex, viswanath2019comparing, cao2019joint] often learn from MRI alone, without considering the pathology of the disease; the MRI features they derive are agnostic to tumor biology. Moreover, current predictive models mostly rely on inaccurate labels: either biopsies [armato2018prostatex], which suffer from sampling errors, or cognitive registration of pre-operative MRI with digital histopathology images of surgical specimens, where a radiologist retrospectively outlines the lesions on MRI [sumathipala2018prostate]. MRI under-estimates tumor size [priester2017magnetic], making outlines on MRI alone insufficient to capture the entire extent of disease. Furthermore, it is challenging to outline the ~20% of tumors that are not clearly seen on MRI, even when using histopathology images as reference [barentsz2016synopsis]. These MRI-based models use a variety of techniques, including traditional classifiers with hand-crafted and radiomic features [viswanath2012central, litjens2014computer, viswanath2019comparing] as well as deep learning models [sumathipala2018prostate, cao2019joint]. The current state-of-the-art approach [sumathipala2018prostate] predicts a cancer probability map for the entire prostate using the Holistically Nested Edge Detection (HED) [HED] algorithm.
In this paper, we propose CorrSigNet, a two-step approach for predicting prostate cancer using MRI. First, CorrSigNet leverages spatially aligned radiology and histopathology images of prostate surgery patients to learn MRI cancer signatures that correlate with features extracted from the histopathology images. Second, CorrSigNet uses these correlated MRI signatures to train a predictive model for localizing cancer when histopathology images are not available, e.g. before surgery. This approach enables learning MRI signatures that capture tumor biology information from surgery patients with histopathology images, and then translating those learned signatures for prediction in patients without surgery/biopsy. Prior studies lack such correlation analysis of the two modalities. Our approach shows improved prostate cancer prediction compared to the current state-of-the-art method[sumathipala2018prostate].
2 Proposed Method
2.1 Dataset
We used 95 prostate surgery patients with pre-operative multi-parametric MRI (T2-weighted and Apparent Diffusion Coefficient) and post-operative digitized histopathology images. Custom 3D-printed molds were used to ensure that excised prostate tissue was sectioned in the same plane as the T2-weighted (T2W) MRI. An expert pathologist annotated cancer on the histopathology images. We spatially aligned the pre-operative MRI and digitized histopathology images of the excised tissue via the state-of-the-art RAPSODI registration platform [mirabelaregistration]. RAPSODI achieved a Dice similarity coefficient of 0.98±0.01 for the prostate, a prostate boundary Hausdorff distance of 1.71±0.48 mm, and a urethra deviation of 2.91±1.25 mm between registered histopathology images and MRI. Such careful registration of radiology and pathology images of the prostate enabled (1) correlation analysis of the two modalities at the pixel level, and (2) accurate mapping of cancer labels from pathology to radiology images. We considered multiple slices per patient (on average 7 slices/patient) irrespective of cancer size; slices with missing cancer annotations were discarded during training. The dataset included some patients whose cancer had extraprostatic extensions, but our analysis focused only on cancers inside the prostate.
2.2 Data Pre-processing
We smoothed the histopathology images with a Gaussian filter to prevent downsampling artifacts, then padded and downsampled them. We projected and resampled the T2W and ADC images, prostate masks, and cancer labels onto the corresponding downsampled histopathology images, such that all modalities had the same X-Y resolution. This ensured that each pixel in each modality represented the same physical area.
Since MRI intensities vary significantly between scanners and scanning protocols, we standardized the T2W and ADC intensities using the histogram alignment approach of Nyúl et al. [nyul2000new]. We used prostate masks to standardize intensities within the prostate, and then applied the learned transformation to the image region beyond the prostate. After intensity standardization, we normalized the intensities to zero mean and unit standard deviation.
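The standardization step above can be sketched as follows. This is a minimal, simplified version of Nyúl-style histogram alignment (learn average intensity landmarks on the training set, then piecewise-linearly map each image's landmarks onto them), followed by z-score normalization; the function names and the percentile choices are illustrative, not the paper's exact implementation.

```python
import numpy as np

def learn_standard_scale(images, masks, pcts=(1, 10, 25, 50, 75, 90, 99)):
    # Average the intensity landmarks (percentiles within the prostate mask)
    # across the training images to form the standard scale.
    landmarks = [np.percentile(img[mask > 0], pcts) for img, mask in zip(images, masks)]
    return np.mean(landmarks, axis=0), pcts

def standardize(img, mask, standard_scale, pcts):
    # Piecewise-linear map of this image's own landmarks onto the learned scale,
    # applied to the whole image (inside and beyond the prostate).
    lm = np.percentile(img[mask > 0], pcts)
    return np.interp(img, lm, standard_scale)

def znormalize(img):
    # Zero mean, unit standard deviation.
    return (img - img.mean()) / img.std()
```

At test time only `standardize` and `znormalize` are applied, reusing the scale learned on the training set.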
We randomly split the 95 patients into train, validation, and test sets of 66, 9, and 20 patients, respectively. After data augmentation via horizontal flipping, the train and validation sets had 700 and 106 slices, respectively. The test set included 139 slices, 24 cancerous lesions, and 1.12M pixels within the prostate. We performed MRI scale standardization on the train set and used the learned histograms to standardize the validation and test sets, following a similar strategy for MRI intensity normalization.
2.3 Learning correlated features
Feature extraction: We extracted features from the T2W, ADC, and histopathology images by passing them through the first two convolutional layers of a pre-trained VGG-16 architecture [simonyan2014very]. Thus, each image yielded a feature map with 64 features per pixel. We sampled pixels from within the prostate and concatenated the T2W and ADC features to form a 128-dimensional MRI representation per pixel. Thus, for each pixel we had an MRI representation $R_i \in \mathbb{R}^{128}$ and a histopathology representation $P_i \in \mathbb{R}^{64}$.
Common Representation learning: We trained a Correlational Neural Network (CorrNet) [chandar2016correlational] to learn common representations from MRI and histopathology features per pixel. Given $N$ pixels, each input $Z_i = (R_i, P_i)$ to the CorrNet model had two views: the MRI feature representation $R_i$ and the histopathology feature representation $P_i$ for pixel $i$. We used a fully-connected CorrNet model with a single hidden layer of dimension $k$, computed as:

$$H(Z_i) = W R_i + V P_i + b$$

where $W \in \mathbb{R}^{k \times 128}$, $V \in \mathbb{R}^{k \times 64}$, and $b \in \mathbb{R}^{k}$. The reconstructed output was computed from the hidden layer as:

$$Z_i' = [W' H(Z_i), V' H(Z_i)] + b'$$

where $W' \in \mathbb{R}^{128 \times k}$, $V' \in \mathbb{R}^{64 \times k}$, and $b' \in \mathbb{R}^{128+64}$. In contrast to the original CorrNet model, we did not use any non-linear activation function. We learned the parameters $\theta = (W, V, W', V', b, b')$ of the system by minimizing the following objective function, as detailed in [chandar2016correlational]:

$$J(\theta) = \sum_{i=1}^{N} \left[ L(Z_i, H(Z_i)) + L(Z_i, H(R_i)) + L(Z_i, H(P_i)) \right] - \lambda \, \mathrm{corr}\!\left(H(R), H(P)\right)$$

$$\mathrm{corr}\!\left(H(R), H(P)\right) = \frac{\sum_{i=1}^{N} \left(H(R_i) - \overline{H(R)}\right)\left(H(P_i) - \overline{H(P)}\right)}{\sqrt{\sum_{i=1}^{N} \left(H(R_i) - \overline{H(R)}\right)^2 \sum_{i=1}^{N} \left(H(P_i) - \overline{H(P)}\right)^2}}$$

where $L$ is the reconstruction error, $\lambda$ is a scaling parameter that determines the relative weight of the correlation error with respect to the reconstruction errors, $\overline{H(R)}$ is the mean hidden representation of the MRI view, $\overline{H(P)}$ is the mean hidden representation of the histopathology view, and $H(R_i)$, $H(P_i)$ denote the hidden representations computed from a single view. Thus, the CorrNet model (i) minimizes the self- and cross-reconstruction errors, and (ii) maximizes the correlation between the hidden representations of the two views. Training CorrNet with pixel representations from within the prostate gave ample training samples to optimize the model and to learn differences between cancer and non-cancer pixels.
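A minimal PyTorch sketch of this linear CorrNet and its objective follows. The class and function names are illustrative, and `lam=2.0` is a placeholder (the paper's exact λ is not given here); the correlation term pools all hidden dimensions for brevity rather than summing per-dimension correlations.

```python
import torch
import torch.nn as nn

class CorrNet(nn.Module):
    # Single-hidden-layer, linear CorrNet (no activation), two views per pixel.
    def __init__(self, d_r=128, d_p=64, k=5):
        super().__init__()
        self.W = nn.Linear(d_r, k, bias=False)   # MRI view -> hidden
        self.V = nn.Linear(d_p, k, bias=False)   # histopathology view -> hidden
        self.b = nn.Parameter(torch.zeros(k))
        self.Wp = nn.Linear(k, d_r, bias=False)  # hidden -> reconstructed R
        self.Vp = nn.Linear(k, d_p, bias=False)  # hidden -> reconstructed P
        self.bp_r = nn.Parameter(torch.zeros(d_r))
        self.bp_p = nn.Parameter(torch.zeros(d_p))

    def hidden(self, R=None, P=None):
        # H from both views, or from a single view (the other set to zero).
        h = self.b.clone()
        if R is not None:
            h = h + self.W(R)
        if P is not None:
            h = h + self.V(P)
        return h

    def reconstruct(self, h):
        return self.Wp(h) + self.bp_r, self.Vp(h) + self.bp_p

def corrnet_loss(model, R, P, lam=2.0):
    # Squared-error self/cross reconstruction terms minus the correlation term.
    def rec_err(h):
        r_hat, p_hat = model.reconstruct(h)
        return ((r_hat - R) ** 2).sum(1).mean() + ((p_hat - P) ** 2).sum(1).mean()
    h_rp, h_r, h_p = model.hidden(R, P), model.hidden(R=R), model.hidden(P=P)
    hr, hp = h_r - h_r.mean(0), h_p - h_p.mean(0)
    corr = (hr * hp).sum() / (hr.pow(2).sum().sqrt() * hp.pow(2).sum().sqrt() + 1e-8)
    return rec_err(h_rp) + rec_err(h_r) + rec_err(h_p) - lam * corr
```

After training, only `model.hidden(R=R)` is needed to produce the CorrNet representation of MRI pixels, which is what allows inference without histopathology images.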
After the CorrNet model was trained, we used the learned weights to project the MRI feature representations onto the lower-dimensional hidden space, forming CorrNet representations of the input MRI. These CorrNet representations are correlated with the corresponding histopathology features and, once the model is trained, can be constructed even in the absence of histopathology images. Figure 1 shows the pipeline for learning common representations.
Training: From the 66 patients in the training cohort, we sampled all cancer pixels within the prostate and randomly sampled an equal number of non-cancer pixels, also from within the prostate, generating a training set with equal numbers of cancer and non-cancer pixels. This ensured that we trained the CorrNet with a balanced dataset of the two classes. We set $\lambda$ to weigh the cross-correlation error higher than the reconstruction errors, and chose a squared-error loss for the reconstruction errors. We trained the CorrNet model with varying hidden layer dimensions; for each dimension, we used a fixed learning rate and 300 training epochs. Figure 2 shows CorrNet representations of an example MRI slice.
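The balanced pixel sampling described above can be sketched as follows; the function name is illustrative.

```python
import numpy as np

def sample_balanced_pixels(cancer_mask, prostate_mask, seed=0):
    # All cancer pixels inside the prostate, plus an equal number of randomly
    # chosen non-cancer prostate pixels, as flat indices into the slice.
    rng = np.random.default_rng(seed)
    cancer_idx = np.flatnonzero((cancer_mask > 0) & (prostate_mask > 0))
    benign_pool = np.flatnonzero((cancer_mask == 0) & (prostate_mask > 0))
    benign_idx = rng.choice(benign_pool, size=len(cancer_idx), replace=False)
    return cancer_idx, benign_idx
```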
2.4 Prediction of prostate cancer extent
We modified the Holistically Nested Edge Detection (HED) architecture [HED] to predict cancer probability maps for the entire prostate, considering two modified versions: (1) HED-3, and (2) HED-branch-3. The HED-3 model evaluates how well CorrNet representations alone predict cancer, while the HED-branch-3 model evaluates how well CorrNet representations combined with T2W and ADC images predict cancer. We denote our model using correlated feature learning and HED-3 as CorrSigNet($d$), and our model with correlated feature learning and HED-branch-3 as CorrSigNet(T2W, ADC, $d$), where $d$ is the CorrNet feature dimension. For example, CorrSigNet(5) uses only 5 correlated features for prediction, whereas CorrSigNet(T2W, ADC, 5) uses the normalized T2W and ADC intensities in addition to 5 correlated features. We chose a prediction model similar to the HED architecture because it is known to learn and combine multi-scale and multi-level features, and has been successfully applied to anatomy segmentation from CT scans [harrison2017progressive, roth2016spatial, nogues2016automatic] and to prostate cancer prediction [sumathipala2018prostate].
In HED-3, we input three adjacent slices of CorrNet representations of the prostate and output predictions for the central slice only. This ensured that the 2D HED model learned the 3D volumetric continuity of the MRI, histopathology, and correlated features, and also helped reduce false positives.
In HED-branch-3 (shown in Figure 3), we combined the CorrNet slice representations with the normalized T2W and ADC images as inputs to the model. As in HED-3, we considered three adjacent slices for each input sequence (T2W, ADC, CorrNet representations), and predicted cancer probability maps for the central slice only. However, in the HED-branch-3 model, we processed each input sequence independently through the first three blocks, concatenated the three outputs, and processed the concatenated output through the next two blocks. Since the input sequences are processed independently in the first three blocks, we had a total of 11 side outputs, which were fused using a Conv-1D layer to form the weighted fused output. We computed balanced cross-entropy losses for each of the 12 outputs (11 side outputs and 1 fused output) while training the architecture, but computed evaluation metrics only on the fused output. We used the same kernel size for all convolution layers except the last Conv-1D layer; the number of filters in each layer is stated in the legend of Figure 3, as in [sumathipala2018prostate], which used Batch Normalization in each layer. No post-processing was performed on the prediction maps.
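The balanced cross-entropy used for the 12 outputs can be sketched as in HED: weight the sparse cancer pixels by the fraction of background pixels and vice versa, so the loss is not dominated by the non-cancer majority. The helper name is illustrative.

```python
import torch
import torch.nn.functional as F

def balanced_bce(logits, target):
    # HED-style class-balanced cross-entropy: beta = |negatives| / |total|,
    # positives weighted by beta, negatives by 1 - beta.
    beta = 1.0 - target.mean()
    weight = torch.where(target > 0.5, beta, 1.0 - beta)
    return F.binary_cross_entropy_with_logits(logits, target, weight=weight)
```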
Training: We trained both models using the Adam optimizer with an initial learning rate and weight decay, 200 epochs, and early stopping.
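A generic sketch of this training regime follows. The learning rate, weight decay, and patience values are placeholders (the paper's exact settings are elided), and the single full-batch step per epoch is a simplification for brevity.

```python
import torch

def train_with_early_stopping(model, loss_fn, train_data, val_data,
                              lr=1e-3, weight_decay=1e-4,
                              max_epochs=200, patience=10):
    # Adam with weight decay; stop when validation loss stops improving.
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    best, since_best = float("inf"), 0
    for epoch in range(max_epochs):
        model.train()
        opt.zero_grad()
        loss = loss_fn(model, *train_data)
        loss.backward()
        opt.step()
        model.eval()
        with torch.no_grad():
            val = loss_fn(model, *val_data).item()
        if val < best - 1e-6:
            best, since_best = val, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # early stopping
    return best
```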
3 Experimental Results
Quantitative Evaluation: We quantitatively evaluated our models on a per-pixel and a per-lesion basis, with ground truth labels derived from pathologist cancer annotations on registered histopathology images. For a direct comparison, we reproduced the current state-of-the-art model [sumathipala2018prostate] to the best of our understanding, and computed both pixel-level and lesion-level evaluation metrics for this model on our test data (20 patients, 139 slices, 24 cancerous lesions, 1.12M pixels in the prostate). Note that the AUC numbers reported in [sumathipala2018prostate] are computed at the lesion level, not the pixel level; our pixel-level metrics, which include all pixels within the prostate, provide a more rigorous evaluation.
Table 1. Pixel-level evaluation on the test set (1.12M prostate pixels).

| Model | Sensitivity | Specificity | ROC AUC |
|---|---|---|---|
| HED [sumathipala2018prostate] (current state-of-the-art) | 0.75 | 0.74 | 0.80 |
| CorrSigNet(T2W, ADC, 1) | 0.73 | 0.79 | 0.83 |
| CorrSigNet(T2W, ADC, 3) | 0.70 | 0.85 | 0.86 |
| CorrSigNet(T2W, ADC, 5) | 0.81 | 0.72 | 0.86 |
| CorrSigNet(T2W, ADC, 15) | 0.78 | 0.78 | 0.86 |
| CorrSigNet(T2W, ADC, 30) | 0.83 | 0.71 | 0.86 |
Pixel-level analysis: We tested the performance of the CorrSigNet models with different inputs and varying CorrNet feature dimensions using the following pixel-level evaluation metrics, computed over the 1.12M pixels within the prostate: sensitivity and specificity at a probability threshold of 0.5, and the AUC of the ROC curve. Table 1 shows that CorrSigNet performs better than [sumathipala2018prostate], with consistently higher AUC in the pixel-level analysis, while sensitivity and specificity vary across models. Our tests showed that at least 3 CorrNet features were necessary for improved performance over MRI alone. We chose CorrSigNet(T2W, ADC, 5) as the optimum model because it achieved high sensitivity, specificity, and AUC with a small number of parameters. Between false positives and false negatives, we note that a false negative is more detrimental than a false positive in the task of cancer prediction.
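The pixel-level metrics above can be computed as sketched below, restricting evaluation to pixels inside the prostate mask and thresholding probabilities at 0.5; the helper name is illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pixel_metrics(prob, label, mask, thresh=0.5):
    # Evaluate only pixels inside the prostate mask.
    p, y = prob[mask > 0], label[mask > 0]
    pred = p >= thresh
    tp = np.sum(pred & (y > 0))
    fn = np.sum(~pred & (y > 0))
    tn = np.sum(~pred & (y == 0))
    fp = np.sum(pred & (y == 0))
    return dict(sensitivity=tp / (tp + fn),
                specificity=tn / (tn + fp),
                auc=roc_auc_score(y, p))
```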
Lesion-level analysis: We performed lesion-level analysis using the evaluation method detailed in [sumathipala2018prostate] and found that CorrSigNet(T2W, ADC, 5) achieved a higher per-lesion AUC than [sumathipala2018prostate] on the same test set.
Qualitative Evaluation: Figure 4 shows the same slice as Figure 2, with aligned T2W, ADC, and histopathology images, and prediction results from the current state-of-the-art method [sumathipala2018prostate] and from our CorrSigNet(5) and CorrSigNet(T2W, ADC, 5) models. Note that [sumathipala2018prostate] fails to detect the cancerous regions on the left and right of the images, while the CorrNet representations alone identify the cancer regions and, when combined with the T2W and ADC images, predict the cancer regions with high probability. CorrSigNet(T2W, ADC, 5) also shows fewer false positives than [sumathipala2018prostate]. This example illustrates the strength of learning correlated MRI signatures for identifying subtle, and sometimes MRI-invisible, cancers. Figure 5 shows further example slices from different patients, comparing the state-of-the-art approach [sumathipala2018prostate] with our CorrSigNet(T2W, ADC, 5) predictions. Our model with correlated features (1) identifies subtle and smaller cancer regions, (2) overlaps better with ground truth cancer labels, and (3) produces fewer false positives.
4 Conclusion
In this paper, we presented a novel method to learn correlated signatures of cancer from spatially aligned MRI and histopathology images of prostatectomy specimens, and to use these learned signatures to predict prostate cancer extent from MRI. Quantitatively, our method improved automated prostate cancer localization (per-pixel AUC of 0.86 and a higher per-lesion AUC) compared to the current state-of-the-art method [sumathipala2018prostate] (per-pixel AUC of 0.80). Qualitatively, we found that correlated features capture subtle and sometimes MRI-invisible cancerous regions, overlap better with ground truth labels, and produce fewer false positives. Correlated features can capture tumor biology information from histopathology images in an unprecedented way and, once learned, can be extracted in patients without histopathology images. In future work, we intend to conduct experiments with augmented datasets and in a cross-validation framework to further boost the performance of our models.