
Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network

by   Sundaresh Ram, et al.
University of Michigan
The University of Arizona

Early detection of lung cancer is critical for improvement of patient survival. To address the clinical need for efficacious treatments, genetically engineered mouse models (GEMM) have become integral in identifying and evaluating the molecular underpinnings of this complex disease that may be exploited as therapeutic targets. Assessment of GEMM tumor burden on histopathological sections performed by manual inspection is both time-consuming and prone to subjective bias. Therefore, an interplay of needs and challenges exists for computer-aided diagnostic tools for accurate and efficient analysis of these histopathology images. In this paper, we propose a simple machine learning approach called the graph-based sparse principal component analysis (GS-PCA) network, for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E). Our method comprises four steps: 1) cascaded graph-based sparse PCA, 2) PCA binary hashing, 3) block-wise histograms, and 4) support vector machine (SVM) classification. In our proposed architecture, graph-based sparse PCA is employed to learn the filter banks of the multiple stages of a convolutional network. This is followed by PCA hashing and block histograms for indexing and pooling. The meaningful features extracted from this GS-PCA are then fed to an SVM classifier. We evaluate the performance of the proposed algorithm on H&E slides obtained from an inducible K-ras$^{G12D}$ lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operating characteristic (ROC) and show that our algorithm is efficient and provides improved detection accuracy compared to existing algorithms.





I Introduction

Lung cancer is the leading cause of cancer-related deaths worldwide, with an estimated 1.6 million deaths each year [cruz11lung]. Development of novel therapies to battle lung cancer has been greatly aided by the emergence of genetically engineered mouse models (GEMMs) of lung cancer, such as the K-ras; p53 non–small-cell lung carcinoma (NSCLC) model, where the compound effect of conditional mutations in the K-ras oncogene and the p53 tumor suppressor gene leads to development of adenocarcinomas in the mouse lung [walrath10genetically, barck15quantification]. Since GEMMs recapitulate certain aspects of the human disease associated with the stroma, vascularity, and immune infiltrate better than other models, it is important to be able to detect, identify and localize the lung tumor lesions seen on the histopathological sections as shown in Fig. 1.

Fig. 1: An example whole-slide histopathological image from our dataset consisting of many tumor lesions. The high-resolution inset images show the visual features that characterize the tumor (red frame) and normal (blue frame) regions.

Manual assessment of tumor burden (the amount of tumor cells/mass present in a subject’s body) on histopathological mouse lung sections is a difficult, time-consuming, and labor-intensive process. This is due to various reasons such as fluctuating intensities [ram13symmetry], color change and morphological variations within structures of the cancer lesions in these images [lin19fast], tumor heterogeneity [junttila13influence] (see Fig. 1), low signal-to-noise ratio [ram10seg, ram16size], variations in illumination [ram18three], microscopy imaging limitations [ram12size, ram2017sparse, ram18classify, ram20combined], and the large number of images and the number of lesions per image an expert has to demarcate. Moreover, the task of manual detection of cancer lesions on H&E slides can be subjective, leading to inter-observer variability. Therefore, there is a pressing need for computer-aided diagnostic tools for accurate and efficient quantitative analysis of histopathology images [gurcan09histo, veta14breast, xing16robust, ram21detect].

Tumor detection and classification tools within the commonly available microscopy software are based on feature extraction techniques such as size, shape, and morphological features [gurcan09histo, basavanhally13multi, gorelick13pros, veta14breast, xing16robust, ram16size, tizhoosh18represent], texture features including local binary pattern (LBP) [reis17auto, wan17integrated, simon18multi], local Fourier transform [kong11parti], co-occurrence matrix and fractal texture features [alinsaif20part], and energy minimization and optimization-based techniques [tosun11graph, ozdemir13hybrid, bejnordi16automated, javed20multi]. These techniques suffer considerably due to over-generalization and therefore need extensive customization for the dataset at hand, limiting their use to very simple images collected in a carefully constrained environment [ram16size]. Tumor detection and grading using size, shape and other morphological features does not work well when the cell population exhibits a variety of sizes and shapes, or when the signal-to-noise ratio (SNR) is poor [shi17histo]. Energy minimization and optimization techniques minimize the internal energy within tumor areas for their accurate detection, but may lead to false detections for highly textured and heterogeneous tumor lesions. To overcome these limitations, existing software tools provide user-friendly interfaces for correcting the results obtained. This, however, results in losing the benefits of automation such as speed and reproducibility.

There has been much interest in developing algorithmic methods that adapt naturally to the dataset and perform feature discovery. One such popular class of learning or feature discovery methods includes those based on sparse representation-based classification (SRC) [wright09robust]. Many SRC methods have been successfully applied to a variety of histopathological image classification problems [srinivas14simul, vu16histo, sarkar18sdl, li20anal]. These methods are based on finding linear representations in the data. However, linear representations are almost always inadequate for representing non-linear structures of the data which arise in many practical applications. A recent class of learning-based methods involves the design of deep neural networks that can be trained to learn relevant features by themselves. Many deep learning methods have been developed for histopathological image classification [hou16patch, xu17large, tellez18whole, lin19fast, xing19pixel, campanella19clinical, wei19patho, valkonen20cyto]. The success of deep learning, however, has been fueled by the availability of generous and clean training data. When the training data is limited and/or noisy, as is often the case in medical imaging, these methods tend to show a performance degradation [goodfellow16deep]. Another class of learning-based approaches involves orthogonal transformation of the data, such as the principal component analysis (PCA) transform, to extract relevant features for image classification [chan15pcanet, bruna13invariant, shi17histo, dutta20sparse]. These learning-based approaches using orthogonal transformation explore the data distribution to preserve global structures in the data.

In this paper, we present a simple machine learning approach called the graph-based sparse principal component analysis (GS-PCA) network, which combines the local and global structures of all the data and is implemented in a deep learning framework to learn an explicit nonlinear mapping of the data for accurate detection and classification. We use the most basic and easy operations to emulate the processing stages in a typical (convolutional) neural network: First, graph-based sparse PCA filters are used as the data-adapting convolutional filter bank at each stage of the network. Next, we perform a simple binary quantization (hashing) that serves as the nonlinear stage, followed by block-wise histograms of the binary codes as the feature pooling stage to obtain the final output features of the network. Finally, we train a support vector machine (SVM) classifier on the output features of the network to obtain the final classification instead of the regular softmax classifier, as the softmax classifier is known to overfit [chan15pcanet]. For ease of reference, we call this data-processing network a Graph-based Sparse PCA Network (GS-PCANet). The key contributions of this paper are as follows:

  • Feature Extraction Using Graph-Based Sparse PCA: Unlike other histopathology image classification methods, in this work we propose a baseline neural network method called GS-PCANet, which is different from prior methods [bruna13invariant, chan15pcanet, shi17histo, dutta20sparse] in two aspects. 1) We include an additional sparsity promoting term in the PCA transformation so as to select more interpretable features from the images. 2) We include a graph regularization term in the objective function so as to preserve the local structures for each data point between the different classes.

  • Computationally Efficient Approach: Our proposed GS-PCANet is computationally efficient in comparison to other deep learning methods in two aspects. 1) We show that a simple two-stage network is good enough to extract all the relevant features for classifying the tumor versus healthy lung regions. 2) We do not need to learn the filter weights at each stage of the network.

We evaluate the proposed method and seven state-of-the-art algorithms developed for histopathology image classification on a dataset of 67 images provided by the Stefanie Galban Lab at the University of Michigan. The dataset consists of microscopy images of murine H&E stained lung sections and is divided into two categories: images of non-tumor-bearing control mice and images of mice with visible tumor.

Fig. 2: An outline of the proposed (two-stage) GS-PCANet.

II Principal Component Analysis

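Let $\mathbf{X}$ denote an $n \times p$ matrix of $n$ rows and $p$ columns of rank $r$, where $n$ is the number of data samples and $p$ is the number of features/variables. Let $x_{ij}$ denote the element of $\mathbf{X}$ at row $i$ and column $j$. Assume each column has zero mean. Let $\boldsymbol{\Sigma}$ denote the covariance matrix of $\mathbf{X}$, where $\boldsymbol{\Sigma}$ is a positive definite matrix of size $p \times p$, which can be decomposed as

$$\boldsymbol{\Sigma} = \sum_{k=1}^{p} \mu_k \mathbf{v}_k \mathbf{v}_k^{T},$$

where $\mu_k$ is the $k$-th largest eigenvalue of $\boldsymbol{\Sigma}$ and $\mathbf{v}_k$ is its associated eigenvector. PCA reduces the dimensionality of the data from $p$ to $q$ by replacing the original features/variables with linear combinations of the form $\mathbf{X}\mathbf{v}_k$, known as the principal components (PCs), which are obtained by maximizing their variance:

$$\mathbf{v}_k = \arg\max_{\|\mathbf{v}\|_2 = 1,\ \mathbf{v} \perp \mathbf{v}_1, \ldots, \mathbf{v}_{k-1}} \operatorname{Var}(\mathbf{X}\mathbf{v}),$$

where $\mathbf{v}_k$ is the $k$-th principal loading vector, the projection of the data $\mathbf{X}\mathbf{v}_k$ is the $k$-th principal component, and the operator $\operatorname{Var}(\cdot)$ denotes the (estimated) variance of a random variable.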

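Generally, PCA is computed using singular value decomposition (SVD) of $\mathbf{X}$ as

$$\mathbf{X} = \mathbf{U}\mathbf{S}\mathbf{V}^{T},$$

where the columns of $\mathbf{Z} = \mathbf{U}\mathbf{S}$ are the PCs, and the columns of $\mathbf{V}$ are the corresponding principal loading vectors (also known as basis vectors) [malladi20image]. The matrix $\mathbf{S}$ is a $p \times p$ diagonal matrix of ordered singular values $s_1 \geq s_2 \geq \cdots \geq s_p \geq 0$, and the columns of $\mathbf{U}$ and $\mathbf{V}$ are orthonormal such that $\mathbf{U}^{T}\mathbf{U} = \mathbf{V}^{T}\mathbf{V} = \mathbf{I}_p$. If $\mathbf{X}$ is low rank, it is possible to significantly reduce its dimensionality by using the $q$ most significant basis vectors. The projection of the data $\mathbf{X}$ upon the first $q$ basis vectors gives the PCs.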

An alternative formulation for PCA can be derived on the projection framework [chan15pcanet], where the PC loading matrix $\mathbf{V}$, also known as the PCA basis (defined as the matrix containing the principal loading vectors), can be estimated by solving the following least squares optimization problem:

$$\min_{\mathbf{A}} \ \|\mathbf{X} - \mathbf{X}\mathbf{A}\mathbf{A}^{T}\|_F^2 \quad \text{subject to} \quad \mathbf{A}^{T}\mathbf{A} = \mathbf{I}_q, \tag{3}$$

where $\|\cdot\|_F$ is the Frobenius norm, $\mathbf{A}$ is a $p \times q$ matrix whose columns form an orthonormal basis $\{\mathbf{a}_1, \ldots, \mathbf{a}_q\}$, and $\mathbf{I}_q$ is an identity matrix of size $q \times q$. The columns of $\mathbf{A}$ that minimize (3) are referred to as the PCA basis $\mathbf{V}$. The minimization is solved by formulating it as a least absolute shrinkage and selection operator (LASSO) problem [zou06sparse]. Each principal component is derived from a linear combination of all features, consequently making $\mathbf{V}$ non-sparse. We use this alternative formulation for PCA feature extraction in this work.
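As a minimal NumPy sketch of the SVD route to PCA described in this section (not the authors' code): the function name `pca_svd`, the synthetic data, and the choice of two retained components are illustrative assumptions. It also computes the least-squares reconstruction error of the projection formulation, which for ordinary PCA equals the sum of the squared discarded singular values.

```python
import numpy as np

def pca_svd(X, q):
    """PCA of an n x p data matrix X (rows = samples) via SVD.

    Returns the p x q loading matrix and the n x q principal components.
    """
    Xc = X - X.mean(axis=0)                     # give every column zero mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_q = Vt[:q].T                              # first q principal loading vectors
    Z_q = Xc @ V_q                              # principal components (equals U[:, :q] * s[:q])
    return V_q, Z_q

# Synthetic, purely illustrative data: 200 samples, 5 correlated features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
V_q, Z_q = pca_svd(X, q=2)

# Reconstruction error of the least-squares (projection) formulation.
Xc = X - X.mean(axis=0)
err = np.linalg.norm(Xc - Xc @ V_q @ V_q.T, "fro") ** 2
```

The loading vectors returned here are dense, which is the non-sparsity the section points out and the motivation for the sparse variant that follows.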

III Proposed Method

Based on the PCA methodology, we propose a simple and efficient machine learning method for histopathology image classification. First, we obtain graph-based sparse PCA filters from the training images as the data-adaptive convolutional filter bank for the various stages of a convolutional neural network. Then we perform a simple binary quantization (hashing), which serves as a nonlinear stage. Next, we use block-wise histograms of the binary codes obtained from the quantization process to get the output features of the network. Finally, we train an SVM classifier using the output features to obtain the final classification. The proposed GS-PCANet model is shown in Fig. 2, illustrating each of the above steps involved in our algorithm.

III-A Graph-Based Sparse PCA

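From the analysis of PCA in Section II, we can obtain a sparse PCA basis by including a regularization term in (3). Inclusion of a sparsity penalty reduces the number of features involved in each linear combination for obtaining the PCs. One way to extend (3) to obtain sparse basis vectors is by imposing $\ell_1$-norm and $\ell_2$-norm penalty constraints upon the regression coefficients (basis vectors) [zou06sparse]:

$$\min_{\mathbf{A}, \mathbf{B}} \ \|\mathbf{X} - \mathbf{X}\mathbf{B}\mathbf{A}^{T}\|_F^2 + \lambda \sum_{j=1}^{q} \|\boldsymbol{\beta}_j\|_2^2 + \sum_{j=1}^{q} \lambda_{1,j} \|\boldsymbol{\beta}_j\|_1 \quad \text{subject to} \quad \mathbf{A}^{T}\mathbf{A} = \mathbf{I}_q, \tag{4}$$

where $\boldsymbol{\beta}_j$ is the $j$-th column of $\mathbf{B}$, the same $\lambda$ (the regularization parameter of the $\ell_2$-norm) is used for all $q$ components, and different $\lambda_{1,j}$ (the regularization parameters of the $\ell_1$-norm) are allowed for penalizing the loadings of different PCs. The matrix $\mathbf{B}$ corresponds to the required sparse basis $\mathbf{V}$. The $\ell_1$-norm and $\ell_2$-norm regularization terms penalize the number of non-zero coefficients in $\mathbf{B}$, whereas the loss term simultaneously minimizes the reconstruction error $\|\mathbf{X} - \mathbf{X}\mathbf{B}\mathbf{A}^{T}\|_F^2$. If $\lambda$ and the $\lambda_{1,j}$ are zero, the problem reduces to finding the ordinary PCA basis vectors, equivalent to (3). When $\lambda_{1,j} > 0$, some coefficients of $\boldsymbol{\beta}_j$ are forced to zero, resulting in sparsity.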

The sparse PCA defined in (4) preserves the global structures in the data. In addition to preserving the global structures, we are interested in preserving the local structures, i.e., nearest neighbor (NN) preservation of each data point , as they help in identifying local features in the data. We define to be a constructed weighted graph. The vertices of correspond to the data points . The weight matrix is defined as


where the set contains the nearest neighbors to the node in the graph. Furthermore, the -norm is applied to measure the dissimilarity of two data points, and the weight matrix E is used to restrict the similarity between two data points. Thus, with the weight matrix E, we can formulate a graph regularization term as


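$$\frac{1}{2} \sum_{i,j} \|\mathbf{B}^{T}\mathbf{x}_i - \mathbf{B}^{T}\mathbf{x}_j\|_2^2 \, E_{ij} = \operatorname{Tr}\left(\mathbf{B}^{T}\mathbf{X}^{T}\mathbf{L}\mathbf{X}\mathbf{B}\right), \tag{6}$$

where $\mathbf{C}$ is a diagonal matrix with $C_{ii} = \sum_j E_{ij}$, $\mathbf{L}$ is the graph Laplacian matrix computed as $\mathbf{L} = \mathbf{C} - \mathbf{E}$, and $\operatorname{Tr}(\cdot)$ is the trace of a matrix. Minimizing the graph regularization term in (6) ensures that the local structures between the data points are preserved. Combining the sparse PCA from (4) and the graph regularization from (6), we propose a graph-based sparse PCA model,

$$\min_{\mathbf{A}, \mathbf{B}} \ \|\mathbf{X} - \mathbf{X}\mathbf{B}\mathbf{A}^{T}\|_F^2 + \lambda \sum_{j=1}^{q} \|\boldsymbol{\beta}_j\|_2^2 + \sum_{j=1}^{q} \lambda_{1,j} \|\boldsymbol{\beta}_j\|_1 + \gamma \operatorname{Tr}\left(\mathbf{B}^{T}\mathbf{X}^{T}\mathbf{L}\mathbf{X}\mathbf{B}\right) \quad \text{subject to} \quad \mathbf{A}^{T}\mathbf{A} = \mathbf{I}_q, \tag{7}$$

where $\gamma$ is a graph regularization parameter. To solve (7), we perform the following steps: first solve an ordinary PCA problem to fix $\mathbf{A}$, then formulate an elastic net with the fixed $\mathbf{A}$ and solve for $\mathbf{B}$, then perform SVD to update $\mathbf{A}$, and repeat these steps until convergence, finally obtaining the solution as the normalized columns of $\mathbf{B}$, $\mathbf{v}_j = \boldsymbol{\beta}_j / \|\boldsymbol{\beta}_j\|_2$.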
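The graph regularization term can be illustrated with a short NumPy sketch. This is a sketch under stated assumptions, not the paper's implementation: the heat-kernel edge weight, its bandwidth `sigma`, the symmetrization rule, and the function name `knn_graph_laplacian` are all hypothetical choices made here, since the exact weight definition is not recoverable from the text. The example builds a k-nearest-neighbor graph, forms the Laplacian as the degree matrix minus the weight matrix, and checks numerically that the weighted sum of pairwise distances between projected points equals the trace form of the regularizer.

```python
import numpy as np

def knn_graph_laplacian(X, k=3, sigma=1.0):
    """k-NN graph on the rows of X with (assumed) heat-kernel weights.

    Returns the weight matrix E, degree matrix C, and Laplacian L = C - E.
    """
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)   # squared l2 distances
    E = np.zeros((n, n))
    for j in range(n):
        d = d2[j].copy()
        d[j] = np.inf                        # a point is not its own neighbor
        nn = np.argsort(d)[:k]               # indices of the k nearest neighbors
        E[j, nn] = np.exp(-d2[j, nn] / (2.0 * sigma ** 2))
    E = np.maximum(E, E.T)                   # connect i and j if either is a neighbor of the other
    C = np.diag(E.sum(axis=1))
    return E, C, C - E

# Illustrative random data and an arbitrary basis B (not a learned one).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
B = rng.normal(size=(4, 2))
E, C, L = knn_graph_laplacian(X, k=3)

Z = X @ B                                    # projected data points
pairwise = 0.5 * sum(E[i, j] * np.sum((Z[i] - Z[j]) ** 2)
                     for i in range(30) for j in range(30))
trace_form = np.trace(B.T @ X.T @ L @ X @ B)
```

The two quantities `pairwise` and `trace_form` agree up to rounding, which is why the trace expression can serve as a penalty that keeps neighboring points close after projection.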

III-B Architecture of GS-PCA Network

Suppose there are $N$ training images $\{I_i\}_{i=1}^{N}$ of size $m \times n$, and assume that the PCA filter size is $k_1 \times k_2$ (formed by reshaping a basis vector of length $k_1 k_2$) at all stages of the network. The sparse PCA filters are learned from these training images. We describe each component of the network in detail below (see Fig. 2).

III-B1 First Stage (GS-PCA)

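For each training image $I_i$, around each pixel we take an image patch of size $k_1 \times k_2$ and denote all the overlapping image patches in the $i$-th image as $\mathbf{x}_{i,1}, \mathbf{x}_{i,2}, \ldots, \mathbf{x}_{i,mn} \in \mathbb{R}^{k_1 k_2}$, where $\mathbf{x}_{i,j}$ denotes the $j$-th vectorized image patch in $I_i$, $i = 1, \ldots, N$, $j = 1, \ldots, mn$. We then subtract the image patch mean from each of the image patches and obtain the centralized matrix of $I_i$ as $\bar{\mathbf{X}}_i = [\bar{\mathbf{x}}_{i,1}, \bar{\mathbf{x}}_{i,2}, \ldots, \bar{\mathbf{x}}_{i,mn}]$, where $\bar{\mathbf{x}}_{i,j}$ is the $j$-th mean-subtracted patch of $I_i$. By constructing a similar centralized matrix for each training image $I_i$, we obtain

$$\mathbf{X} = [\bar{\mathbf{X}}_1, \bar{\mathbf{X}}_2, \ldots, \bar{\mathbf{X}}_N] \in \mathbb{R}^{k_1 k_2 \times Nmn}. \tag{8}$$

Assuming that we have $L_i$ PCA filters in stage $i$, sparse PCA minimizes the reconstruction error within a family of orthonormal filters using (7), where $\mathbf{I}_{L_1}$ is an identity matrix of size $L_1 \times L_1$. The solution to the minimization problem in (7) is given by the $L_1$ principal eigenvectors of $\mathbf{X}\mathbf{X}^{T}$ [chan15pcanet]. The PCA filters can therefore be expressed as

$$\mathbf{W}_l^{1} = \operatorname{mat}_{k_1, k_2}\left(\mathbf{q}_l(\mathbf{X}\mathbf{X}^{T})\right) \in \mathbb{R}^{k_1 \times k_2}, \quad l = 1, 2, \ldots, L_1, \tag{9}$$

where $\operatorname{mat}_{k_1, k_2}(\cdot)$ is an operator that reshapes a column vector of length $k_1 k_2$ to a $k_1 \times k_2$ matrix and $\mathbf{q}_l(\mathbf{X}\mathbf{X}^{T})$ denotes the $l$-th principal eigenvector of $\mathbf{X}\mathbf{X}^{T}$. The principal eigenvectors capture the main variation of the centralized image patches in the training data. Similar to a convolutional neural network, we stack multiple stages of the sparse PCA filters to extract higher-level features.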
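The filter-learning step of this stage can be sketched as follows. To keep the example short, it substitutes ordinary PCA (principal eigenvectors of the patch scatter matrix) for the graph-based sparse PCA solver, so it illustrates the patch collection, mean removal, and reshaping steps rather than the full method. The helper names `extract_patches` and `learn_filters`, the random stand-in images, and the odd 3 x 3 patch size are assumptions made for illustration.

```python
import numpy as np

def extract_patches(img, k1, k2):
    """Collect the k1 x k2 patch around every pixel of a zero-padded image.

    Returns a (k1*k2) x (m*n) matrix with one vectorized patch per column.
    Assumes odd k1 and k2 so that each patch is centered on its pixel.
    """
    m, n = img.shape
    padded = np.pad(img, ((k1 // 2, k1 // 2), (k2 // 2, k2 // 2)))
    cols = [padded[r:r + k1, c:c + k2].ravel() for r in range(m) for c in range(n)]
    return np.stack(cols, axis=1)

def learn_filters(images, k1, k2, num_filters):
    """Plain-PCA stand-in for the filter bank: top eigenvectors of X X^T."""
    blocks = []
    for img in images:
        P = extract_patches(img, k1, k2)
        blocks.append(P - P.mean(axis=0, keepdims=True))    # subtract each patch's own mean
    X = np.concatenate(blocks, axis=1)                      # (k1*k2) x (N*m*n)
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)              # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :num_filters]                 # principal eigenvectors first
    return [top[:, l].reshape(k1, k2) for l in range(num_filters)]

# Random arrays stand in for grayscale training images.
rng = np.random.default_rng(0)
images = [rng.normal(size=(8, 8)) for _ in range(3)]
filters = learn_filters(images, k1=3, k2=3, num_filters=4)
```

Because every patch has its mean removed, each learned filter sums to approximately zero, so the filters respond to local structure rather than to overall brightness.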

III-B2 Second Stage (GS-PCA)

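We repeat the same process as in the first stage. Let the $l$-th filter output of the first stage be

$$I_i^{l} = I_i * \mathbf{W}_l^{1}, \quad i = 1, 2, \ldots, N, \quad l = 1, 2, \ldots, L_1, \tag{10}$$

where $*$ denotes 2D convolution and the boundaries of the images $I_i$ are zero padded before convolution. Similar to the first stage, we collect all the overlapping image patches of the convolved image $I_i^{l}$, subtract the patch mean from each patch, and obtain the centralized matrix $\bar{\mathbf{Y}}_i^{l} = [\bar{\mathbf{y}}_{i,l,1}, \bar{\mathbf{y}}_{i,l,2}, \ldots, \bar{\mathbf{y}}_{i,l,mn}]$, where $\bar{\mathbf{y}}_{i,l,j}$ is the $j$-th mean-subtracted image patch in $I_i^{l}$. We define $\mathbf{Y}^{l} = [\bar{\mathbf{Y}}_1^{l}, \bar{\mathbf{Y}}_2^{l}, \ldots, \bar{\mathbf{Y}}_N^{l}]$ as the matrix containing all the mean-subtracted patches of the $l$-th filter output and concatenate $\mathbf{Y}^{l}$ for all filter outputs as

$$\mathbf{Y} = [\mathbf{Y}^{1}, \mathbf{Y}^{2}, \ldots, \mathbf{Y}^{L_1}] \in \mathbb{R}^{k_1 k_2 \times L_1 N mn}. \tag{11}$$

Once again we solve (7) with $\mathbf{Y}$ as the input. The solution to the minimization problem in (7) is given by the principal eigenvectors of $\mathbf{Y}\mathbf{Y}^{T}$. The sparse PCA filters of the second stage are then obtained as

$$\mathbf{W}_\ell^{2} = \operatorname{mat}_{k_1, k_2}\left(\mathbf{q}_\ell(\mathbf{Y}\mathbf{Y}^{T})\right) \in \mathbb{R}^{k_1 \times k_2}, \quad \ell = 1, 2, \ldots, L_2. \tag{12}$$

For each input image $I_i^{l}$ of the second stage, there will be $L_2$ output images of size $m \times n$ generated as

$$\mathcal{O}_i^{l} = \left\{ I_i^{l} * \mathbf{W}_\ell^{2} \right\}_{\ell=1}^{L_2}. \tag{13}$$

After the second stage we will obtain $L_1 L_2$ output images. It is easy to repeat the above process to build more (sparse PCA) stages if a deeper architecture is needed.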

III-B3 Binary Quantization (Hashing)

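For each of the $L_1$ input images $I_i^{l}$ presented to the second stage we obtain $L_2$ real-valued output images $\{ I_i^{l} * \mathbf{W}_\ell^{2} \}_{\ell=1}^{L_2}$. We binarize these outputs and obtain $\{ H(I_i^{l} * \mathbf{W}_\ell^{2}) \}_{\ell=1}^{L_2}$, where $H(\cdot)$ is a Heaviside step (like) function, which has a value of 1 for positive entries and zero otherwise. Around each pixel, we view the vector of $L_2$ binary bits as a decimal number, thus converting the $L_2$ outputs in $\mathcal{O}_i^{l}$ into a single integer-valued “image”

$$\mathcal{T}_i^{l} = \sum_{\ell=1}^{L_2} 2^{\ell-1} H\left(I_i^{l} * \mathbf{W}_\ell^{2}\right), \tag{14}$$

which has pixel values in the range $[0, 2^{L_2} - 1]$.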
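The hashing step above can be sketched in a few lines of NumPy. The function name `binary_hash` and the three tiny 2 x 2 maps are made up for illustration; the code thresholds each real-valued map at zero and packs the resulting bits into one integer per pixel, with the first map contributing the least significant bit.

```python
import numpy as np

def binary_hash(outputs):
    """Pack a list of real-valued maps into one integer-valued map.

    Each map is binarized with a Heaviside-like step (1 for strictly positive
    entries, 0 otherwise); map number l (0-based) contributes the bit 2**l.
    """
    T = np.zeros(outputs[0].shape, dtype=np.int64)
    for l, O in enumerate(outputs):
        T += (1 << l) * (O > 0).astype(np.int64)
    return T

# Three illustrative filter responses for a 2 x 2 image.
maps = [np.array([[0.5, -1.0], [2.0, 0.0]]),
        np.array([[-0.1, 3.0], [1.0, -2.0]]),
        np.array([[1.0, 1.0], [-1.0, -1.0]])]
T = binary_hash(maps)
print(T)   # [[5 6]
           #  [3 0]]
```

With three maps, every pixel of the result lies between 0 and 7, so the hashed image can be histogrammed with eight bins.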

III-B4 Block-wise Histograms

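We partition each of the $L_1$ “images” $\mathcal{T}_i^{l}$, $l = 1, \ldots, L_1$, into $B$ distinct blocks, compute the histogram (with $2^{L_2}$ bins) of the decimal values in each block, and concatenate all $B$ histograms into a single vector, denoting it as $\operatorname{Bhist}(\mathcal{T}_i^{l})$. After such an encoding process the “feature” of the input image $I_i$ is then defined to be the set of block-wise histograms, i.e.,

$$\mathbf{f}_i = \left[ \operatorname{Bhist}(\mathcal{T}_i^{1}), \ldots, \operatorname{Bhist}(\mathcal{T}_i^{L_1}) \right]^{T} \in \mathbb{R}^{(2^{L_2}) L_1 B}. \tag{15}$$

We use overlapping blocks to build the feature vector for each input image, as this helps retain most of the information.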
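A minimal sketch of the block-wise histogram pooling, assuming a sliding window whose stride is smaller than the block size to produce overlapping blocks. The function name `block_histograms`, the block and stride values, and the toy 4 x 4 input are illustrative assumptions, not the settings used in the experiments.

```python
import numpy as np

def block_histograms(T, num_bins, block, stride):
    """Concatenate histograms of an integer-valued map over sliding blocks.

    T        : 2D array of non-negative integers smaller than num_bins
    block    : (height, width) of each block
    stride   : (vertical, horizontal) step; a stride smaller than the block
               size yields overlapping blocks
    """
    m, n = T.shape
    feats = []
    for r in range(0, m - block[0] + 1, stride[0]):
        for c in range(0, n - block[1] + 1, stride[1]):
            values = T[r:r + block[0], c:c + block[1]].ravel()
            feats.append(np.bincount(values, minlength=num_bins))
    return np.concatenate(feats)

# Toy hashed "image" with values in {0, 1, 2, 3} (i.e. two binary maps).
T = np.arange(16).reshape(4, 4) % 4
f = block_histograms(T, num_bins=4, block=(2, 2), stride=(1, 1))
```

Here the 4 x 4 map yields nine overlapping 2 x 2 blocks, so `f` has 36 entries; the full feature of an image would concatenate such vectors over all hashed maps.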

We train a linear support vector machine (SVM) classifier [cortes95support] using the feature vector obtained for each input image from the GS-PCANet in order to classify cancer lesions versus normal tissues on H&E stained histological lung slides.

III-C Classifying Color Images

There are several options to extend the proposed GS-PCANet method to be able to extract features for classifying color images. In this work, we follow the approach described in [gurcan09histo, chan15pcanet] and apply the proposed GS-PCANet to each of the red, blue, and green channels to obtain multichannel sparse PCA filters, that are then used to extract features for classifying the color images.

IV Experiments and Results

In this section we compare our proposed GS-PCANet image classification algorithm with other open-source histopathology image classification methods: the SpPCANet method for image classification [dutta20sparse], multiple clustered instance learning (MCIL) for histopathology image classification [xu14weakly], saliency-based dictionary learning (SDL) [sarkar18sdl], analysis-synthesis learning with shared features (ASLF) [li20anal], patch-based convolutional neural network (PCNN) [hou16patch], encoded local projections (ELP) for histopathology image classification [tizhoosh18represent], and weakly supervised deep learning (WSDL) for whole slide tissue classification [campanella19clinical]. We evaluate all methods using commonly used detection/classification measures: precision (P), recall (R), detection accuracy, $F_\beta$-score, Tanimoto coefficient (T), and the receiver operating characteristic (ROC) curves along with the area under the curve (AUC).

The precision P and recall R (a.k.a. true positive rate or sensitivity) are given by

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \tag{16}$$

where TP is the number of true positive classifications, FP is the number of false positive classifications, and FN is the number of false negative classifications. The false positive rate (a.k.a. complement of specificity) is defined as $FP/(FP + TN)$, where TN is the number of true negative classifications. An ROC curve is a plot of the true positive rate versus the false positive rate. The detection accuracy is defined as $(TP + TN)/(TP + TN + FP + FN)$.

The $F_\beta$-score is defined by

$$F_\beta = (1 + \beta^2) \frac{P \cdot R}{\beta^2 P + R}. \tag{17}$$

We use $\beta = 1$ (i.e., $F_1$) as this is the most common choice for this type of evaluation [ram16size].
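The count-based measures above can be computed directly from the four confusion counts. The following plain-Python sketch uses the standard definitions; the function name `detection_metrics` and the example counts are invented for illustration and are not results from this paper.

```python
def detection_metrics(tp, fp, fn, tn, beta=1.0):
    """Precision, recall, false positive rate, accuracy, and F-beta from counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                     # true positive rate / sensitivity
    fpr = fp / (fp + tn)                        # 1 - specificity
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_beta = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
    return {"precision": precision, "recall": recall, "fpr": fpr,
            "accuracy": accuracy, "f_beta": f_beta}

# Made-up counts for 1000 patches: 120 truly cancerous, 880 truly healthy.
m = detection_metrics(tp=90, fp=10, fn=30, tn=870)
```

For these counts the precision is 0.90, the recall is 0.75, the accuracy is 0.96, and the F1-score is about 0.818, showing how a method can look strong on accuracy while still missing a quarter of the lesions when the classes are imbalanced.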

Tanimoto coefficient, also known as Tanimoto distance in statistics, is defined as


where M is the number of detected individual tumors by an automated algorithm and N is the actual number of individual tumors in the image.

The AUC is the average of precision over the interval ($0 \leq R \leq 1$), where $P(R)$ is a function of recall R. It is given by

$$\mathrm{AUC} = \int_{0}^{1} P(R)\, dR. \tag{19}$$

The best detection algorithm among several alternatives is commonly defined as the one that maximizes the Tanimoto coefficient, AUC, and the $F_1$-score.

IV-A Dataset

The proposed method was mainly developed with the goal of identifying individual tumors in H&E stained whole slide histopathology lung images obtained from an inducible K-ras lung cancer model. The images were produced using a digital slide scanner (Super COOLSCAN 5000 ED Digital Slide Scanner; Nikon Corporation) with a objective lens (level- pixel size: ). In our experiments, the size of each image acquired is approximately pixels. Our dataset consists of a total of 67 whole slide histopathology lung images obtained from 32 non-tumor-bearing mice and 35 mice with visible tumors. A careful manual delineation of the borders of the individual tumors within the 35 images was performed by an expert and considered as ground truth for subsequent analysis. We divide each image in our dataset into non-overlapping image patches of size pixels consisting of a total of 52,487 cancer lesion patches and 1,455,023 normal patches.

IV-B Experimental Setup

We used a total of 15 non-tumor-bearing mice images and 15 images with visible tumors for training the compared algorithms, consisting of a total of 21,934 cancer lesion patches and 653,092 normal patches. Our test dataset consists of 17 non-tumor-bearing mice images and 20 images with visible tumors consisting of a total of 30,553 cancer lesion patches and 801,931 normal patches. The hyper-parameters of the GS-PCANet algorithm include the filter size (), the number of stages, the number of filters in each stage (), and the block size for the local histograms in the output stage. The optimal values for these parameters were automatically selected on a validation set (randomly chosen from within the training data), using the ROC curves by varying one parameter at a time while keeping the others fixed and choosing that value of the parameter that maximizes the AUC of the ROC curve. The parameters of the GS-PCANet were set to , , , and, a histogram block size of .

Fig. 3: Detection results on a representative image containing visible tumors in our test dataset using: (a) GS-PCANet, (b) SpPCANet, (c) MCIL, (d) SDL, (e) ASLF, (f) PCNN, (g) ELP, and (h) WSDL. The true borders of each individual tumor in the image, as delineated by an expert, are shown in blue; the true-positive patches identified by each method are shown in green; and the false positives of each method are bordered in red in the color version of this paper. False negatives are those regions within the blue-bordered individual tumors that are not shaded in green. Results on the entire image are shown in row 1, and two zoomed regions are shown in rows 2 and 3.
Fig. 4: ROC curve of image patch classification as cancerous or healthy for different methods.

IV-C Qualitative Results

Fig. 3 shows the qualitative detection results for an example image containing visible tumors from our test dataset. Fig. 3(a) shows that the proposed GS-PCANet method detects most of the tumor regions correctly with very few false positives and false negatives. Fig. 3(e) shows that the ASLF method is also able to identify the tumor regions well, but detects more false positives than the GS-PCANet method. The SpPCANet, MCIL, and WSDL methods have many misclassifications (with blood vessels being identified as tumors) as shown in Figs. 3(b), (c) and (h), respectively. The ELP method splits a single tumor into three tumors (see Fig. 3(g) row 3), with many false positives. The SDL, PCNN, and ELP methods miss large parts of individual tumors, i.e., have many false negatives as shown in Fig. 3(d), (f), and (g), respectively. Visually, the proposed GS-PCANet method detects both large and small individual tumors within the whole slide image with very few false positives and false negatives. This is of great significance for those studying oncogenesis, progression, and metastasis because the robustness of the algorithm to the size of the tumor reduces the likelihood that the algorithm will mislabel cases containing only small tumors.

Method                        Precision (P)   Recall (R)      F1-score        Tanimoto Coefficient (T)   Detection Accuracy   AUC
GS-PCANet                     0.872 (0.013)   0.955 (0.019)   0.912 (0.015)   0.903 (0.010)              0.908 (0.008)        0.951 ± 0.011
SpPCANet [dutta20sparse]      0.841 (0.019)   0.870 (0.025)   0.855 (0.022)   0.836 (0.014)              0.853 (0.015)        0.907 ± 0.017
MCIL [xu14weakly]             0.719 (0.022)   0.780 (0.015)   0.748 (0.031)   0.762 (0.019)              0.738 (0.026)        0.821 ± 0.013
SDL [sarkar18sdl]             0.752 (0.024)   0.850 (0.031)   0.798 (0.025)   0.801 (0.017)              0.785 (0.011)        0.849 ± 0.021
ASLF [li20anal]               0.811 (0.028)   0.900 (0.019)   0.853 (0.021)   0.829 (0.030)              0.845 (0.018)        0.903 ± 0.022
PCNN [hou16patch]             0.807 (0.039)   0.815 (0.031)   0.811 (0.032)   0.796 (0.023)              0.810 (0.024)        0.871 ± 0.039
ELP [tizhoosh18represent]     0.761 (0.023)   0.750 (0.018)   0.756 (0.021)   0.739 (0.027)              0.758 (0.023)        0.844 ± 0.014
WSDL [campanella19clinical]   0.798 (0.030)   0.785 (0.028)   0.823 (0.031)   0.821 (0.035)              0.818 (0.028)        0.882 ± 0.041
TABLE I: Mean Performance (and Standard Deviation) for Various Algorithms

IV-D Quantitative Results

We compared the quantitative performance of the automated methods both at the image patch level and for the task of individual tumor detection within an entire image. Fig. 4 shows the ROC curves of all automated methods at the image patch level on the test dataset. From Fig. 4, we observe that our proposed GS-PCANet method exhibits the most favorable trade-off between accurate detection and a low false positive rate in comparison to the other automated methods. Table I shows the quantitative performance of the compared methods for the task of individual tumor detection within the histopathology images in the test dataset. Table I shows that the detection accuracy of the proposed GS-PCANet method (0.908) is higher than that of every competing algorithm (the next best being SpPCANet at 0.853). From Table I, we also observe that the $F_1$-score and Tanimoto coefficient (T) of the proposed method are the highest among the compared algorithms. Table I also provides the AUC values and their 95% confidence intervals corresponding to the ROC curves in Fig. 4 for each method. We observe from the AUC values that the GS-PCANet method outperforms the alternatives. In addition to the metrics in Table I, we also computed the free-response receiver operating characteristic (FROC) curves [ram16size] for all the compared algorithms. Fig. 5 shows that the proposed GS-PCANet method has better detection accuracy compared to the other automated methods at all points along the FROC curve. This shows that the proposed method detects the individual tumors within these images better than the other compared methods.

Fig. 5: FROC curve of different methods for the individual tumor detection task within an entire image.

The confusion matrix corresponding to competing methods for our test dataset is provided in Table II. From Table II, we observe that our proposed GS-PCANet method outperforms competing dictionary learning methods as well as the deep learning methods. This success is attributed to the ability of our proposed GS-PCANet method to capture both the local and the global features associated with both normal and cancerous regions within the images, which the other compared methods do not address.

Fig. 6: Selection bias plot showing the distribution of detection accuracy over ten different training choices of image patches for the compared methods.
Class       Cancerous   Healthy   Method
Cancerous   87.21       12.79     GS-PCANet
Cancerous   84.06       15.94     SpPCANet [dutta20sparse]
Cancerous   71.89       28.11     MCIL [xu14weakly]
Cancerous   75.22       24.78     SDL [sarkar18sdl]
Cancerous   81.08       18.92     ASLF [li20anal]
Cancerous   80.69       19.31     PCNN [hou16patch]
Cancerous   76.14       23.86     ELP [tizhoosh18represent]
Cancerous   79.81       20.19     WSDL [campanella19clinical]
Healthy     04.97       95.03     GS-PCANet
Healthy     13.47       86.53     SpPCANet [dutta20sparse]
Healthy     24.04       75.96     MCIL [xu14weakly]
Healthy     17.24       82.76     SDL [sarkar18sdl]
Healthy     11.24       88.76     ASLF [li20anal]
Healthy     18.69       81.31     PCNN [hou16patch]
Healthy     24.63       73.37     ELP [tizhoosh18represent]
Healthy     16.04       83.96     WSDL [campanella19clinical]
TABLE II: Confusion Matrix (%)

IV-E Statistical Analysis

To investigate the robustness of each automated method to training selection bias, we obtain the detection performance for 10 different choices of training image patches (the number of training images was fixed), using the rest of the image patches as test image patches. The detection accuracies over the training runs were fit to a Gaussian probability density function (pdf) and plotted in Fig. 6. From Fig. 6, we observe that the mean of the proposed GS-PCANet curve is higher than that of the competing methods, indicating superior average detection accuracy. Even more crucially, the spread/variance of the GS-PCANet curve is smaller than that of its alternatives, indicating highly desirable robustness to the particular choice of training image patches.

Fig. 7: Comparison of the proposed GS-PCANet method and other state-of-the-art alternatives by a two-way ANOVA. Values reported by ANOVA (using MATLAB function anova2) across the methods are , indicating that the improved accuracy of the proposed GS-PCANet method is statistically significant. The intervals shown represent 95% confidence intervals of the detection accuracies for the proposed method (blue) and the competing methods (red).

We also performed a balanced two-way analysis of variance (ANOVA) [hogg87engine] on the detection accuracies in the selection-bias experiment for all the methods. Fig. 7 shows these comparisons using a post-hoc Tukey range test [hogg87engine]. Fig. 7 shows that the performance of the GS-PCANet method is significantly separated from its competing alternatives. -values of the proposed GS-PCANet method compared with other state-of-the-art methods are observed to be much less than , emphasizing the fact that the GS-PCANet method is more effective.

IV-F Computational Complexity

Here we show the computational complexity of the GS-PCANet method by considering a two-stage network. For each stage in the GS-PCANet, forming the mean-subtracted image patch matrix X has a computational complexity of ; the inner product in (9) has a complexity of ; the computational complexity of the eigendecomposition with graph regularization is . The sparse PCA filter convolution has a complexity of at stage . The block-wise histogram computation has a complexity of . With , , and assuming , the overall complexity of GS-PCANet is


The computational complexity in (20) applies to both the training and testing phases of GS-PCANet because the only extra computational burden during training is the eigendecomposition, which can be ignored when .

Method Training Time (HH:MM:SS) Run Time (Std. Dev.) in Sec.
GS-PCANet 00:21:09 11.14 (3.09)
SpPCANet [dutta20sparse] 00:20:53 15.21 (1.41)
MCIL [xu14weakly] 18:25:06 66.35 (14.36)
SDL [sarkar18sdl] 01:22:41 46.11 (4.51)
ASLF [li20anal] 01:49:27 19.39 (5.15)
PCNN [hou16patch] 19:27:55 39.47 (15.22)
ELP [tizhoosh18represent] 04:38:03 71.44 (9.40)
WSDL [campanella19clinical] 21:44:17 10.31 (6.02)
TABLE III: Mean Run Time (and Standard Deviation)
Fig. 8: Detection accuracy as a function of the number of training images for the competing methods.

We compared the mean inference run time, namely, the time required to classify all the image patches in a single test image, for each of the competing algorithms. Table III shows the mean and standard deviation of the run time each method takes to classify an entire image. From Table III, we observe that the proposed GS-PCANet method runs 0.83 seconds slower than the WSDL method on average, but is faster than all the other methods. The SDL and ASLF methods classify test image patches by reconstructing them from the learned dictionaries and thus take more time to execute at test time. The ELP algorithm computes the Radon transform of each test image patch at various orientations, thereby taking more time to classify each test image patch. The MCIL method integrates the clustering of multiple subtypes of a single class into the MIL classification framework, which increases its run time. In Table III we also report the time required to train each of the competing algorithms. The proposed GS-PCANet method and the SpPCANet method each take about 21 minutes to train, whereas the other methods take roughly 4 to 62 times longer to train a good model. The short training time of the GS-PCANet method is attributed to the low computational complexity of the method.
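The comparisons drawn from Table III can be checked with a few lines of arithmetic. The sketch below copies the training times and two of the mean run times from Table III and expresses each training time as a multiple of the GS-PCANet training time; the helper name `to_seconds` and the dictionary layout are illustrative choices.

```python
def to_seconds(hhmmss):
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return 3600 * hours + 60 * minutes + seconds


# Training times copied from Table III (HH:MM:SS).
training_time = {
    "GS-PCANet": "00:21:09",
    "SpPCANet": "00:20:53",
    "MCIL": "18:25:06",
    "SDL": "01:22:41",
    "ASLF": "01:49:27",
    "PCNN": "19:27:55",
    "ELP": "04:38:03",
    "WSDL": "21:44:17",
}
# Mean inference run times per image (seconds), also from Table III.
run_time = {"GS-PCANet": 11.14, "WSDL": 10.31}

base = to_seconds(training_time["GS-PCANet"])
ratios = {name: to_seconds(t) / base for name, t in training_time.items()}

# Print methods from fastest to slowest training, relative to GS-PCANet.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:10s} {ratio:6.2f}x the GS-PCANet training time")

gap = run_time["GS-PCANet"] - run_time["WSDL"]
print(f"GS-PCANet minus WSDL mean run time: {gap:.2f} s")
```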

IV-G Impact of the Number of Training Images

In this section, we show the practicality and applicability of the proposed GS-PCANet method in medical imaging tasks where very little data is available to learn from. Whereas in all other experiments we trained on 15 images from each of the two classes, in this experiment we varied the number of training images (from 1 to 20) for all the competing methods and computed their detection accuracy. Fig. 8 shows the detection accuracy of all the competing algorithms on the test dataset of 27 images (12 non-tumor images and 15 images with visible tumors). From Fig. 8, we observe that the proposed GS-PCANet method trained with as few as 8 images achieves a high detection accuracy of 91%, whereas the other methods achieve a maximum detection accuracy of only about 89% and require as many as 20 training images to do so. This shows that the proposed GS-PCANet method can produce a good model for image classification with less training data.

V Discussion and Conclusion

Tumor burden in histopathological sections is difficult to assess by manual evaluation, as well as by prior automated tumor detection algorithms. To address this problem, our proposed machine learning algorithm uses a cascaded graph-based sparse PCA transform followed by PCA binary hashing and block-wise histograms to obtain features within image patches. These features are then used to classify an image patch as cancerous or healthy using a linear SVM classifier. Our approach differs from earlier learning-based methods for histopathology image classification that rely on deep learning [hou16patch, campanella19clinical], instance learning [xu14weakly, tizhoosh18represent], or dictionary learning [sarkar18sdl, li20anal]. As in many deep learning methods, the network parameters, such as the number of stages, the filter size, and the number of filters, need to be optimized and fixed for our GS-PCANet method. Once these parameters are fixed, training the GS-PCANet is extremely simple and efficient because the filter learning in GS-PCANet requires neither regularized parameters nor numerical optimization solvers. Moreover, the GS-PCANet consists of only linear operations at each stage, with a non-linearity applied only at the output stage, which makes the method more interpretable than other deep learning methodologies.
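The binary hashing and block-wise histogram steps mentioned above can be sketched in a few lines. This is a simplified illustration that assumes the usual PCANet-style convention: each filter response map is thresholded at zero and used as one bit of an integer code, and the codes are then histogrammed over non-overlapping blocks. The bit ordering, the block layout, the function names, and the toy response values are all assumptions and may differ from the actual GS-PCANet implementation.

```python
def binary_hash(responses):
    """Combine L filter-response maps into one integer-valued map.

    responses[l][y][x] is the output of filter l at pixel (y, x). Each map is
    thresholded at zero and treated as bit l of the hashed value.
    """
    height, width = len(responses[0]), len(responses[0][0])
    hashed = [[0] * width for _ in range(height)]
    for bit, fmap in enumerate(responses):
        for y in range(height):
            for x in range(width):
                if fmap[y][x] > 0:
                    hashed[y][x] += 1 << bit
    return hashed


def block_histograms(hashed, block, num_filters):
    """Concatenate histograms (2**num_filters bins) of non-overlapping blocks."""
    bins = 1 << num_filters
    features = []
    for top in range(0, len(hashed) - block + 1, block):
        for left in range(0, len(hashed[0]) - block + 1, block):
            hist = [0] * bins
            for y in range(top, top + block):
                for x in range(left, left + block):
                    hist[hashed[y][x]] += 1
            features.extend(hist)
    return features


# Toy example: 2 filters applied to a 2 x 4 map (values are made up).
responses = [
    [[0.5, -0.2, 0.1, 0.3], [-0.4, 0.7, -0.1, 0.2]],
    [[-0.3, 0.6, 0.2, -0.5], [0.1, 0.9, -0.2, -0.6]],
]
hashed = binary_hash(responses)
features = block_histograms(hashed, block=2, num_filters=2)
print(hashed)
print(features)
```

The concatenated histogram vector plays the role of the patch feature that is passed to the linear SVM; the thresholding is the only non-linear operation in this sketch.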

The GS-PCANet method was first validated with respect to detection accuracy using ROC curves and the AUC of the ROC curve. Second, the algorithm was validated with respect to detection accuracy using the precision, recall, F-score, Tanimoto coefficient, FROC curves, and the confusion matrix. Tables I and II show that the proposed GS-PCANet method performs the best among the compared methods for histopathology image classification. Fig. 3 shows that the proposed GS-PCANet method qualitatively performs the best in comparison to the other methods. Further, Fig. 6 shows that the GS-PCANet method has superior average detection accuracy and is more robust to the choice of training images than the other methods. We also show the low computational complexity of the GS-PCANet method and compare the training and inference run times for all the methods. Table III shows that the GS-PCANet method learns a good model considerably faster than the other methods. Finally, Fig. 8 shows that the proposed method requires less data to learn a good model.
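For reference, the patch-level metrics named above can be computed from true-positive, false-positive, and false-negative counts as in the sketch below. It assumes the F-score is the balanced F1 score and that the Tanimoto coefficient is taken as the Jaccard index TP / (TP + FP + FN); the counts are hypothetical and are not taken from Tables I or II.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, F1 score, and Tanimoto coefficient from counts.

    Assumes tp > 0 so that none of the denominators is zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    tanimoto = tp / (tp + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "tanimoto": tanimoto,
    }


# Hypothetical patch counts, for illustration only.
metrics = detection_metrics(tp=90, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name:9s} = {value:.3f}")
```

Under these definitions the Tanimoto coefficient equals F / (2 - F), so it is always at most the F-score and penalizes errors more heavily.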

Fig. 9: Example of detection errors produced by our algorithm on an image with visible tumors. The true borders delineated by an expert of each individual tumor in the image are shown in blue, the true positive and the false positive image patches are shown in green and red, respectively, in the color version of this paper (the image is better viewed in zoomed mode).

Next, we present some inherent limitations of the automated methods for tumor detection. Fig. 9 shows an example of an image containing individual tumors where all algorithms, including ours, fail to produce optimal detection results. In Fig. 9 we observe that even though the algorithm has detected all the individual tumors, i.e., the true positive image patches shown in green, it has also detected many false positive image patches, shown in red. On close examination, we see that the false positive image patches within the image look very similar to cancerous image patches. This could be because the image does not have enough resolution to differentiate between the cancerous and healthy image patches, or because this histopathology section was captured when some of the underlying cells were transitioning from healthy to cancerous.

The proposed detection algorithm uses all the image patches in the training data to capture the local structure within the data when computing the graph-based term in (III-A) and (7). This adds to the time complexity and means that noisy and outlier image patches are still included. However, the algorithm can be modified by linearly clustering the image patches into subgroups and using the cluster centers to compute the graph term in (7). Making this change could further reduce detection errors and also accelerate the algorithm, making it more accurate and efficient at the same time.