A Whole Slide Image Grading Benchmark and Tissue Classification for Cervical Cancer Precursor Lesions with Inter-Observer Variability

by Abdulkadir Albayrak, et al.

Cervical cancer, which develops from precancerous lesions caused by the Human Papilloma Virus (HPV), is one of the cancers that can be prevented through periodic screening. Two grading conventions are widely accepted among pathologists, yet inter-observer variability remains an important issue for the final diagnosis. In this paper, a whole-slide image grading benchmark for cervical cancer precursor lesions is introduced. The papillae of the cervical epithelium and the overlapping-cell problem are handled, and a tissue classification method is developed and evaluated; it uses a novel morphological feature that exploits the relative orientation between the basal membrane (BM) and the major axis of each nucleus. In addition, the inter-observer variability is revealed by a thorough comparison of the pathologists' decisions with each other and with the final diagnosis.





Cervical cancer, which develops from precursor lesions, is one of the most commonly seen cancer types in the world and the 4th most common cause of cancer death among women [1]. Studies indicate that almost all cervical cancer cases are caused by the Human Papilloma Virus (HPV), which reaches the basal layer cells of the epithelium through micro-injuries in the cervical epithelium. The carcinogenic effect of the virus occurs when the HPV genome integrates with the cell genome [2, 3, 4]. This effect, which requires a certain period of time, appears as morphological changes in the cervical epithelium. These precancerous lesions, characterized by dysplastic changes, are called squamous intraepithelial lesions (SIL).

The impact of HPV on the cervical epithelium varies throughout the life cycle of the virus, which in turn results in different morphological changes. Early diagnosis is made possible by the analysis of these morphological structures [1, 5]. After infection by HPV, basal cells proliferate and the epithelium loses its maturation. Along with the loss of maturation, which results in polarity loss in the epithelium, cells show nuclear enlargement, nuclear irregularity, and hyperchromasia. The number of mitoses also increases with proliferation. The effect of viral proteins on the cytoskeleton produces cells with a characteristic perinuclear halo, named “koilocytes” (from the Greek word for "hollow"). These dysplastic changes are graded according to whether they are seen in the lower, middle, or upper part of the epithelium. They are reported as Cervical Intraepithelial Neoplasia (CIN) 1-3 in the CIN-based grading, and as LSIL (low-grade squamous intraepithelial lesion) or HSIL (high-grade squamous intraepithelial lesion) in the SIL-based grading [6, 7, 8]. Currently, the SIL-based grading is recommended, yet the CIN-based grading is also still in use.

The pathologic diagnosis of cervical biopsies varies depending on how the biopsy is taken, on artifacts introduced by laboratory steps, and on pathological interpretation. With the spread of women's health screening programs, cervical biopsies are diagnosed frequently, and this diagnostic variability has become a more important problem. Cervical biopsy interpretation has inter- and intra-observer variability: a biopsy may receive different diagnoses from different pathologists, or from the same pathologist at different times. This variability is accepted to an extent in the literature [9, 10, 11]. Studies have attempted to overcome the problem with classification systems suited to the nature of HPV or with the help of immunohistochemical techniques [12, 2, 7].

The increasing role of information technology (IT) in medicine has had a positive impact on pathology. Digital pathology includes diagnosis, education, consultation, archiving, and morphometric evaluation tools [13, 14]. Studies on morphometric analysis are available in the literature for different tissues and systems [15], as well as for cervical lesions [16, 17, 18, 19].

De et al. [16] studied image analysis methods on 62 digital images of cervical epithelium with lesions labeled Normal, CIN1, CIN2, and CIN3. The cervical regions are manually marked by the pathologist on selected epithelial images, and these regions are divided into vertical segments by calculating the medial axis. The resulting epithelial segments are examined in terms of structural, geometric, and profile-based properties. Contrast, energy levels, pixel correlation values, and neighborhood features of the pixels within the vertical segment are studied as the structural features. Geometric features include the distances between nuclei centers and the Delaunay triangulation. In the profile-based feature extraction, correlation values and the brightness values of all pixels in each row of the vertical segment are calculated. Linear Discriminant Analysis (LDA) and Support Vector Machines (SVM) are used to classify the feature vectors of the vertical segments. First, each vertical segment is classified individually; then these decisions are fused to obtain a classification result for the whole epithelium. The effect of the individual segment decisions on the whole-epithelium result is also examined. Three evaluation approaches are defined: one-to-one correspondence between the system result and the pathological diagnosis is named "Exact Class Label" (1st approach), a difference of only one class between the system result and the pathological diagnosis is named "Windowed Class Label" (2nd approach), and the coarsest comparison, which covers larger differences, is named "Normal versus CIN" (3rd approach). Different classification performances are calculated using the different approaches and features. Using all the structural, geometric, and profile-based features, a recognition rate of 62.3% on vertical segments and 39.3% on the whole epithelium is reached.

Guo et al. [17] developed enhanced image analysis methods on the cervix image data set formed by De et al. [16]. They increased the classification success by adding structural features of the nucleus and cytoplasm to the features extracted from vertical segments similar to those of De et al. [16]. These features consist of the nucleus, cytoplasm, and acellular areas and their ratios, the brightness values of the color channels (red, green, blue), and the numbers of triangles obtained by Delaunay triangulation in the upper, middle, and lower epithelium regions. The features are classified with the classification methods of the previous study. The name "Windowed Class Label" used for the second approach in the previous study is changed to "Off-By-One Class Label". Although the study uses the same data set as De et al., the images are examined by 2 different pathologists. The diagnostic value of the extracted features is determined by the Attribute Information Gain Ratio (AIGR) algorithm, and the features are evaluated according to two different classification approaches. By adding structural features of the cervical regions, the classification success is increased to 82-88.5%.

Figure 1: Processing steps followed in the proposed study. The first row describes the pathological preprocessing, which is handled in the pathology laboratory. The whole slide scanning and filing process shown in the second row is done by the medical researcher and the computer scientist in collaboration. The CADAS framework developed for grading cervical cancer precursor lesions is shown in the third row.

Wang et al. studied morphometric analysis methods on 31 digital images of cervical biopsies [19]. Their study consists of two steps: automated segmentation of the squamous epithelium and CIN classification. In the first step, the epithelium is segmented using the differences in the visual properties of five regions: squamous epithelium, columnar epithelium, stroma, background, and erythrocytes. After the epithelial region is segmented, the medial axis is drawn parallel to the basal and upper membrane borders. Square windows of fixed pixel dimensions are created along the normal lines of the medial axis. The average nucleus area, the number of nuclei, the average area of the triangles obtained by Delaunay triangulation, and the average edge length are analyzed within each window. The resulting feature vectors are fed to different classification methods, which reach accuracy rates ranging from 60% to 95%.

Keenan et al. analyzed 230 digital cervix images consisting of lesions labeled normal, koilocytosis, CIN1, CIN2, and CIN3 [20]. The nucleus area, the nucleus-cytoplasm ratio, and the edges and areas of the Delaunay triangles are analyzed. The Kappa value for the inter-observer difference between the two pathologists involved in the study is 0.415. The classification performance of the system in distinguishing normal and CIN lesions is 98.7%. The overall success rate is 62.3%, with the worst performance on patterns labeled CIN2.

Nagdhy et al. classified a total of 475 cervical biopsies labeled normal, CIN1, CIN2, CIN3, and invasive carcinoma using three different methods [18]. The nucleus area, nucleus-cytoplasm ratio, nucleus boundary irregularity, and areas of the Delaunay triangles are analyzed. Using a Gabor-based texture descriptor, a GLCM (gray-level co-occurrence matrix) texture descriptor, and a pre-trained convolutional neural network, they reached up to 97% specificity and 100% sensitivity.

In this study, morphometric analysis methods for cervical SIL diagnosis are investigated on a new digital cervical image data set. The numerical values of the morphological features that pathologists use in diagnosis are extracted, and the statistical significance of their contribution to the diagnosis is examined. Within the scope of the study, a Computer Aided Diagnostic Auxiliary System (CADAS) is developed and its performance is evaluated.

Contributions of this study are as follows:

  • A new whole slide image benchmark for grading cervical dysplasia is created and introduced to the histopathological image analysis community.

  • Images obtained from the data set are labeled by two pathologists in order to address the inter-observer variability in cervical dysplasia grading.

  • The pathologists independently diagnosed each hematoxylin and eosin (HE) stained image patch in the data set. In case of inconsistent diagnoses, the image patches stained with the p16 and Ki67 immunohistochemical dyes are analyzed to decide on a final diagnosis.

  • A morphometric analysis method for cervical SIL diagnosis is proposed.

  • The papillae present in the data set, which lead to tangential sections, are taken into account, since they are one of the important parameters that pathologists consider when diagnosing.

1 Materials and Methods

This study is conducted by a group of scientists and medical researchers. Cervical tissue slide samples with diagnosis results are collected in the pathology laboratory of Istanbul Medipol University (IMU) Hospital, Istanbul, Turkey. Fig. 1 shows the processing steps followed in this study for the proposed CADAS to grade cervical cancer precursor lesions.

1.1 Data Collection and Image Acquisition

Within the scope of the study, 127 high-resolution slides from 54 patients are scanned at the pathology laboratory of IMU Hospital. Fig. 2 shows whole slide images from the data set. The slides are stained with HE and with the Ki67 and p16 immunohistochemical dyes. Each slide is diagnosed after the pathologist splits it into smaller epithelial pieces, which yields 957 epithelial pieces in total. Of these 957 images, 471 are diagnosed as normal, 240 as CIN1, 107 as CIN2, and 139 as CIN3.

The images of the hematoxylin and eosin (HE), p16, and Ki67 preparations are acquired with an off-the-shelf whole slide scanner (see Fig. 3) that supports both optical and digital zoom. The whole slide images are transferred to the digital platform to be processed by several image processing techniques and interpreted by the expert pathologists. The images are saved in lossless TIFF format. Image sizes vary, and a single lesion may contain more than one diagnosis.

Figure 2: Image samples obtained from the data set. Sub-figures (a) and (d) show whole slide images stained with HE; sub-figures (b) and (e) show the same images stained with the Ki67 immunohistochemical dye; sub-figures (c) and (f) show the same images stained with the p16 immunohistochemical dye.

Figure 3: Image acquisition system. Tissue slides are scanned using a high-resolution scanner. The scanned slides are then transferred to a server, where the images are stored in a file system.

1.2 Ethics Statements

Authors confirm that all samples taken from patients were prepared in accordance with the legislation prepared by the Ministry of Health of Turkey and in accordance with international agreements & European Union standards. All experimental protocols were approved by the Istanbul Medipol University’s licensing committee. Informed consent was obtained from all subjects whose tissue samples were used in experiments. In tissue sample collection for data set, there were no subjects under 18.

1.3 Annotation and Image Labeling

A graphical user interface program is developed within the scope of this study so that pathologists can mark and label the basal membranes (BM) and papillae of the cervical epithelium. After the membranes and papillae are marked, the program extracts the hot spot region from the background. Fig. 4 shows an original input image taken from the data set and the clean image obtained after the coordinates of the epithelium are marked. Further image processing and analysis algorithms use this clean image.

First, two pathologists diagnosed each small epithelial piece (SEP) image patch independently. In case of disagreement, a final diagnosis is made by observing the same lesions stained with the p16 and Ki67 immunohistochemical dyes. According to the final diagnosis, 471 of the SEP (49.2%) are labeled as normal, 240 (25.1%) as CIN1, 107 (11.2%) as CIN2, and 139 (14.5%) as CIN3. Of the large epithelial pieces (LEP), 150 (46.9%) are labeled as normal, 79 (24.7%) as CIN1, 34 (10.6%) as CIN2, and 57 (17.8%) as CIN3 (see Table 1). The diagnostic distributions of the SIL-based grading are shown in Table 2: 471 of the SEP (49.2%) are normal, 240 (25.1%) are LSIL, and 246 (25.7%) are HSIL; 150 of the LEP (46.9%) are normal, 79 (24.7%) are LSIL, and 91 (28.4%) are HSIL.

        Normal  CIN1  CIN2  CIN3  Total
SEP     471     240   107   139   957
LEP     150     79    34    57    320
Table 1: Number of epithelium pieces in each class according to the CIN-based grading

        Normal  LSIL  HSIL  Total
SEP     471     240   246   957
LEP     150     79    91    320
Table 2: Number of epithelium pieces in each class according to the SIL-based grading
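
The relation between the two tables can be checked with a few lines of Python. This is a minimal sketch, assuming the mapping that the counts themselves imply (CIN1 to LSIL, CIN2 and CIN3 to HSIL); the names `CIN_COUNTS`, `CIN_TO_SIL`, `to_sil`, and `percentages` are hypothetical and not part of the study's software.

```python
# Final-diagnosis counts copied from Table 1 (CIN-based grading).
CIN_COUNTS = {
    "SEP": {"Normal": 471, "CIN1": 240, "CIN2": 107, "CIN3": 139},
    "LEP": {"Normal": 150, "CIN1": 79, "CIN2": 34, "CIN3": 57},
}

# Assumed mapping, implied by comparing Tables 1 and 2.
CIN_TO_SIL = {"Normal": "Normal", "CIN1": "LSIL", "CIN2": "HSIL", "CIN3": "HSIL"}


def to_sil(cin_counts):
    """Collapse CIN-based class counts into SIL-based class counts."""
    sil = {"Normal": 0, "LSIL": 0, "HSIL": 0}
    for cin_class, n in cin_counts.items():
        sil[CIN_TO_SIL[cin_class]] += n
    return sil


def percentages(counts):
    """Return each class share as a percentage rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


for piece_type, counts in CIN_COUNTS.items():
    sil_counts = to_sil(counts)
    print(piece_type, sil_counts, percentages(sil_counts))
```

Collapsing the CIN columns this way reproduces the SEP and LEP rows of Table 2, which is a quick consistency check on the reported class distribution.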
Figure 4: Annotation and hot spot region extraction. (a) Sample input image obtained from the data set; (b) extracted hot spot (region of interest) of the cervix for further analysis. The red and green lines drawn around the lesion represent the BM and the upper membrane (UM), respectively.

1.3.1 Inter-Observer Variability

Interpretations of the morphologic changes that represent dysplasia may differ between physicians, or for the same physician at different times. This variation is described as inter- and intra-observer agreement or disagreement. Artifacts associated with the biopsy procedure and tangential sections in the microscopic examination also contribute to it. Inter- and intra-observer agreement rates in the literature range between 0.20 and 0.47 [10, 11]. Compared with CIN-based grading, SIL-based grading provides higher inter- and intra-observer agreement rates. The highest diagnostic disagreement is reported for CIN2 and the lowest for CIN3. The disagreement rates are smaller between the normal and CIN1 groups. McCluggage et al. reported weak inter-observer agreement in the CIN-based grading, with a Kappa value of 0.2. In the SIL-based grading the Kappa value is 0.3, which is higher but still low. The failure to reach the expected high agreement rates is attributed to the pathologists involved in the study not being familiar with SIL-based grading. When the same test is repeated after the observers have used the SIL-based grading for six more months, the new Kappa values are 0.33 (intra-observer) and 0.47 (inter-observer). Galgano et al. tried to maximize the agreement rates between observers with p16 and Ki67 immunohistochemical methods (54). The Kappa value is 0.68 with immunohistochemical examination, whereas standard hematoxylin and eosin (HE) examination has a Kappa value of 0.47. That study states that the low agreement rates associated with diagnostic differences can be increased by using SIL-based grading rather than CIN-based grading, or by using immunohistochemical methods that aid diagnosis.

1.3.2 Final Diagnosis

Immunohistochemical examinations are used as an assistive method to reach the diagnosis when the morphological features cannot be interpreted clearly. p16, Ki67, and ProExC are the most widely used immunohistochemical markers for cervical precursor lesions [21, 22, 23, 24, 25]. The staining pattern is important in the immunohistochemical evaluation of p16: strong, block-like staining of at least 1/3 of the epithelium demonstrates an association with high-risk HPV (HrHPV) (7,56,58). Ki67 is an indicator of proliferation. Positivity may also be seen in other proliferating cells, such as inflammatory cells, as well as in keratinocytes. For this reason, it must be interpreted carefully in the presence of inflammation. ProExC is similar to Ki67 in that it indicates proliferation and has the same staining type. p16 and Ki67 are frequently used in routine practice. HrHPV-associated lesions show strong, block-like “nuclear” or “nuclear and cytoplasmic” staining with p16. Squamous metaplasia, atrophy, and reactive regenerative changes, which appear in the differential diagnosis of SIL, show a negative staining pattern. While Ki67 normally stains parabasal cells, in SIL positivity is also observed in higher epithelial layers in relation to the grade of dysplasia. Because reactive cells and inflammatory cells may also exhibit immunoreactivity, these tissues should be interpreted very carefully.

1.4 Morphometric Feature Extraction and Tissue Classification

In this study, a feature extraction method based on morphological analysis is used for grading cervical cancer precursor lesions. The processing steps followed in the study are shown in Fig. 1. The first row of the diagram describes the pathological preprocessing, which is handled in the pathology laboratory. The whole slide scanning and filing process shown in the second row is done by the medical researcher and the computer scientist in collaboration. This section describes the CADAS framework shown in the third row.

1.4.1 Creating the Small Epithelial Pieces (SEP)

In Fig. 5, the red line corresponds to the coordinates of the BM, while the green line corresponds to the coordinates of the upper membrane. Determining the basal and upper membrane coordinates makes it possible to know in which region of the epithelium the cells are located. After the region of interest (squamous epithelium) has been obtained, an interface developed within the scope of the study is used to divide the whole epithelium into SEP, which can be assumed to be equal in length (see Fig. 6).

The image patches analyzed in this study are shown in Fig. 6. The coordinates of the basal and upper membranes of the epithelium are marked by the pathology experts through a graphical interface. The coordinates of the papillae, which are shown in yellow, are also stored in separate files.

Figure 5: A high resolution histopathological image example obtained from the data set and the SEP cropped from that image.

1.4.2 Obtaining Cells by Simple Linear Iterative Clustering (SLIC) Superpixels Segmentation Algorithm

After the small epithelial pieces are obtained, the high-resolution histopathological images are ready for further analysis. First, a median filter is applied to the image to remove artifacts without affecting the boundaries. Then, cellular structures are obtained with the simple linear iterative clustering (SLIC) superpixel segmentation algorithm, which has not yet been widely used on histopathological images. This method segments the image based on the color similarities and neighborhood relations of its pixels [26]. The grid size is expressed as

$$N = \sqrt{\frac{W \cdot H}{K}} \tag{1}$$

where $K$ is the number of superpixels for a given input image, and $W$ and $H$ represent the width and height of the given image patch, respectively. The Euclidean distance of a pixel to the superpixel center in color space is

$$d_{rgb} = \sqrt{(R_j - R_i)^2 + (G_j - G_i)^2 + (B_j - B_i)^2} \tag{2}$$

where $j$ represents the pixel to be clustered and $i$ represents the center pixel. Here, $R$, $G$, and $B$ represent the brightness values of the red, green, and blue channels of the respective pixels. The RGB color space is used in this study instead of the Lab color space used by Achanta et al. [26]. Eq. 3 represents the distance between the coordinates of each pixel and the related cluster center:

$$d_{xy} = \sqrt{(x_j - x_i)^2 + (y_j - y_i)^2} \tag{3}$$

where $x_i$ and $y_i$ are the horizontal and vertical coordinates of each center pixel, and $x_j$ and $y_j$ are the coordinates of each pixel to be clustered. The two distances are combined as

$$D = d_{rgb} + \frac{m}{N}\, d_{xy} \tag{4}$$

The value of $D$ is the sum of the RGB distance and the (x, y) plane distance normalized by the grid interval $N$. The normalization keeps the coordinate distance from directly affecting the brightness range. The value of $m$ sets the compactness of the superpixels.

Figure 6: Sample SEP image patch that includes the basal membrane, the upper membrane, and papillae. The pathologists take these structures into consideration when grading. The red line, green line, and yellow circle mark the coordinates of the BM, the upper membrane, and the papillae, respectively.

Figure 7: Application of the SLIC superpixel segmentation algorithm to a sample image patch. (a) SEP image patch, (b) overlay of 3000 superpixels on the image patch, (c) resulting pre-segmented image obtained after applying the SLIC method.

Figure 8: Final segmentation result of a sample SEP image patch. (a) Given input image, (b) segmentation of the image, (c) final binary image after post-processing.

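Algorithm 1: Center Estimation and Distribution Generation

Input:  D, distance transform of the binary image
Output: C, cell center matrix

procedure Local Maxima Finding
    for each foreground pixel p in D do
        if D(p) is a local maximum of its neighborhood then
            add the coordinates of p to C
    return C

Input:  D, distance transform of the binary image
Output: M, multiplexed coordinate set

procedure Multiplexing
    for each foreground pixel p in D do
        M <- M (+) coordinates of p, repeated D(p) times      (+): multiplexing operator
    return M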

With the SLIC method applied in this study, the cellular structures become more compact and can be separated from the background when each SEP image is expressed with 3000 superpixels. A crucial point is that the size of a superpixel should not exceed the size of a cellular structure in a cervical precursor lesion. As can be calculated from Eq. 1, choosing at least 2000 superpixels guarantees that most superpixels do not exceed the size of a cellular structure. Fewer superpixels cause cells to overlap in the segmentation, and it becomes quite difficult to distinguish cellular structures, especially those close to the BM. Since the superpixels that represent cellular structures are darker than those that represent the background, eliminating the unwanted pixel groups (fat-like tissue) above a certain threshold level clears the background. At this stage, small artifacts that resemble cellular structures and some inflammation can remain with the cells as foreground. These structures can be eliminated with a morphological size operation applied to the binary image after the segmentation stage. The final segmentation result of a sample SEP image patch from the data set is shown in Fig. 8.
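
The superpixel-size argument above can be sketched numerically. The snippet below is an illustration under stated assumptions, not the study's implementation: the 1000 x 600 patch size, the compactness value, and the function names `grid_interval` and `slic_distance` are hypothetical, and the distance follows the standard SLIC form with RGB in place of Lab, as the text describes.

```python
import math


def grid_interval(width, height, n_superpixels):
    """Approximate side length (in pixels) of one superpixel on a regular grid."""
    return math.sqrt(width * height / n_superpixels)


def slic_distance(pixel, center, interval, compactness):
    """Combined color and spatial distance between a pixel and a cluster center.

    pixel and center are (r, g, b, x, y) tuples. The spatial term is scaled by
    compactness / interval so that it does not dominate the color term.
    """
    d_rgb = math.dist(pixel[:3], center[:3])
    d_xy = math.dist(pixel[3:], center[3:])
    return d_rgb + (compactness / interval) * d_xy


# Hypothetical patch size: more superpixels give a smaller grid interval.
for k in (1000, 2000, 3000):
    print(k, round(grid_interval(1000, 600, k), 2))
```

Comparing the printed grid interval with the expected size of a nucleus in pixels is one way to pick a superpixel count for a given scan resolution.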

1.4.3 Handling the Overlapping Cells Problem

Following the morphological operations, overlapping cells are separated. The problem of overlapping nuclei has been encountered in many nucleus segmentation studies in the literature, because the presence of overlapping cell structures significantly reduces the success of a CADAS. Solving the overlapping nucleus problem is therefore a significant contribution in this area. In our study, a large number of overlapping nucleus structures are observed after the segmentation process, especially around the BM.

Overlapping cells are densely present in the SEP image patches. The overlap must be resolved in order to obtain the morphological characteristics of the cell nuclei. Binary images segmented with the SLIC algorithm usually contain small, cell-like noisy parts. These unwanted small pixel groups are eliminated by an automatic method that clears pixel groups smaller than 50 pixels. The contours of the cells are also quite rough after the segmentation process. As shown in Fig. 9(a), a closing operation is applied to make the binary image more compact. To obtain the cell centers, the distance transform is applied and its local maxima are located, as shown in Fig. 9(c). The local maxima are estimated by using Algorithm 1.
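
The size-based cleanup described above can be sketched with a plain flood fill. This is a minimal illustration, assuming 4-connectivity and a binary mask stored as nested lists; the function name `remove_small_components` is hypothetical, and the source does not state which connectivity or library routine was used.

```python
from collections import deque


def remove_small_components(mask, min_size=50):
    """Return a copy of a binary mask without 4-connected blobs below min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one connected component starting at (r, c).
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Keep the component only if it reaches the size threshold.
            if len(component) >= min_size:
                for y, x in component:
                    cleaned[y][x] = 1
    return cleaned
```

On real whole-slide patches a library routine for connected-component labeling would be faster; the loop above only makes the 50-pixel rule explicit.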

The ellipse form models cell shapes well mathematically. The Gaussian Mixture Model (GMM) is therefore one of the best candidates for fitting ellipses over cell heaps. GMM is one of the most common algorithms for statistical data modeling [27]. The main purpose of the algorithm is to express a distribution as a weighted sum of Gaussian components (see Eq. (5)):

$$p(x) = \sum_{k=1}^{n} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k) \tag{5}$$

Here $\mathcal{N}(x \mid \mu_k, \Sigma_k)$ denotes the normal distribution with mean $\mu_k$ and covariance matrix $\Sigma_k$, and $\pi_k$ is its weight parameter.


Fig. 10 shows an example of overlapping nuclei and how the overlaps are resolved in a small patch of an image from the data set. The algorithm that resolves cell overlaps is based on finding the local maxima of the distance from the cell pixels to the boundaries, which give the cell centers. Once the cell centers are determined, the distance transform yields the multiplexing value for each pixel. The processing steps applied for center estimation, distribution generation, and multiplexing are given in Algorithm 1, where $C$ and $M$ refer to the row-wise cell centers matrix and the multiplexed coordinate set, respectively. The value $D(p)$ obtained from the distance transform for a pixel $p$ indicates how many times the corresponding coordinate is repeated in the set $M$. Because pixels farther from the border have higher distance values, they appear more often in $M$, and the resulting pixel distribution is more suitable for GMM. By applying GMM to the multiplexed coordinate distribution, suitable ellipses are obtained for each cell (see Fig. 11).
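
The multiplexing step can be illustrated on a toy mask. The sketch below is an assumption-laden illustration rather than the authors' code: it uses a brute-force Euclidean distance transform, rounds the distance to the nearest integer to obtain the repetition count, and assumes the mask contains at least one background pixel. The function names are hypothetical.

```python
import math


def distance_to_background(mask):
    """Brute-force Euclidean distance from each foreground pixel to the nearest background pixel."""
    rows, cols = len(mask), len(mask[0])
    background = [(r, c) for r in range(rows) for c in range(cols) if not mask[r][c]]
    dist = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                dist[r][c] = min(math.dist((r, c), b) for b in background)
    return dist


def multiplex(dist):
    """Repeat each foreground coordinate round(distance) times."""
    coords = []
    for r, row in enumerate(dist):
        for c, d in enumerate(row):
            coords.extend([(r, c)] * int(round(d)))
    return coords


# Toy example: a 3 x 3 blob inside a 5 x 5 mask.
toy = [[0] * 5 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        toy[r][c] = 1
print(multiplex(distance_to_background(toy)))
```

The interior pixel is repeated more often than the rim pixels, so the multiplexed set is denser near the cell center. In practice the distance transform would come from a library routine such as `scipy.ndimage.distance_transform_edt`, and the multiplexed coordinates would be passed to a GMM implementation such as scikit-learn's `GaussianMixture`; both are suggestions, not tools named in the source.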

1.4.4 Obtaining the Morphological Features of Each SEP

Figure 9: Cells taken from the tissues often overlap. To solve this problem, the cell centers must first be estimated correctly. (a) Binary mask of overlapping cell heaps, (b) distance transform of the binary mask, (c) local maxima (center estimation).

After cell segmentation and the elimination of the cell overlap problem, several morphological features of each cell are extracted. Table 3 lists the morphological features extracted for grading each SEP image patch in this study: average nucleus area (ANA), average cytoplasm area (ACA), nucleus-cytoplasm area ratio (NCR), nucleus perimeter (NP), border irregularity (BI), hyperchromasia index (HI), and polarity loss index (PLI), from the first row to the last. ANA is the average nucleus area, while ACA is the average cytoplasm area. NCR is the ratio of the nucleus area to the cytoplasm area. NP is the average length of the border pixels of the nuclei. BI is the ratio between the perimeter of the ellipse fitted to each nucleus and the perimeter of that nucleus. HI is calculated as the standard deviation of the pixel intensity values of the cellular structure. PLI is determined from the orientation of each cellular structure relative to the BM. All the features are extracted for each region shown in Fig. 12.


Figure 10: Separating overlapping cell after the segmentation process.

Since the emphasized morphological features change depending on their distance to the basal and the upper membranes, each image segment is divided into three main regions as represented in Fig. 12. Then, morphological features related to each region are extracted and stored for further analysis in grading the SEP image patch.

Feature: Description
Average nucleus area (ANA): the average nucleus area of each region
Average cytoplasm area (ACA): the area that remains after subtracting the total nucleus area from the total area of each region
Nucleus-cytoplasm ratio (NCR): the ratio of the total nucleus area to the total cytoplasm area in each region
Nucleus perimeter (NP): the length of the line that surrounds the nucleus in each region
Border irregularity (BI): the perimeter of the uniform ellipse fitted to the nucleus divided by the circumference of that nucleus
Hyperchromasia index (HI): the standard deviation value of the parabasal cells with respect to the cells of the same lesion
Polarity loss index (PLI): the average angle between the basal membrane and the major axis of all nuclei
Table 3: List of morphological features extracted in the proposed tissue classification method
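To make the polarity feature of Table 3 concrete, the following sketch computes the average angle between the major axes of a set of nuclei and the basal membrane. It assumes each nucleus is given as a list of (x, y) pixel coordinates, estimates the major axis as the principal component of those coordinates, and approximates the membrane locally by a single direction vector. These choices and all function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def major_axis_direction(coords):
    """Unit vector along the major axis of a nucleus given (N, 2) coordinates."""
    pts = np.asarray(coords, dtype=float)
    centered = pts - pts.mean(axis=0)
    # The eigenvector with the largest eigenvalue of the covariance matrix
    # points along the direction of greatest spread.
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered.T))
    return eigvecs[:, np.argmax(eigvals)]

def angle_to_membrane(coords, bm_direction):
    """Acute angle in degrees between the major axis and the BM direction."""
    axis = major_axis_direction(coords)
    bm = np.asarray(bm_direction, dtype=float)
    bm = bm / np.linalg.norm(bm)
    cos = abs(float(np.dot(axis, bm)))  # abs: an axis has no sign
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

def polarity_loss_index(nuclei, bm_direction):
    """Average angle over all nuclei of a region."""
    return float(np.mean([angle_to_membrane(c, bm_direction) for c in nuclei]))

# Two toy nuclei: one elongated along y, one elongated along x.
vertical = [(x, y) for x in range(2) for y in range(6)]
horizontal = [(x, y) for x in range(6) for y in range(2)]
pli = polarity_loss_index([vertical, horizontal], bm_direction=(1, 0))
print(pli)
```

With a horizontal membrane, the vertical nucleus contributes 90 degrees and the horizontal one 0 degrees, so the toy index is their mean.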

The data set of feature vectors is imbalanced. Different classifiers have been proposed in the literature for imbalanced data sets [28, 29, 30]. The weighted k-nearest neighbour (w-kNN) algorithm is one of these. In this study, the w-kNN algorithm is preferred because it is fast and practical to use. Another important selection criterion is that it gives successful results for imbalanced data sets [31, 32]. Like the k-NN algorithm, the w-kNN algorithm looks at the classes of the k closest neighbours. In addition, each neighbour is assigned a weight w, and the query sample is classified according to the accumulated weight of each class. The distance function is the Euclidean distance. If a neighbour sample is far from the query sample, its effect on the classification is weak, and vice versa.
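The voting rule can be sketched as follows. The exact weight function is not recoverable from the text, so this example assumes the common inverse-distance weight, w = 1 / (d + eps), with Euclidean distance d; the function name and the toy data are hypothetical.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, query, k=3, eps=1e-12):
    """Predict the class of `query` by distance-weighted voting of k neighbours."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    query = np.asarray(query, dtype=float)
    # Euclidean distance from the query to every training sample.
    dists = np.linalg.norm(X_train - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # Assumed weight: inverse distance, so far neighbours count less.
    weights = 1.0 / (dists[nearest] + eps)
    scores = {}
    for label, w in zip(y_train[nearest], weights):
        scores[str(label)] = scores.get(str(label), 0.0) + w
    return max(scores, key=scores.get)

# One close CIN1 sample outweighs two distant Normal samples.
X = [[0.1, 0.0], [2.0, 0.0], [0.0, 2.0]]
y = ["CIN1", "Normal", "Normal"]
pred = weighted_knn_predict(X, y, query=[0.0, 0.0], k=3)
print(pred)
```

In this toy case a plain majority vote would return Normal, while the weighted vote returns the class of the much closer neighbour, which is the behaviour the paragraph describes.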

2 Results

Table 4 shows the similarities and differences between the diagnoses given by the two pathologists for the SEP image sections in the data set with respect to CIN-based grading. Diagonal values are the numbers of images that received the same diagnosis from both pathologists. The agreement ratio of the pathologists in classifying the SEP images with respect to CIN-based grading is approximately 74% (709 of the 957 images lie on the diagonal).

(a) Uniform Distribution
(b) Normal Distribution
Figure 11: Generation of ellipses based on uniform and normal distribution after the estimation of center locations: The cell population modeled as the normal distribution is more suitable for the GMM algorithm than the uniform distribution.
                          Pathologist 2
Pathologist 1   Normal   CIN1   CIN2   CIN3   Total
Normal             354     37      0      0     391
CIN1                88    156      8      0     252
CIN2                13     52     71     15     151
CIN3                 4      9     22    128     163
Total              459    254    101    143     957
Table 4: Agreement/disagreement of the expert pathologists in the diagnosis of SEP with respect to CIN-based grading.
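The agreement ratio can be recomputed directly from the counts of Table 4. The sketch below also computes Cohen's kappa, a chance-corrected agreement statistic that is not reported in the source and is included here only as an illustrative addition.

```python
import numpy as np

# Rows: pathologist 1; columns: pathologist 2 (Normal, CIN1, CIN2, CIN3).
table4 = np.array([
    [354,  37,  0,   0],
    [ 88, 156,  8,   0],
    [ 13,  52, 71,  15],
    [  4,   9, 22, 128],
])

def observed_agreement(cm):
    """Share of images on the diagonal, i.e. identical diagnoses."""
    return np.trace(cm) / cm.sum()

def cohen_kappa(cm):
    """Agreement corrected for the agreement expected by chance."""
    n = cm.sum()
    p_o = np.trace(cm) / n
    p_e = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2
    return (p_o - p_e) / (1 - p_e)

print(f"observed agreement: {observed_agreement(table4):.3f}")
print(f"Cohen's kappa:      {cohen_kappa(table4):.3f}")
```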

The agreement/disagreement of pathologist 1 and pathologist 2 with the final diagnosis of each SEP image patch is presented in Table 5. A final diagnosis is determined whenever the pathologists disagree on a SEP image patch: the SEPs that are not given the same label by the pathologists are re-examined on sections of the same tissue stained with p16 and Ki67 immunohistochemical dyes.

CIN Grading                 P1                       P2                    Proposed
Final Diagnosis     N    C1    C2    C3      N    C1    C2    C3      N    C1    C2    C3
N                 377    74    18     2    424    44     3     0    386    61    17     7
C1                 13   176    37    14     26   193    16     5    116    91    19    14
C2                  0     2    87    18      8    17    79     3     13    30    47    17
C3                  1     0     9   129      1     0     3   135      4    15    18   102
Total             391   252   151   163    459   254   101   143    519   197   101   140
Table 5: Agreement/disagreement of pathologist 1, pathologist 2 and the proposed method with the final diagnosis in the diagnosis of SEP with respect to CIN-based grading.

Table 5 presents the agreement between the final diagnosis and the two pathologists with respect to the CIN-based grading system. The diagnoses of pathologist 2 are more compatible with the final diagnosis than those of pathologist 1. However, pathologist 1 is more consistent with the final diagnosis for CIN2 SEP image patches. An important observation from the table is that a large share of the disagreements fall into neighbouring ("windowed") classes: labelling CIN1 instead of Normal tissue; Normal or CIN2 instead of CIN1; CIN1 or CIN3 instead of CIN2; and CIN2 instead of CIN3.
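These observations can be checked against the counts of Table 5. The sketch below computes, for each pathologist, the overall agreement with the final diagnosis and the share of disagreements that are off by exactly one grade; the helper names are hypothetical.

```python
import numpy as np

# Rows: final diagnosis (N, C1, C2, C3); columns: the pathologist's label.
p1 = np.array([[377, 74, 18, 2], [13, 176, 37, 14], [0, 2, 87, 18], [1, 0, 9, 129]])
p2 = np.array([[424, 44, 3, 0], [26, 193, 16, 5], [8, 17, 79, 3], [1, 0, 3, 135]])

def accuracy(cm):
    """Share of patches whose label matches the final diagnosis."""
    return np.trace(cm) / cm.sum()

def adjacent_error_share(cm):
    """Fraction of mislabelled patches that are off by exactly one grade."""
    i, j = np.indices(cm.shape)
    errors = cm[i != j].sum()
    adjacent = cm[np.abs(i - j) == 1].sum()
    return adjacent / errors

for name, cm in (("P1", p1), ("P2", p2)):
    print(name, f"accuracy={accuracy(cm):.3f}",
          f"adjacent share of errors={adjacent_error_share(cm):.3f}")
```

For these counts, 153 of the 188 disagreements of pathologist 1 and 109 of the 126 disagreements of pathologist 2 are between neighbouring grades.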

Another system that pathologists rely on when diagnosing tissues is the SIL-based grading system. In this system, the CIN2 grade is treated as having two levels, CIN3-like and CIN1-like. CIN3 lesions and CIN2 lesions that resemble CIN3 are reported as HSIL; CIN1 lesions and CIN2 lesions that resemble CIN1 are reported as LSIL. The treatment of precursor lesions of cervical cancer differs between LSIL and HSIL. Table 6 presents the agreement between the final diagnosis and the two pathologists with respect to the SIL-based grading system. According to Table 6, pathologist 2 agrees with the final diagnosis more often than pathologist 1 for normal and LSIL patches (424 vs. 377 and 193 vs. 176), while pathologist 1 agrees more often for HSIL patches (243 vs. 220). If the agreement levels in Table 5 and Table 6 are compared, it can be observed that the pathologists make more consistent diagnoses in the SIL-based grading system.

Figure 12: (a) Sample image obtained from the data set and (b) the same image layered into three sections from the basal to the upper membrane. The distance of each pixel to the basal and upper membranes is calculated by using the coordinate information. The distance from the pixel coordinate to each membrane indicates the region to which the pixel belongs.
SIL Grading                P1                    P2
Final Diagnosis     N   LSIL   HSIL      N   LSIL   HSIL
N                 377     74     20    424     44      3
LSIL               13    176     51     26    193     21
HSIL                1      2    243      9     17    220
Total             391    252    314    459    254    244
Table 6: Agreement/disagreement of the pathologists with the final diagnosis in the diagnosis of SEP with respect to SIL-based grading.
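The pathologist counts in Table 6 coincide with what is obtained by merging the CIN2 and CIN3 rows and columns of Table 5 into a single HSIL class. This is an observation about the published numbers, not a statement of the authors' procedure, and the mapping in the sketch below (N, LSIL = CIN1, HSIL = CIN2 + CIN3) is an assumption made for illustration.

```python
import numpy as np

# Table 5 counts: rows are the final diagnosis, columns the pathologist's label.
p1_cin = np.array([[377, 74, 18, 2], [13, 176, 37, 14], [0, 2, 87, 18], [1, 0, 9, 129]])
p2_cin = np.array([[424, 44, 3, 0], [26, 193, 16, 5], [8, 17, 79, 3], [1, 0, 3, 135]])

def collapse_cin_to_sil(cm):
    """Merge a 4x4 CIN table (N, C1, C2, C3) into a 3x3 SIL table."""
    groups = [[0], [1], [2, 3]]  # assumed mapping: N, LSIL, HSIL
    out = np.zeros((3, 3), dtype=int)
    for a, rows in enumerate(groups):
        for b, cols in enumerate(groups):
            out[a, b] = cm[np.ix_(rows, cols)].sum()
    return out

p1_sil = collapse_cin_to_sil(p1_cin)
p2_sil = collapse_cin_to_sil(p2_cin)
print(p1_sil)
print(p2_sil)
print(np.trace(p1_sil) / p1_sil.sum(), np.trace(p2_sil) / p2_sil.sum())
```

Under this mapping the agreement with the final diagnosis rises for both pathologists relative to the four-class CIN tables, which matches the comparison drawn between Table 5 and Table 6.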

Table 7 presents the agreement of the final diagnosis with the proposed method and with the Delaunay Triangulation (DT) method, one of the best-known algorithms used for grading cervical cancer precursor lesions, with respect to the CIN-based grading system.

CIN Grading              Proposed                    DT
Final Diagnosis     N    C1    C2    C3      N    C1    C2    C3
N                 386    61    17     7    471     0     0     0
C1                116    91    19    14    240     0     0     0
C2                 13    30    47    17    107     0     0     0
C3                  4    15    18   102    139     0     0     0
Total             519   197   101   140    957     0     0     0
Table 7: Agreement/disagreement of the proposed method and Delaunay Triangulation (DT) with the final diagnosis in the diagnosis of SEP with respect to CIN-based grading.

Table 7 presents the agreement between the final diagnosis and the proposed method with respect to the CIN-based grading system. Normal and CIN3 SEP patches are classified most accurately, whereas the prediction performance for CIN1 and CIN2 SEP patches is lower than for Normal and CIN3. The overall classification accuracy of the proposed method, read from the diagonal of Table 7, is approximately 65% (626 of 957 patches). When the results of the proposed method are compared with those of the pathologists, it can be said that the CAD system developed in this study should be improved before it can be used as a secondary decision system to help pathologists predict the grade of cervical cancer precursor lesions.
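The per-class behaviour described above can be read off the proposed-method counts of Table 7. The following sketch computes recall, precision, and overall accuracy from those counts; the variable names are illustrative.

```python
import numpy as np

# Rows: final diagnosis; columns: proposed method's prediction (N, C1, C2, C3).
proposed = np.array([
    [386, 61, 17,   7],
    [116, 91, 19,  14],
    [ 13, 30, 47,  17],
    [  4, 15, 18, 102],
])
classes = ["N", "C1", "C2", "C3"]

recall = np.diag(proposed) / proposed.sum(axis=1)     # per final-diagnosis class
precision = np.diag(proposed) / proposed.sum(axis=0)  # per predicted class
accuracy = np.trace(proposed) / proposed.sum()

for name, r, p in zip(classes, recall, precision):
    print(f"{name}: recall={r:.3f} precision={p:.3f}")
print(f"overall accuracy={accuracy:.3f}")
```

Recall is about 0.82 for Normal and 0.73 for CIN3, but only about 0.38 for CIN1 and 0.44 for CIN2, which is the gap the paragraph refers to.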

Table 8 presents the agreement between the final diagnosis and the proposed method with respect to the SIL-based grading system. The classification performance of the proposed method is again lower than that of the pathologists. However, the proposed method performs better under the SIL-based grading system than under the CIN-based grading system.

SIL Grading             Proposed
Final Diagnosis     N   LSIL   HSIL
N                 381     57     33
LSIL              106     91     43
HSIL               16     27    203
Total             503    175    279
Table 8: Agreement/disagreement of the proposed and DT method with final diagnosis in diagnosis of SEP with respect to SIL-based grading.

3 Discussion

In this study, we aimed to translate pathologists' subjective evaluations of cervical dysplasia into numerical values and, as a result, to develop a ‘Computer Assisted Diagnostic Auxiliary System (CADAS)’. Our study on cervical dysplasia has the largest data set among the similar studies available in the literature. Furthermore, the fact that the diagnoses were given by two pathologists, and that inconsistent cases were reassessed to determine a definitive diagnosis, increased the reliability of the CADAS training set. The developed CADAS promises to be usable as an assistant system in the future, because its numerical values were found to be statistically significant and in parallel with the diagnostic parameters used by pathologists (such as the nucleus-to-cytoplasm ratio, nucleus boundary irregularity, polarity loss and hyperchromasia). The studies in the literature are mostly designed by engineers, and the contribution of pathologists is very limited. For this reason, they have some shortcomings from the perspective of pathology and clinical practice. In the development of a CADAS to be used in pathology, the involvement of pathologists at every step is a necessity.

4 Conclusion

In this paper, we present a new benchmark data set of cervical cancer precursor lesions, which we make available to the scientific community for grading cervical intraepithelial neoplasia. Each image in the data set is labeled by two expert pathologists to reveal the inter-observer variability. In cases of differing diagnoses, p16 and Ki67 immunohistochemical dyes are used to decide the final diagnosis (ground truth). The data set also contains papilla areas, which seriously affect the performance of automated methods; as far as we know, this makes the study unique. A feature extraction method based on morphological analysis is also proposed for grading cervical cancer precursor lesions. The results of the method are compared with each expert pathologist and with the ground truth. They show that CAD systems could be used as a secondary decision system for experts after some improvement. We aim to improve the classification performance of our CAD system by developing up-to-date image processing and machine learning algorithms, especially deep learning methods.

5 Data Availability

The materials and datasets generated and/or analysed during the current study are available from http://simplab.yildiz.edu.tr/?q=sources on reasonable request.

Conflict of Interest

All authors declare that they have no conflict of interest.

Competing Interests

The authors declare no competing interests.


The authors would also like to thank Argenit Company and Istanbul Medipol University Hospital for providing and annotating the whole slide histopathological images of the cervical cancer precursor lesions data set. The authors state no conflict of interest and have nothing to disclose.


  • [1] Torre, L. A. et al. Global cancer statistics, 2012. CA Cancer J. Clin. 65, 87–108 (2015).
  • [2] Stoler, M. H. Human papillomaviruses and cervical neoplasia: a model for carcinogenesis. Int. J. Gynecol. Pathol. 19, 16–28 (2000).
  • [3] Van Zummeren, M. et al. Hpv e4 expression and dna hypermethylation of cadm1, mal, and mir124-2 genes in cervical cancer and precursor lesions. Modern Pathology 1 (2018).
  • [4] zur Hausen, H. Papillomaviruses in the causation of human cancers - a brief historical account. Virology 384, 260–265 (2009).
  • [5] Stoler, M. H. Advances in cervical screening technology. Mod. Pathol. 13, 275–284 (2000).
  • [6] Cox, J. T., Wilkinson, E. J. & O’connor, D. M. Historical perspective: terminology for lower anogenital tract pathology. AJSP: Reviews & Reports 18, 158–167 (2013).
  • [7] Darragh, T. M. et al. The lower anogenital squamous terminology standardization project for hpv-associated lesions: background and consensus recommendations from the college of american pathologists and the american society for colposcopy and cervical pathology. Arch. Path. Lab. Med. 136, 1266–1297 (2012).
  • [8] Mitra, A. et al. Cervical intraepithelial neoplasia disease progression is associated with increased vaginal microbiome diversity. Scientific reports 5, 16865 (2015).
  • [9] Stoler, M. H. & Schiffman, M. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL Triage Study. JAMA 285, 1500–1505 (2001).
  • [10] McCluggage, W. et al. Interobserver variation in the reporting of cervical colposcopic biopsy specimens: Comparison of grading systems. J. Clin. Path. 49, 833–835 (1996).
  • [11] McCluggage, W. et al. Inter- and intra-observer variation in the histopathological reporting of cervical squamous intraepithelial lesions using a modified Bethesda grading system. Br. J. Obstet. Gynaecol. 105, 206–210 (1998).
  • [12] Doorbar, J. Papillomavirus life cycle organization and biomarker selection. Dis. Markers 23, 297–313 (2007).
  • [13] Al-Janabi, S., Huisman, A. & Van Diest, P. J. Digital pathology: current status and future perspectives. Histopathology 61, 1–9 (2012).
  • [14] Madabhushi, A. & Lee, G. Image analysis and machine learning in digital pathology: Challenges and opportunities. Med. Image Anal. 33, 170 – 175 (2016).
  • [15] He, L., Long, L. R., Antani, S. & Thoma, G. R. Histology image analysis for carcinoma detection and grading. Comput. Methods Programs Biomed. 107, 538–556 (2012).
  • [16] De, S. et al. A fusion-based approach for uterine cervical cancer histology image classification. Comput. Med. Imaging. Graph. 37, 475–487 (2013).
  • [17] Guo, P. et al. Nuclei-based features for uterine cervical cancer histology image analysis with fusion-based classification. IEEE J. Biomed. Health Inform. 20, 1595–1607 (2016).
  • [18] Naghdy, G., Ros, M. B., Todd, C. et al. Computer aided decision support system for cervical cancer classification. In Applications of Digital Image Processing XXXV, vol. 8499, 849919 (SPIE, 2012).
  • [19] Wang, Y., Crookes, D., Eldin, O. S., Wang, S., Hamilton, P. & Diamond, J. Assisted diagnosis of cervical intraepithelial neoplasia (CIN). IEEE J. Sel. Top. Signal. Process. 3, 112–121 (2009).
  • [20] Keenan, S. et al. An automated machine vision system for the histological grading of cervical intraepithelial neoplasia (cin). J. Pathol. 192, 351–362 (2000).
  • [21] de Melo, F., Lancellotti, C. & da Silva, M. Expression of the immunohistochemical markers p16 and ki-67 and their usefulness in the diagnosis of cervical intraepithelial neoplasms. Rev. Bras. Ginecol. Obstet. 38, 82–87 (2016).
  • [22] Galgano, M. T. et al. Using biomarkers as objective standards in the diagnosis of cervical biopsies. Am. J. Surg. Path. 34, 1077 (2010).
  • [23] Guo, M. et al. Efficacy of p16 and proexc immunostaining in the detection of high-grade cervical intraepithelial neoplasia and cervical carcinoma. Am. J. Clin. Pathol. 135, 212–220 (2011).
  • [24] Lim, S., Lee, M., Cho, I., Hong, R. & Lim, S. Efficacy of p16 and ki-67 immunostaining in the detection of squamous intraepithelial lesions in a high-risk hpv group. Oncol. Lett. 11, 1447–1452 (2016).
  • [25] Ozaki, S., Zen, Y. & Inoue, M. Biomarker expression in cervical intraepithelial neoplasia: potential progression predictive factors for low-grade lesions. Hum. Pathol. 42, 1007–1012 (2011).
  • [26] Achanta, R. et al. Slic superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern. Anal. Mach. Intell. 34, 2274–2282 (2012).
  • [27] Najar, F., Bourouis, S., Bouguila, N. & Belghith, S. A comparison between different gaussian-based mixture models. In 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications, AICCSA’17, 704–708 (IEEE, 2017).
  • [28] Mazurowski, M. A. et al. Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance. Neural Netw. 21, 427–436 (2008).
  • [29] Hong, X., Chen, S. & Harris, C. J. A kernel-based two-class classifier for imbalanced data sets. IEEE Trans. Neural Netw. 18, 28–41 (2007).
  • [30] Zhuang, L. & Dai, H. Parameter optimization of kernel-based one-class classifier on imbalance learning. J. Comput. 1, 32–40 (2006).
  • [31] Liu, W. & Chawla, S. Class confidence weighted knn algorithms for imbalanced data sets. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 345–356 (Springer, 2011).
  • [32] Zuo, W., Lu, W., Wang, K. & Zhang, H. Diagnosis of cardiac arrhythmia using kernel difference weighted knn classifier. In Computers in Cardiology, 253–256 (IEEE, 2008).

Author contributions statement

A.A., A.U. and N.C. designed and performed the experiments; G.B., L.D.A. and B.U.T. were involved in planning and supervised the work; A.C., I.T. and B.M. contributed to the design and implementation of the research. All authors reviewed the manuscript.