Neural Stain-Style Transfer Learning using GAN for Histopathological Images

10/23/2017 ∙ by Hyungjoo Cho, et al. ∙ Korea University

The performance of a data-driven network for tumor classification varies with the stain-style of histopathological images. This article proposes a stain-style transfer (SST) model based on conditional generative adversarial networks (GANs), which learns not only a particular color distribution but also the corresponding histopathological pattern. Our model uses a feature-preserving loss in addition to the well-known GAN loss. Consequently, our model not only transfers initial stain-styles to the desired one but also prevents the degradation of the tumor classifier on transferred images. The model is evaluated on the CAMELYON16 dataset.




1 Introduction

Deep learning based image recognition has received a lot of attention due to its notable applications in digital histopathology, including automatic tumor classification. Convolutional neural networks (CNNs) have recently achieved state-of-the-art performance in image classification and detection and, in particular, have replaced traditional rule-based methods in several medical image diagnosis contests LeCun et al. (2015); Wang et al. (2016). Such a data-driven approach depends heavily on the quality of the training dataset and hence requires sensible preprocessing. In histopathology, staining, e.g. with haematoxylin and eosin (H&E), is essential for examining the microscopic presence and characteristics of disease, not only for pathologists but also for neural networks. For digital histopathology, several stain normalization methods are well known Ruifrok et al. (2001); Reinhard et al. (2001); Ruifrok et al. (2003); Annadurai (2007); Magee et al. (2009); Macenko et al. (2009); Khan et al. (2014); Li and Plataniotis (2015); Bejnordi et al. (2016).

Figure 1: Samples of tissue tiles from different institutes in the CAMELYON16 cam (2016) and CAMELYON17 cam (2017) datasets. The first row shows normal samples, and the second row shows tumor samples. Samples from institutes 1 and 2 are taken from the CAMELYON16 dataset cam (2016); the others are taken from the CAMELYON17 dataset cam (2017).

Standard stain normalization algorithms are based on stain-specific color deconvolution Ruifrok et al. (2001). Stain deconvolution requires prior knowledge of the reference stain vectors for every dye present in the whole-slide images (WSIs). Ruifrok et al. (2001) suggested a manual approach that estimates the color deconvolution vectors by selecting representative sample pixels from each stain class, and a similar approach was used in Magee et al. (2009) for extracting stain vectors. Such manual estimation of stain vectors, however, strongly limits their applicability in large studies. Khan et al. (2014) modified Magee et al. (2009) by estimating stable stain matrices using an image-specific color descriptor. Combined with a robust color classification framework based on a variety of training data from a particular stain with nonlinear channel mappings, the method ensured smooth color transformation without introducing visual artifacts. Another approach, used in Bejnordi et al. (2016), transforms the chromatic and density distributions for each individual stain class in the hue-saturation-density (HSD) color model. See Bejnordi et al. (2016) and references therein.
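To make the role of the reference stain vectors concrete, the following is a minimal sketch of color deconvolution in pure Python. It assumes the Beer-Lambert relation between pixel intensity and optical density, and it uses stain vectors for haematoxylin, eosin and DAB that are commonly quoted for this method; those values, and all function names, are illustrative assumptions rather than the settings of any method cited above.

```python
import math

# Assumed reference stain vectors (rows: haematoxylin, eosin, DAB) given as
# optical-density directions in RGB. Illustrative values only.
STAIN_VECTORS = [
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    d = det3(a)
    solution = []
    for k in range(3):
        replaced = [[b[r] if c == k else a[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det3(replaced) / d)
    return solution


def rgb_to_od(rgb, white=255.0):
    """Beer-Lambert: optical density per channel (intensities clipped at 1)."""
    return [-math.log10(max(v, 1.0) / white) for v in rgb]


def unmix_stains(rgb):
    """Estimate the amount of each stain that explains one RGB pixel."""
    od = rgb_to_od(rgb)
    # od[ch] = sum_k concentration[k] * STAIN_VECTORS[k][ch]
    mixing = [[STAIN_VECTORS[k][ch] for k in range(3)] for ch in range(3)]
    return solve3(mixing, od)


def mix_stains(concentrations, white=255.0):
    """Inverse operation: render stain amounts back into an RGB pixel."""
    od = [sum(concentrations[k] * STAIN_VECTORS[k][ch] for k in range(3))
          for ch in range(3)]
    return [white * 10 ** (-v) for v in od]


example_rgb = mix_stains([0.8, 0.3, 0.0])   # mostly haematoxylin, some eosin
recovered = unmix_stains(example_rgb)        # approximately [0.8, 0.3, 0.0]
```

The sketch shows why the stain vectors must be known in advance: `unmix_stains` can only invert the mixing if `STAIN_VECTORS` matches the dyes actually present in the slide.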

Stain normalization methods for histopathological images have been studied extensively, yet challenging issues remain. Most conventional methods use various thresholds to filter out backgrounds and other irrelevant dimensions. However, these methods cannot represent the broad feature distribution of the entire target image, so they require manual tuning of hyper-parameters such as thresholds. Furthermore, since nuclei detection has a significant impact on the performance of color normalization, good performance is unlikely if the nuclei detection stage makes mistakes. Finally, although the major aim of most conventional approaches is to improve the prediction performance of a classification system, the stain normalization method and the classifier work separately. It has been reported that network performance varies across institutes even when they apply the same staining method Ciompi et al. (2017). To prevent such variation, a domain adaptation method is needed.

In this paper, we propose a novel stain-style transfer method using deep learning, together with a special loss function that minimizes the difference between the latent features of the input image and those of the target image and thereby preserves the performance of the classifier. We implement the proposed stain-style generator as a fully convolutional network (FCN) Long et al. (2015), which learns the color distribution of the dataset used to train the tumor classifier.

Our contributions are twofold. First, we replace color normalization methods with a generative model which learns the stain-style distribution of a dataset. Second, we introduce a feature-preserving loss so that the classifier extracts better features from the transferred images than it does with other methods.

2 Stain-Style Transfer with GAN

2.1 Stain-Style of Dataset

In this section, we summarize relevant material on our model. Let $\mathcal{I}$ be a set of institutes and let $(X, Y)$ be the dataset of histological samples $X$ and the corresponding labels $Y$. The class of stained images, or color images with RGB channels, denoted by $\mathcal{C}$, is defined to be the set of $3 \times H \times W$ arrays of pixel intensities. Under this setting, we define the stain-style of institute $i \in \mathcal{I}$ to be a random variable $S_i$, which maps a histological sample to its stained image in $\mathcal{C}$, with a probability distribution $p_i$. Since $S_i$ admits a conditional probability distribution given the label $Y$, it makes sense to speak of different stain-styles for images with the same label.

2.2 Tumor Classifier Network

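Suppose we have trained a tumor classifier network $f$ which infers the histological pattern of an input image $X$. We write $f_i$ if the classifier is trained specifically on a dataset which follows the stain-style $S_i$ of institute $i$. We estimate the performance of $f_i$ by

$$\mathbb{E}\big[\ell(f_i(X), Y)\big], \tag{1}$$

where $\ell$ is a loss function for classification, e.g. cross-entropy, and $Y$ is the label of $X$. In practice, the classifier learns from stained images $S_i(X)$ rather than from $X$. Hence one can decompose the classifier as $f_i = \hat{f}_i \circ S_i$, where $\hat{f}_i$ is the actual network, trained on a dataset with stain-style $S_i$. In this case, we estimate (1) by

$$\mathbb{E}\big[\ell(\hat{f}_i(S_i(X)), Y)\big]. \tag{2}$$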
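In practice, an expected classification loss of this kind is estimated by averaging the loss over a finite set of stained images and labels. The following is a minimal sketch of that estimate with cross-entropy as the loss; `toy_classifier` is a hypothetical stand-in for a trained network, and all names are assumptions made for illustration.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability that the classifier assigns to the true class."""
    return -math.log(probs[label])


def empirical_risk(classifier, stained_images, labels):
    """Average loss of `classifier` over (stained image, label) pairs."""
    losses = [cross_entropy(classifier(image), label)
              for image, label in zip(stained_images, labels)]
    return sum(losses) / len(losses)


def toy_classifier(image):
    """Hypothetical stand-in for a trained network.

    Predicts "tumor" (class 1) with probability equal to the mean pixel
    intensity, clipped away from 0 and 1 so the logarithm stays finite.
    """
    p_tumor = min(max(sum(image) / len(image), 1e-6), 1.0 - 1e-6)
    return [1.0 - p_tumor, p_tumor]


images = [[0.9, 0.8], [0.1, 0.2]]   # two tiny "images" with two pixels each
labels = [1, 0]                     # 1 = tumor, 0 = normal
risk = empirical_risk(toy_classifier, images, labels)
```

Evaluating the same classifier on images drawn from a different stain-style and comparing the two averages is how a performance gap between institutes would show up in such an estimate.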
2.3 Stain-Style Transfer Network

Figure 2: Overview of the stain-style transfer network. The network is composed of two transformations: gray-normalization $\mathcal{N}$ and style-generator $G$. $\mathcal{N}$ standardizes the stain-styles of color images from different institutes, and $G$ colorizes gray images following the stain-style of a particular institute.

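Since the stain-styles of different institutes are dissimilar, the histological pattern in an image from another institute $j \neq i$ may be unrecognizable to a classifier network trained on stain-style $S_i$. Consequently, the classifier shows degraded performance Ciompi et al. (2017):

$$\mathbb{E}\big[\ell(\hat{f}_i(S_j(X)), Y)\big] > \mathbb{E}\big[\ell(\hat{f}_i(S_i(X)), Y)\big],$$

where $\ell$ is the classification loss, $\hat{f}_i$ is the classifier trained on stain-style $S_i$, and $(X, Y)$ is a histological sample with its label.

To overcome this problem, we propose the stain-style transfer (SST) network $T$, which transfers the stain-style $S_j$ to the initial $S_i$. Precisely, our aim is to find a network $T$ which satisfies

$$T(S_j(X)) \overset{d}{=} S_i(X), \tag{3}$$

i.e. the transferred images follow the same distribution as the images of institute $i$. By the change of variables formula (Durrett, 2010, Theorem 1.6.9), (3) implies

$$\mathbb{E}\big[\ell(\hat{f}_i(T(S_j(X))), Y)\big] = \mathbb{E}\big[\ell(\hat{f}_i(S_i(X)), Y)\big], \tag{4}$$

hence the tumor classifier recovers its performance (2).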
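The step from (3) to (4) can be spelled out as follows. This is a sketch under assumed notation: $S_i$ and $S_j$ denote the stain-styles of institutes $i$ and $j$, $T$ the SST network, $\hat{f}_i$ the classifier trained on stain-style $S_i$, $\ell$ the classification loss, and $P_{Z \mid y}$ the distribution of $Z$ given the label $y$. It additionally assumes that the equality of distributions in (3) holds conditionally on the label, which the text does not state explicitly.

```latex
\begin{aligned}
\mathbb{E}\big[\ell(\hat{f}_i(T(S_j(X))), Y)\big]
  &= \mathbb{E}_{Y}\Big[\int_{\mathcal{C}} \ell(\hat{f}_i(c), Y)\,
     \mathrm{d}P_{T(S_j(X)) \mid Y}(c)\Big]
     && \text{(change of variables)} \\
  &= \mathbb{E}_{Y}\Big[\int_{\mathcal{C}} \ell(\hat{f}_i(c), Y)\,
     \mathrm{d}P_{S_i(X) \mid Y}(c)\Big]
     && \text{(equality of distributions)} \\
  &= \mathbb{E}\big[\ell(\hat{f}_i(S_i(X)), Y)\big]
     && \text{(change of variables)}
\end{aligned}
```

The expectation depends on the transferred image only through its distribution, so matching the distributions is enough to match the expected loss.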

We emphasize that our SST network does not require the dataset of institute $j$ to train either the classifier $\hat{f}_i$ or the transfer network $T$. To make $T$ independent of institute $j$, we employ the gray normalization $\mathcal{N}$ and train the stain-style generator $G$ such that $T = G \circ \mathcal{N}$, as illustrated in Figure 2.

2.4 Stain-Style Generator by Conditional GAN

Figure 3: Illustration of the feature-preserving loss. We use the global average pooled layers of the input image and of the generated image as the inputs to the feature-preserving loss.

To train the style-generator $G$, we introduce three loss functions: (a) reconstruction loss, (b) GAN loss, and (c) feature-preserving loss.

2.4.1 Reconstruction Loss

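Restricted to the initial stain-style $S_i$, the SST network $T = G \circ \mathcal{N}$ (gray normalization $\mathcal{N}$ followed by the style-generator $G$) should be a reconstruction map, i.e. $T \circ S_i = S_i$. Hence we apply a reconstruction loss to minimize the pixel-wise distance between $T(S_i(X))$ and its original image $S_i(X)$, using the architecture from Quan et al. (2016), which has a very deep structure with short-cut and skip connections. The reconstruction loss is denoted by

$$\mathcal{L}_{\mathrm{rec}}(G) = \mathbb{E}\big[\, \| S_i(X) - G(\mathcal{N}(S_i(X))) \| \,\big].$$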

2.4.2 Conditional GAN Loss

As Pathak et al. (2016) showed, mixing the GAN loss Goodfellow et al. (2014) with a traditional loss, such as the $L_2$ distance, improves the performance of the generator. Since we have labeled images, we apply a conditional GAN Mirza and Osindero (2014) instead of the original GAN Goodfellow et al. (2014). By means of the GAN, the generator $G$ learns a mapping from gray images to color images with the target stain-style $S_i$ and tries to trick the discriminator $D$. Here $D$ learns to distinguish fake images from real ones and uses the architecture of DCGAN Radford et al. (2015). We use the following GAN loss:

$$\mathcal{L}_{\mathrm{GAN}}(G, D) = \mathbb{E}\big[\log D(S_i(X) \mid Y)\big] + \mathbb{E}\big[\log\big(1 - D(G(\mathcal{N}(S_i(X))) \mid Y)\big)\big],$$

where $\mathcal{N}$ denotes the gray normalization and $Y$ the label of $X$. While $D$ learns to maximize $\mathcal{L}_{\mathrm{GAN}}$, $G$ tries to minimize it until both reach their optimal states. Through this procedure, every stained image can be transferred to the desired stain-style. However, this approach often tends to produce the same few color images regardless of the histological pattern. This phenomenon is called mode collapse (of GANs), and it can prevent us from achieving (3). Therefore we need an additional loss function.
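The minimax behaviour of the GAN objective can be illustrated numerically. The sketch below evaluates the standard (unconditional) GAN objective of Goodfellow et al. (2014) on hypothetical discriminator outputs; the conditioning on labels and the actual networks are omitted, and the function and argument names are assumptions made for illustration.

```python
import math


def gan_objective(d_real, d_fake):
    """Empirical GAN objective: mean log D(real) + mean log(1 - D(fake)).

    `d_real` and `d_fake` hold discriminator outputs in (0, 1) for real
    images and for generated images, respectively.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that separates real from generated images well scores high.
separating = gan_objective(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])

# A discriminator that is fooled outputs 0.5 everywhere: value is -2 * log(2).
fooled = gan_objective(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

The discriminator pushes the objective up towards the `separating` regime, while the generator pushes it down towards the `fooled` regime.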

2.4.3 Feature-preserving Loss

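As in (3), in the optimal state $G^*$ of the generator, the output of the SST network $T = G^* \circ \mathcal{N}$ should approximate the target stain-style $S_i$. In terms of the Kullback-Leibler divergence, (3) can be restated as

$$D_{\mathrm{KL}}\big(P_{S_i(X)} \,\|\, P_{T(S_j(X))}\big) = 0,$$

where $P_Z$ denotes the distribution of $Z$. To obtain $G^*$, having (4) in mind, we employ the feature-preserving loss

$$\mathcal{L}_{\mathrm{FP}}(G) = \mathbb{E}\big[ D_{\mathrm{KL}}\big( \phi(S_i(X)) \,\|\, \phi(G(\mathcal{N}(S_i(X)))) \big) \big],$$

where $\phi(c)$ denotes the feature of a given color image $c$ extracted from the classifier $\hat{f}_i$. As illustrated in Figure 3, the feature vector is taken from the final layer before the activation function, namely the global average pooled layer.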
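As a rough illustration of a feature-preserving loss, the sketch below pools each channel of a feature map to a single number (global average pooling) and compares the pooled features of an input image and a generated image with a Kullback-Leibler divergence. Normalizing the pooled features with a softmax before taking the divergence is an assumption of this sketch, as are all function names; the feature maps are hand-written stand-ins for classifier activations.

```python
import math


def global_average_pool(feature_maps):
    """Average each channel's 2-D activation map down to a single number."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def softmax(values):
    """Turn a feature vector into a probability vector (assumed normalization)."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(p || q) of two probability vectors."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def feature_preserving_loss(maps_input, maps_generated):
    """Divergence between pooled features of an input and a generated image."""
    p = softmax(global_average_pool(maps_input))
    q = softmax(global_average_pool(maps_generated))
    return kl_divergence(p, q)


# Two channels of 2x2 activations per image (hand-written stand-ins).
maps_a = [[[1.0, 3.0], [2.0, 2.0]], [[0.0, 0.0], [0.0, 4.0]]]
maps_b = [[[0.0, 0.0], [0.0, 0.0]], [[4.0, 4.0], [4.0, 4.0]]]

same = feature_preserving_loss(maps_a, maps_a)        # identical features
different = feature_preserving_loss(maps_a, maps_b)   # features have drifted
```

The loss vanishes when the generated image yields the same pooled features as the input and grows as the features drift apart, which is the behaviour a feature-preserving term is meant to penalize.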

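Consequently, the overall loss function is

$$\mathcal{L} = \mathcal{L}_{\mathrm{GAN}} + \alpha\, \mathcal{L}_{\mathrm{rec}} + \beta\, \mathcal{L}_{\mathrm{FP}},$$

where $\mathcal{L}_{\mathrm{rec}}$, $\mathcal{L}_{\mathrm{GAN}}$ and $\mathcal{L}_{\mathrm{FP}}$ are the reconstruction, GAN and feature-preserving losses, and $\alpha$, $\beta$ are the weights used to balance the updates from the different loss functions.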

3 Experiment

We perform a quantitative experiment on tumor classification to evaluate the SST network. To show the general performance of our method, we compare it with the vanilla model, which uses untransferred images, as well as with conventional methods. We have four baseline methods: Reinhard et al. (2001), Macenko et al. (2009), histogram specification (HS) Annadurai (2007) and WSI color standardization (WSICS) Bejnordi et al. (2016).

3.1 Dataset

The CAMELYON16 dataset is composed of 400 slides from two different institutes, Radboud and Utrecht. We use 180,000 patches from Radboud for training and 20,000 for validation, and 140,000 patches from Utrecht for testing. The numbers of tumor and normal patches are equal. Treating the training and validation sets as belonging to one institute and the test set as coming from another, we map every stain-style into the same space by applying gray normalization. Since both the training and validation sets are labeled, supervised learning can be used to train the mapping from gray images to color images. For gray normalization we use the Pillow package for Python, which converts RGB to grayscale with the formula $L = R \cdot 299/1000 + G \cdot 587/1000 + B \cdot 114/1000$.
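For reference, Pillow documents its RGB-to-grayscale conversion (`Image.convert("L")`) as the ITU-R 601-2 luma transform, `L = R * 299/1000 + G * 587/1000 + B * 114/1000`. The sketch below applies that formula in pure Python to a nested list of RGB pixels; the function names are illustrative, and the rounding may differ by one gray level from Pillow's internal integer arithmetic.

```python
def to_gray(pixel):
    """ITU-R 601-2 luma of one (R, G, B) pixel, rounded to an integer."""
    r, g, b = pixel
    return round((r * 299 + g * 587 + b * 114) / 1000)


def gray_normalize(image):
    """Apply the luma transform to every pixel of a row-major RGB image."""
    return [[to_gray(pixel) for pixel in row] for row in image]


# A 1x3 image: a pink (eosin-like) pixel, a purple (haematoxylin-like)
# pixel and a white background pixel. Colors are illustrative.
tile = [[(230, 120, 170), (90, 50, 140), (255, 255, 255)]]
gray_tile = gray_normalize(tile)
```

Because the three color channels collapse into one, images with different stain-styles are mapped into a common gray space before the generator recolors them.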

3.2 Network Architecture

In this part, we describe the network structures of the classifier network and the stain-style generator, which together constitute the SST network.

3.2.1 Classifier Network

The classifier network carries out two tasks in the experiment. First, it serves as a discriminator that evaluates the performance of the stain-style generator. Second, as explained in subsection 2.4.3, it works as the feature extractor used in the feature-preserving loss. We use ResNet-34 from the torchvision library, with PyTorch as the framework.

3.2.2 Stain-Style Generator

The generator network takes an image as input instead of a noise vector. We can therefore use FCN-type architectures, of which U-Net is one of the best known. However, because of its limited performance, we use FusionNet, which combines the advantages of U-Net and ResNet. The hyperparameters of the network are set as in Quan et al. (2016). We adapt our discriminator architecture from Radford et al. (2015), which is based on VGG-Net without pooling layers. The hyperparameters of the discriminator are the same as those in Radford et al. (2015).

3.3 Result

Figure 4: Comparison between SST and other stain normalization methods: (a) Target image for transfer (b) Original input image to be transferred (c) SST (d) WSICS (e) HS (f) Macenko (g) Reinhard

Figure 4 illustrates the result of each stain normalization method on a sample image. The target image comes from Radboud, whose data are used to train the tumor classifier. The original image is sampled from Utrecht, whose data are used to test the tumor classifier. Although the outputs of the methods show little visual difference, the classification performance on these color images varies significantly. As the results in Table 1 show, the SST network successfully avoids the performance degradation: among all methods applied to the test images, SST achieves the highest tumor classification performance, with an area under the curve (AUC) of 0.9185, compared with 0.8900 on the original (untransferred) images. This result shows that visual judgment and classifier performance can disagree. The output of WSICS is visually the most similar to that of SST, yet its AUC (0.6408) is about 30% lower than that of SST. On the other hand, Macenko, which is visually the worst, performs better than every other normalization method except SST. Conventional methods consider only the physical features of input images and lose patterns that are key to the classifier's decisions. In contrast, SST preserves those key features, i.e. the input image's own patterns, while also taking into account the color distribution of the target images and the contextual information of the original images.

Metric       Target  Original  SST     WSICS   HS      Macenko  Reinhard
AUC          0.9760  0.8900    0.9185  0.6408  0.4245  0.7169   0.5611
Precision    0.9114  0.8098    0.8440  0.5989  0.4987  0.6983   0.6114
Recall       0.9126  0.8111    0.8460  0.5957  0.4986  0.6956   0.6119
Specificity  0.9583  0.8014    0.8371  0.6010  0.4162  0.6500   0.5471
Table 1: Performance of the tumor classifier network with different stain normalization methods. The SST network shows a clear improvement over direct application to the original (untransferred) images and outperforms the other methods.
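The metrics in Table 1 can be computed from classifier scores as sketched below. The paper does not state the decision threshold or any averaging scheme, so the sketch assumes binary labels (1 = tumor, 0 = normal), a threshold of 0.5, and an AUC computed as the probability that a tumor patch scores higher than a normal patch; the scores are made up for illustration. The last lines reproduce the relative AUC gap between SST and WSICS from the table.

```python
def binary_metrics(labels, predictions):
    """Precision, recall and specificity for binary labels (1 = tumor).

    Assumes both classes occur and at least one positive prediction is made,
    so that no denominator is zero.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(labels, scores):
    """Probability that a random tumor patch outscores a random normal patch.

    Ties count as one half. This pairwise form is quadratic in the number of
    patches and is meant for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]          # made-up tumor scores
predictions = [1 if s >= 0.5 else 0 for s in scores]

metrics = binary_metrics(labels, predictions)
area = auc(labels, scores)

# Relative AUC gap between SST and WSICS, using the values from Table 1.
relative_gap = (0.9185 - 0.6408) / 0.9185
```

With the table's values, `relative_gap` is roughly 0.30, which matches the statement that the AUC of WSICS is about 30% lower than that of SST.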

4 Conclusion

In this work, we have presented a stain-style transfer approach to stain normalization for histopathological images. To that end, we replace stain normalization models with a generative model which learns the stain-style distribution of the training dataset. This stain-style transfer network is considerably simpler than contemporaneous work and produces more realistic results without any additional labeling or annotation for training and without prior knowledge. Further, unlike conventional stain normalization, which acts independently of the tumor classifier, the proposed feature-preserving loss steers the colorization in a direction that does not degrade the tumor classifier. We demonstrate that our model is optimized for the performance of the tumor classifier and allows successful stain-style transfer.

The style of chemical cell staining is mainly affected by the structural information and morphology of cells rather than by factors such as cell brightness. Based on this observation, we converted the test image into a gray image and then performed the stain-style transfer. While this makes the process simpler, it also loses some information. To address this limitation, future work will assess a direct stain-style transfer from color image to color image. In addition, we hope to examine the parameters of our deep learning approach more closely. Further, we will perform more rounds of hard negative mining and consider the reliability and reproducibility of the deep CNN models.