Histopathological whole slide images (WSIs) are generated by scanning tissue slides using microscopic scanners. Hematoxylin and Eosin (H&E) is the most commonly used stain for these tissue slides. Computer-aided histopathological image analysis has gained significant momentum, and supervised deep learning (DL) classification and segmentation models are increasingly used for it. However, histopathological images obtained from one lab differ significantly from those of other labs, as every lab has a different staining protocol (Fig. 1). The same tissue slides scanned using different scanners may also yield different-looking images due to differences in color calibration. Despite the great generalization ability of DL models, stain variation can pose a serious problem for DL models, which are generally trained using images sourced from one or a few labs. Various image stain normalization techniques have been proposed to solve this stain variation problem. Experimentation has shown that stain normalization can improve the accuracy of DL models. Recently, a few approaches using deep generative neural networks for image stain normalization have been proposed and have shown promising results. Building on these, we present an image-to-image transformation network, called the stain transfer generator network, based on the High-Resolution Network (HRNet) architecture, to transform images from one stain to another.
1. We propose to use neural style transfer with the addition of an adversarial loss for image stain normalization. To the best of our knowledge, this is the first work that couples neural style transfer with a GAN-based framework for histopathological stain normalization.
2. We suggest a simple yet effective modification to HRNet: a direct skip connection from input to output. This helps in faster training and improved convergence of the stain transfer generator network.
3. We have tested the proposed method on data obtained from 8 different labs.
2 Existing Approaches for Stain Normalization
Histogram matching of the RGB channels of the input image with the corresponding channels of a reference image is the simplest approach, but it requires a similar distribution of the various tissue components in both images. Reinhard et al. use the global image mean and standard deviation in the LAB color space for image normalization. Such global methods do not consider local variation in color distribution and hence lead to inaccurate color mapping. If they are instead applied to local image patches, the transformed patches are not consistent across the WSI. Bejnordi et al. use color- and shape-based pixel classification into dye types and match the class-specific mean and standard deviation.
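The global mean and standard deviation matching attributed to Reinhard et al. above can be sketched as follows. This is a minimal illustration, not the reference implementation: it assumes both images are already float arrays in a LAB-like color space (the RGB to LAB conversion is omitted), and the function name `match_mean_std` and the synthetic test images are hypothetical.

```python
import numpy as np

def match_mean_std(source, reference, eps=1e-8):
    """Shift and scale each channel of `source` so that its global mean and
    standard deviation match those of `reference`.

    Both inputs are assumed to be float arrays of shape (H, W, 3) that are
    already expressed in a LAB-like color space."""
    src_mean = source.mean(axis=(0, 1))
    src_std = source.std(axis=(0, 1))
    ref_mean = reference.mean(axis=(0, 1))
    ref_std = reference.std(axis=(0, 1))
    # Standardize with the source statistics, then re-color with the reference ones.
    return (source - src_mean) / (src_std + eps) * ref_std + ref_mean

# Synthetic stand-ins for an input patch and a reference patch.
rng = np.random.default_rng(0)
source = rng.normal(loc=[60.0, 10.0, -5.0], scale=[8.0, 4.0, 3.0], size=(64, 64, 3))
reference = rng.normal(loc=[70.0, 20.0, -15.0], scale=[5.0, 6.0, 2.0], size=(64, 64, 3))
normalized = match_mean_std(source, reference)
```

Because a single set of statistics is used for the whole image, every pixel receives the same affine color change regardless of the tissue component it belongs to, which is the limitation of global methods noted above.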
Color deconvolution based stain separation methods have been proposed, which separate images into H and E stain components and apply normalization to each of them separately. A pioneering work in automatic computation of the color deconvolution matrix uses non-negative matrix factorization for stain unmixing. Recent work by Vahadane et al. uses sparse non-negative matrix factorization (SNMF) for this task. Any such stain separation method transforms a three-channel color image into an image with a two-dimensional basis, with the two basis vectors representing the H and E color components. In all the reconstructed images, we have observed the red color of red blood cells (RBCs) being wrongly transformed to the pink color of cytoplasm. Loss of such vital color information can significantly degrade DL models which depend on RBC features. Also, this method does not automatically identify whether a component belongs to the H stain or the E stain.
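The loss of color information caused by a two-dimensional stain basis can be illustrated with a small sketch. It is not the NMF or SNMF method discussed above: the stain vectors are fixed to commonly quoted H&E optical-density values (an assumption) and the concentrations are obtained by plain unconstrained least squares. All function names are hypothetical.

```python
import numpy as np

# Assumed, commonly quoted H&E optical-density vectors (Ruifrok and Johnston);
# a data-driven method such as SNMF would estimate these from the image instead.
STAINS = np.array([[0.65, 0.70, 0.29],   # hematoxylin
                   [0.07, 0.99, 0.11]])  # eosin

def rgb_to_od(rgb, background=255.0):
    """Beer-Lambert optical density of an RGB image with values in [0, 255]."""
    return -np.log((rgb.astype(np.float64) + 1.0) / (background + 1.0))

def separate_stains(rgb, stains=STAINS):
    """Least-squares concentrations of the two stains at every pixel."""
    od = rgb_to_od(rgb).reshape(-1, 3)
    basis = stains / np.linalg.norm(stains, axis=1, keepdims=True)
    conc, *_ = np.linalg.lstsq(basis.T, od.T, rcond=None)   # shape (2, n_pixels)
    return conc.T.reshape(rgb.shape[:2] + (2,)), basis

def reconstruct(conc, basis, background=255.0):
    """Map two-stain concentrations back to RGB."""
    od = conc @ basis                                        # (H, W, 2) @ (2, 3)
    return (background + 1.0) * np.exp(-od) - 1.0

# One pixel whose color lies exactly in the H&E plane, one saturated red pixel.
unit_basis = STAINS / np.linalg.norm(STAINS, axis=1, keepdims=True)
he_pixel = 256.0 * np.exp(-(np.array([[[0.8, 0.5]]]) @ unit_basis)) - 1.0
red_pixel = np.array([[[200.0, 30.0, 30.0]]])
image = np.concatenate([he_pixel, red_pixel], axis=1)        # shape (1, 2, 3)

conc, basis = separate_stains(image)
restored = reconstruct(conc, basis)
error = np.abs(restored - image).max(axis=-1)                # per-pixel RGB error
```

In this sketch the first pixel is restored almost exactly, while the red pixel is not, because its optical density has a large component outside the plane spanned by the two stain vectors. Any color outside that plane is projected onto it and cannot be recovered after separation.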
Recently, some authors have used deep learning to solve the stain color normalization problem. In one such approach, an image patch is divided into several classes using K-means clustering on features obtained from trained auto-encoders, and histogram matching is done on each cluster to match the image stain. Zanjani et al. use a deep convolutional Gaussian mixture model (GMM) to cluster the image and then apply cluster-wise color transfer. The performance of cluster-based color matching techniques depends heavily on clustering accuracy, and even a slight misclassification of a cluster can introduce severe degradation. Some approaches perform stain normalization through GAN-based colorization of grayscale images. If complete color information could be recovered from grayscale images, it would be simpler to train deep learning models directly on the grayscale images, which would save significant computation time and remove the need for stain normalization. However, segmentation of some features invariably needs color images, and hence grayscale image colorization cannot help there. Other authors have jointly trained a GAN-based stain transformation network along with a classifier. CycleGANs have also been used for image stain normalization; additionally, some authors proposed to use an identity loss in addition to the cycle consistency loss. Finally, deep convolutional features have been used to pair reference image patches with an input image patch and then compute a global color transformation matrix.
3 Proposed Method
The problem of histopathological image stain normalization can be formulated as follows:
Let $\mathcal{R}$ be a set of reference stain images, $\mathcal{I}$ be a set of input stain images and $\mathcal{T}$ be a set of transformed images. We want to find a transformation $f: \mathcal{I} \rightarrow \mathcal{T}$ such that $S_{s}(\mathcal{R}, \mathcal{T})$ is minimum and $S_{c}(\mathcal{I}, \mathcal{T})$ is minimum.
Here $S_{s}$ is the stain style similarity measure, which defines the similarity between two sets of stains, and $S_{c}$ is the content similarity measure between an input image and its transformed image.
This is a highly ill-posed problem for which a concrete similarity measure is also not available. Our approach to solving this problem is inspired by neural style transfer, where a pre-trained neural network is used as a feature extractor for style as well as content features. The original neural style transfer method follows an iterative optimization approach to transfer each input image to the desired style on the fly, which makes it very slow. Instead, here a stain transfer generator CNN is trained using a small set of paired input and reference stain images. Training a neural network to do this task is similar to the fast neural style transfer proposed by Johnson et al.
3.1 Loss Definition
Let $I$ be the input image, $R$ be the reference image and $T$ be the transformed image (the output of the stain transfer generator network); then the content and style losses are computed as follows:
3.1.1 Content Loss
$L_{content}$ is computed from the content features ($F^{l}$) obtained by passing the input image $I$ and the transformed image $T$ through a pre-trained CNN (e.g. VGG16 or VGG19). Here $F^{l}_{ij}$ corresponds to the feature from the $i^{th}$ feature map of the $l^{th}$ layer of the CNN at the $j^{th}$ location.
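As a sketch, assuming the squared-error formulation of the original neural style transfer method, and writing $F^{l}_{ij}(\cdot)$ for the feature of the $i^{th}$ feature map of layer $l$ at location $j$, the content loss between the input image $I$ and the transformed image $T$ can be written as below. The factor $\frac{1}{2}$ and the use of a single content layer $l$ are assumptions rather than values confirmed by the text.

```latex
L_{content}(I, T) = \frac{1}{2} \sum_{i,j} \left( F^{l}_{ij}(T) - F^{l}_{ij}(I) \right)^{2}
```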
3.1.2 Style Loss
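$L_{style}$ is computed from the style representations ($G^{l}$) obtained by passing the reference image $R$ and the transformed image $T$ through a pre-trained CNN (e.g. VGG16 or VGG19). Style representations are the Gram matrices of the CNN features at each layer. Each element of the Gram matrix at location $(i, j)$ is given by the inner product of the vectorized $i^{th}$ and $j^{th}$ feature maps, with $F^{l}_{ik}$ denoting the value of the $i^{th}$ feature map of layer $l$ at location $k$:
$$G^{l}_{ij} = \sum_{k} F^{l}_{ik} F^{l}_{jk}$$
The style loss corresponding to each layer is given below, where $N_{l}$ is the total number of feature maps and $M_{l}$ is the size of the feature map at layer $l$:
$$E_{l} = \frac{1}{4 N_{l}^{2} M_{l}^{2}} \sum_{i,j} \left( G^{l}_{ij}(T) - G^{l}_{ij}(R) \right)^{2}$$
The total style loss is computed as the weighted sum of the style losses of the individual layers:
$$L_{style} = \sum_{l} w_{l} E_{l}$$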
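The Gram-matrix style loss described above can be sketched in a few lines of NumPy. This is an illustration only: random arrays stand in for the VGG feature maps, the normalization follows the common $1/(4N^{2}M^{2})$ convention (an assumption), and the function names are hypothetical.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of one layer; `features` has shape (N, H, W) for N feature maps."""
    n_maps = features.shape[0]
    flat = features.reshape(n_maps, -1)     # vectorize each feature map: (N, M)
    return flat @ flat.T                    # (N, N) inner products

def layer_style_loss(feat_transformed, feat_reference):
    """Squared Gram-matrix difference for one layer, scaled by 1 / (4 N^2 M^2)."""
    n_maps = feat_transformed.shape[0]
    map_size = feat_transformed[0].size
    diff = gram_matrix(feat_transformed) - gram_matrix(feat_reference)
    return float((diff ** 2).sum() / (4.0 * n_maps ** 2 * map_size ** 2))

def total_style_loss(feats_transformed, feats_reference, weights):
    """Weighted sum of the per-layer style losses."""
    return sum(w * layer_style_loss(t, r)
               for w, t, r in zip(weights, feats_transformed, feats_reference))

# Random stand-ins for the feature maps of one layer (4 maps of 8 x 8).
rng = np.random.default_rng(0)
feat_reference = rng.random((4, 8, 8))
feat_other = rng.random((4, 8, 8))
perm = rng.permutation(64)
feat_shuffled = feat_reference.reshape(4, -1)[:, perm].reshape(4, 8, 8)

same = layer_style_loss(feat_reference, feat_reference)
shuffled = layer_style_loss(feat_shuffled, feat_reference)
different = layer_style_loss(feat_other, feat_reference)
```

Shuffling the spatial locations of all feature maps in the same way leaves the Gram matrix, and therefore the style loss, essentially unchanged. The style term thus constrains feature statistics rather than spatial layout, which is why a separate content term is needed.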
Further, we propose to add an adversarial loss to the above losses, thus making our network a generative adversarial network (GAN). The GAN, first proposed by Goodfellow et al., has been extensively used in image-to-image translation tasks due to its ability to generate new images that appear to have been drawn from the reference domain.
To compute the adversarial loss, an additional network called the discriminator ($D$) has to be trained such that it distinguishes between original reference stain images and fake images generated by the stain transfer generator network.
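3.1.3 Discriminator Loss
$L_{D}$ is computed from the outputs of the discriminator $D$ on reference stain images and on generated images; it is small when $D$ labels the reference images as real and the generated images as fake.
3.1.4 Adversarial Loss
The generator has to generate images that are similar to the reference stain images, so that the discriminator cannot distinguish them from real samples drawn from the reference stain images. For this purpose, we minimize the adversarial loss $L_{adv}$, which is small when $D$ labels the generated images as real.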
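As a sketch of one possible form, the two losses can be written using the standard binary cross-entropy GAN objective with the non-saturating generator loss. This choice is an assumption; other variants, such as a least-squares objective, would fit the description equally well. Here $R$ is a reference stain image, $T$ is a generated image and $D(\cdot)$ is the probability that the discriminator assigns to its input being real.

```latex
\begin{aligned}
L_{D} &= -\mathbb{E}_{R}\left[ \log D(R) \right] - \mathbb{E}_{T}\left[ \log\left( 1 - D(T) \right) \right] \\
L_{adv} &= -\mathbb{E}_{T}\left[ \log D(T) \right]
\end{aligned}
```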
3.1.5 Total Generator Loss
$L_{G}$ is computed as a weighted combination of the content loss, style loss and adversarial loss, where the weighting factors ($\alpha$, $\beta$ and $\gamma$) are model hyper-parameters:
$$L_{G} = \alpha L_{content} + \beta L_{style} + \gamma L_{adv}$$
To train the stain transfer generator network, the discriminator loss ($L_{D}$) and the total generator loss ($L_{G}$) are minimized alternately. While training the discriminator, the generator weights are frozen, and while training the generator, the discriminator weights are frozen. The weights of the pre-trained CNN, which is used only as a deep feature extractor, are kept constant throughout the training process. Fig. 2 is a block diagram of the complete system that illustrates how the various generator losses are computed.
3.2 Generator Network Architecture
In artistic style transfer, retention of minute details in the transformed image is not very important. The stain transfer generator network, however, must retain the high-resolution content features in the transformed image. Hence, instead of the ResNet-based style transfer network proposed by Johnson et al., our stain transfer generator architecture is based on the High-Resolution Network (HRNet). HRNet has been shown to retain rich high-resolution representations, which helps in maintaining the quality of histopathological images. As a major modification to HRNet, a direct skip connection from input to output has been added. Fig. 3 shows the full generator network architecture with the feature map size at each layer. Let $I$ be the input image and $H$ be the stain transfer generator network without the skip connection; then the transformed image is
$$T = I + H(I)$$
The intuition behind the skip connection is that, in the worst case, the stain transfer generator network should at least return the original content image. Due to the direct skip connection, the network has to learn only the residual transformation instead of the full transformation. This residual learning is much faster and gives better convergence. The proposed stain transfer generator network learns a good transformation with as few as 20 paired training images and 50 epochs of training.
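The effect of the input-to-output skip connection can be illustrated with a toy sketch. The `residual_branch` function is a hypothetical stand-in for the HRNet body (here just a per-pixel linear map over the color channels), not the architecture used in this work.

```python
import numpy as np

def residual_branch(image, weight, bias):
    """Hypothetical stand-in for the HRNet body: a per-pixel linear map over channels."""
    return image @ weight + bias

def generator(image, weight, bias):
    """Output = input + learned residual (direct skip connection from input to output)."""
    return image + residual_branch(image, weight, bias)

rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))

# With an all-zero residual branch the generator returns the input unchanged.
identity_out = generator(image, np.zeros((3, 3)), np.zeros(3))

# A small residual changes the colors only slightly while the content stays in place.
shifted_out = generator(image, 0.01 * np.eye(3), np.full(3, 0.02))
```

In this sketch, a residual branch that outputs zeros already reproduces the content image, so training only has to move the branch away from zero far enough to model the color change between the two stains.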
3.3 Implementation Details
WSIs are divided into non-overlapping patches of 512 x 512 pixels at 10x magnification. Around 20 patches from each input stain are paired with reference stain patches based on content similarity. In the stain transfer generator network, all convolutional layers except the last one are followed by batch normalization and ReLU activation. The discriminator follows the architectural guidelines given in DCGAN but uses instance normalization instead of batch normalization. Also, our discriminator is implemented as a 16 x 16 PatchGAN and hence classifies each 16 x 16 patch of an image as real or fake. The generator learning rate is kept at 1e-3 and the discriminator learning rate at 1e-5. For training, we have used the Adam optimizer with a batch size of 4. Content features are taken from the conv2_2 layer (the second convolutional layer of the second block) of VGG19 pre-trained on the ImageNet dataset. For the style loss, the weight of each layer is kept constant at 1.
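The division of a slide into non-overlapping 512 x 512 patches can be sketched as below. The function name is hypothetical, and dropping partial patches at the right and bottom borders is an assumption; the text does not say how border regions are handled.

```python
def patch_origins(width, height, patch=512):
    """Top-left (x, y) corners of all full, non-overlapping patches.

    Partial patches at the right and bottom borders are dropped (assumption)."""
    return [(x, y)
            for y in range(0, height - patch + 1, patch)
            for x in range(0, width - patch + 1, patch)]

# Example: a 2048 x 1100 pixel region yields 4 columns and 2 rows of patches.
origins = patch_origins(width=2048, height=1100)
```

Each returned corner can then be used to crop one patch from the slide at the chosen magnification.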
Throughout the experimentation, lab 1 (see Fig. 1) is selected as the reference stain and all other images are transformed to this stain. To check the effectiveness of the proposed method, we have compared the stain transfer performance in 4 different settings: 1. neural style transfer using the generator network proposed by Johnson et al. (NST); 2. neural style transfer with the addition of an adversarial loss (NST_AD); 3. neural style transfer using the modified HRNet (NST_HRNet); and 4. neural style transfer with an adversarial loss and the modified HRNet (NST_AD_HRNet). All the models are trained with their best hyper-parameters. We observed that NST gave good results with lab 3 but very poor results with lab 2. NST_AD generated images that look like the reference stain but contain distortions. NST_HRNet gave good results with both lab 2 and lab 3, with minute content details preserved. NST_AD_HRNet generated the best results for both labs, with images looking very close to the reference stain (Fig. 4).
For comparison with other well-known approaches (Reinhard, Vahadane, Zanjani and StainGAN), we use the publicly available ICPR2014 Mitosis dataset (https://mitos-atypia-14.grand-challenge.org/Dataset/), in which the same tissue slides are scanned using two different scanners (Aperio and Hamamatsu). The Hamamatsu scanner is taken as the reference and the Aperio scanner images are transformed to it. In Fig. 5, we can see that Vahadane’s method makes red RBCs pinkish, as discussed previously. Zanjani’s DCGMM produces good output but has sharpened texture and brighter colors compared to the actual reference images. Also, due to large textural variations in kidney images, DCGMM does not produce good clusters and gives poor stain-normalized output. StainGAN produces decent output for mitosis images; for kidney images, however, it produces images that look like the reference stain but have significant textural distortions. We have observed that some image similarity measures such as SSIM and PSNR are quite misleading and do not indicate true perceptual similarity between images; hence we have used the deep-feature-based perceptual similarity measure of Zhang et al., which has been shown to agree with human perception significantly better than other measures. On this measure, the performance of the proposed approach is only 3% poorer than StainGAN and 20% better than Zanjani’s approach (Table 1). While the CycleGAN-based StainGAN took more than 72 hours to train, the proposed network was trained in less than 2 hours. Also, our generator network did not suffer from the training instability generally observed with GANs. See the appendix for the comparison on kidney data from all the labs.
| Lab | Lab 2 | Lab 3 | Lab 4 | Lab 5 | Lab 6 | Lab 7 | Lab 8 |
|---|---|---|---|---|---|---|---|
| Without Normalization Dice | 0.183 | 0.754 | 0.804 | 0.718 | 0.805 | 0.593 | 0.641 |
| With Normalization Dice | 0.437 | 0.904 | 0.815 | 0.900 | 0.899 | 0.787 | 0.771 |
To validate the effectiveness of the proposed approach in improving the accuracy of a pre-existing DL model on new, unseen stain images, testing has been done on 7 different stains. We have trained a ResNet50-FCN based glomeruli segmentation network on image patches taken from lab 1. Color-based data augmentations were applied to the training data. For each of lab 2 to lab 8, we trained a corresponding stain transfer generator network to transform its images to the lab 1 stain. Table 2 shows the performance of the segmentation model with and without the proposed stain normalization. The proposed stain normalization improves the Dice score for every lab, with the largest gain for lab 2 (0.183 to 0.437) and the smallest for lab 4 (0.804 to 0.815).
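For reference, the Dice score used to evaluate the segmentation model can be computed as in the sketch below. It assumes binary masks given as nested lists of 0 and 1 and returns 1.0 when both masks are empty; that convention and the function name are assumptions, not details taken from the evaluation described above.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    intersection = sum(p * t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    # Convention (assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = dice_score(pred, target)   # 2 * 2 / (3 + 3)
```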
In this paper, we presented a novel, fast and effective stain color normalization technique for H&E stained histopathological images. The proposed method was compared with other well-known techniques and gave superior stain normalization performance. Due to the direct skip connection from input to output, the stain transfer generator network can be trained quickly and with very few paired training images. This helps in the quick adaptation of existing deep learning models to differently stained data. The proposed method was tested on data obtained from 8 different labs and was able to transform each of them to the reference stain without any distortions in the original content. We believe this work can be extended to staining dyes other than H&E and can also be used for domain adaptation, such as transforming images from H&E to PAP stain.
- Bejnordi, B.E., Litjens, G.J.S., Timofeeva, N., Otte-Höller, I., Homeyer, A., Karssemeijer, N., van der Laak, J.: Stain specific standardization of whole-slide histopathological images. IEEE Transactions on Medical Imaging 35, 404–415 (2016)
- Bentaieb, A., Hamarneh, G.: Adversarial stain transfer for histopathology image analysis. IEEE Transactions on Medical Imaging 37(3), 792–802 (March 2018). https://doi.org/10.1109/TMI.2017.2781228
- Cho, H., Lim, S., Choi, G., Min, H.: Neural stain-style transfer learning using GAN for histopathological images (2017)
- Ciompi, F., Geessink, O., Bejnordi, B.E., de Souza, G.S., Baidoshvili, A., Litjens, G.J.S., van Ginneken, B., Nagtegaal, I., van der Laak, J.: The importance of stain normalization in colorectal tissue classification with convolutional networks. IEEE 14th International Symposium on Biomedical Imaging pp. 160–163 (2017)
- de Bel, T., Hermsen, M., Kers, J., van der Laak, J., Litjens, G.: Stain-transforming cycle-consistent generative adversarial networks for improved segmentation of renal histopathology. In: Cardoso, M.J., Feragen, A., Glocker, B., Konukoglu, E., Oguz, I., Unal, G., Vercauteren, T. (eds.) Proceedings of The 2nd International Conference on Medical Imaging with Deep Learning. Proceedings of Machine Learning Research, vol. 102, pp. 151–163. PMLR (08–10 Jul 2019), http://proceedings.mlr.press/v102/de-bel19a.html
- Gadermayr, M., Appel, V., Klinkhammer, B.M., Boor, P., Merhof, D.: Which way round? A study on the performance of stain-translation for segmenting arbitrarily dyed histological images. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) Medical Image Computing and Computer Assisted Intervention, MICCAI 2018. pp. 165–173. Springer International Publishing, Cham (2018)
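- Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2414–2423 (June 2016). https://doi.org/10.1109/CVPR.2016.265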
- Ghazvinian Zanjani, F., Zinger, S., de With, P.H.N.: Deep convolutional Gaussian mixture model for stain-color normalization of histopathological images. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) Medical Image Computing and Computer Assisted Intervention, MICCAI 2018. pp. 274–282. Springer International Publishing, Cham (2018)
- Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 27, pp. 2672–2680. Curran Associates, Inc. (2014), http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
- Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) pp. 5967–5976 (2016)
- Janowczyk, A., Basavanhally, A., Madabhushi, A.: Stain normalization using sparse autoencoders (StaNoSA): Application to digital pathology. Computerized Medical Imaging and Graphics 57 (05 2016). https://doi.org/10.1016/j.compmedimag.2016.05.003
- Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. In: European Conference on Computer Vision (2016)
- Rabinovich, A., Agarwal, S., Laris, C., Price, J.H., Belongie, S.J.: Unsupervised color decomposition of histologically stained tissue samples. In: Thrun, S., Saul, L.K., Schölkopf, B. (eds.) Advances in Neural Information Processing Systems 16, pp. 667–674. MIT Press (2004), http://papers.nips.cc/paper/2497-unsupervised-color-decomposition-of-histologically-stained-tissue-samples.pdf
- Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks (2015)
- Reinhard, E., Adhikhmin, M., Gooch, B., Shirley, P.: Color transfer between images. IEEE Computer Graphics and Applications 21(5), 34–41 (July 2001). https://doi.org/10.1109/38.946629
- Ruifrok, A.C., Johnston, D.A., et al.: Quantification of histochemical staining by color deconvolution. Analytical and Quantitative Cytology and Histology 23(4), 291–299 (2001)
- Shaban, M.T., Baur, C., Navab, N., Albarqouni, S.: StainGAN: Stain style transfer for digital histological images. In: IEEE 16th International Symposium on Biomedical Imaging. pp. 953–956 (April 2019). https://doi.org/10.1109/ISBI.2019.8759152
- Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
- Vahadane, A., Peng, T., Sethi, A., Albarqouni, S., Wang, L., Baust, M., Steiger, K., Schlitter, A.M., Esposito, I., Navab, N.: Structure-preserving color normalization and sparse stain separation for histological images. IEEE Transactions on Medical Imaging 35, 1962–1971 (2016)
- Yuan, E., Suh, J.: Neural stain normalization and unsupervised classification of cell nuclei in histopathological breast cancer images. CoRR abs/1811.03815 (2018), http://arxiv.org/abs/1811.03815
- Zanjani, F.G., Zinger, S., Bejnordi, B.E., van der Laak, J., de With, P.H.N.: Stain normalization of histopathology images using generative adversarial networks. IEEE 15th International Symposium on Biomedical Imaging pp. 573–577 (2018)
- Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR (2018)