Red blood cell image generation for data augmentation using Conditional Generative Adversarial Networks

01/18/2019 ∙ Oleksandr Bailo, et al. ∙ NOUL Inc.

In this paper, we describe how to apply image-to-image translation techniques to medical blood smear data to generate new data samples and meaningfully increase small datasets. Specifically, given the segmentation mask of a microscopy image, we are able to generate photorealistic images of blood cells, which are further used alongside real data during network training for segmentation and object detection tasks. This image data generation approach is based on conditional generative adversarial networks, which have proven capable of high-quality image synthesis. In addition to synthesizing blood images, we synthesize segmentation masks as well, which leads to a greater variety of generated samples. The effectiveness of the technique is thoroughly analyzed and quantified through a number of experiments on a manually collected and annotated dataset of blood smears taken under a microscope.

1 Introduction

Deep learning based methods have had great success on a number of typical visual perception tasks such as classification, segmentation, and object detection. While there are a number of important reasons that have made this progress possible, one of them is the availability of large-scale datasets. In contrast to traditional computer vision tasks, comparatively little effort has been put into the creation of large-scale medical image datasets. There are several reasons why these datasets are hard to create.

First of all, there is a limited number of data annotators available. In contrast to traditional image annotation (e.g. class label, segmentation mask, bounding box), where almost any person is able to perform the annotation, medical data requires a specialized professional, often with an advanced medical degree, to perform a reliable annotation.

Figure 1: Image translation example. First row: real segmentation mask (left) and corresponding blood image (right). Second row: synthetic segmentation mask (left) and synthesized image (right).

Secondly, medical data sharing is not a straightforward process. In order to democratize a medical dataset, agreement must be reached among a number of involved parties such as patients, doctors, hospitals, and data users. Additionally, a detailed guideline is required on what grounds and for what purposes the data can be utilized.

Lastly, different hospitals and countries around the world might use different medical protocols, devices, or mechanisms. Thus, such data is not easily transferable and requires special handling to standardize it for the creation of a large-scale dataset.

Figure 2: The overview of the proposed method. First, all blood cell shape instances are extracted and saved to a database (a). Meanwhile, the pix2pixHD framework is trained to translate segmentation masks to blood cell images (b). During the inference stage, a synthetic segmentation mask is created (c) and fed to the generator network to produce a realistic blood cell image (d).

All of the aforementioned reasons greatly increase the cost of annotation and make large-scale medical dataset creation extremely time and resource consuming. Fortunately, with the advent of Generative Adversarial Networks (GANs) [11], powerful image synthesis has become possible. Recently, Wang et al. have proposed a model named pix2pixHD [30] that performs high-resolution photorealistic image generation given the instance segmentation mask of the scene. Since segmentation masks are easy to manipulate, this method allows interactive object manipulation, leading to numerous new samples.

In this paper, we focus on generating new training samples of blood images taken with a microscope. For this purpose, we have collected a dataset and manually annotated every single blood cell in the images with an instance-level segmentation label. Next, we train a GAN to translate a segmentation mask into a photorealistic blood image.

Once the network is trained, we automatically generate diverse instance segmentation masks on the fly (see Figure 1) and create a defined number of varied data samples, which are further used as additional training data for segmentation and object detection tasks. These generated samples serve as a powerful data augmentation technique that can boost the performance of the relevant tasks. A graphical summary of the algorithm is presented in Figure 2.

This paper is structured in the following way. In Section 2, we cover related works which have similarly utilized GANs for medical data generation. Then, Section 3 describes the dataset creation. Section 4 specifies the procedure for synthetic instance segmentation mask generation as well as details of the utilized GAN model. Lastly, a number of experiments and a discussion are presented in Section 5, followed by the conclusion and future plans (Section 6).

2 Related Work

The use of GANs with medical image data is not new [36, 15, 31]. GANs have demonstrated promising results with medical data in a number of tasks such as segmentation [16, 33, 19, 28, 34, 37], detection [24, 2, 7, 25, 29], and image synthesis [10, 32, 3, 1, 22].

Since existing medical datasets are often limited in size, several works have examined the use of GANs for data augmentation purposes to increase the number of training samples. For instance, Calimeri et al. [5] concluded that synthetically generated MRI images can be employed for inexpensive and fast data augmentation. Similarly, several other works have observed improved segmentation performance on various data sources such as MRI [20, 26], CT [4], and X-ray [21].

Similar conclusions about the favorable use of GANs as a data augmentation technique have been reported for the classification of CT images [9] and liver lesions [8].

Figure 3: Samples from the dataset: blood cell images (first row) and corresponding instance segmentation masks (second row).

While there are many works targeting MRI, CT, and X-ray image data, blood smear images taken with a microscope have received less attention. Thus, in this paper, we explore the use of GANs for blood image synthesis and study the effect of utilizing GANs as a data augmentation method for segmentation and object detection tasks.

3 Dataset

In order to obtain a dataset with Red Blood Cells (RBCs), we have collected blood from patients and prepared stained blood smear slides. Then, we have manually selected images to ensure every image is distinctive and the dataset is diverse. This results in great variation in the number of RBCs per image, their shapes, and their color values (see Table 1). For example, the largest cells can be far larger than the smallest ones, which may cover only several pixels (i.e. partially visible cells on the border of the image). Every image is an RGB image of identical resolution. The dataset is randomly split into training and testing sets with a ratio of 60 to 40, respectively.

The annotation technique of the dataset is inspired by the 2015 MICCAI Gland Segmentation Challenge [27]. Specifically, given a microscopy image with a blood smear, we aim to create an instance segmentation mask which allows us to extract every single cell in the given image. For this purpose, qualified professionals manually draw a precise segmentation mask for each RBC.

This drawing process consists of assigning a color to every single pixel belonging to a blood cell. Color values are selected by the annotator from a predefined list. In order to be able to extract every single blood cell without overlap with others, touching cells are assigned different segmentation mask colors; hence, no two touching cells share the same color.

While this dataset also includes annotation for other blood cells and noise (e.g. dust), in this paper, we primarily focus on RBCs. The representative images and corresponding instance segmentation masks are shown in Figure 3.

Table 1: Statistics (mean and standard deviation) of red blood cells in the train set: RBC count per image, RBC shape, and colors (in RGB).

4 Methodology

The creation of new samples is composed of two stages: synthetic mask generation and translation of the generated mask into a photorealistic image of blood cells. A graphical visualization of the whole pipeline during the training and testing phases is shown in Figure 2.

4.1 Synthetic mask generation

In order to generate new and meaningful photorealistic samples of blood cells, we first need to produce synthetic instance segmentation masks in which the blood cells have unique shapes and location distributions. For this purpose, we have designed a synthetic segmentation mask generator that combines sampled cell shapes and their distributions to produce synthetic segmentation masks. More formally, we formulate a synthetic segmentation mask as a set of sampled cell shapes placed at their corresponding locations on the background:

$M = \{(s_i, x_i)\}_{i=1}^{N}$  (1)

where $s_i$ denotes the shape of the $i$-th cell and $x_i$ determines its location. The total number of cells in an image $N$ is drawn from a normal distribution as $N \sim \mathcal{N}(\mu, \sigma^2)$, where $\mu$ and $\sigma$ are determined from the statistics of the training data (see Table 1).

4.1.1 Cell shape sampling

To model the natural variety and similarity of cell shapes in blood, we perform exemplar-based cell shape sampling. During the training stage, the segmented boundary of each cell in the training data is accumulated to build a cell shape database. This database serves as a blood cell shape supplier during the inference stage. In the inference stage (i.e. when generating new samples), the cell shape sampler iteratively selects a random cell shape from the database and places it on the instance segmentation mask, which is composed of numerous cells. Moreover, whenever a shape is picked from the database, a set of probabilistic augmentations is applied to diversify its appearance. The augmentations include rotation, scaling, and horizontal and vertical flipping.
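As an illustration, a minimal sketch of such an exemplar-based shape sampler is shown below; the database layout, helper names, and augmentation probabilities and ranges are our own assumptions for illustration, not taken from the paper.

```python
import random

import numpy as np
from scipy.ndimage import rotate, zoom


def sample_cell_shape(shape_db, rng=random):
    """Pick a random binary cell mask from the database and apply
    probabilistic augmentations (rotation, scaling, flips)."""
    mask = shape_db[rng.randrange(len(shape_db))].copy()  # binary HxW array

    if rng.random() < 0.5:  # random rotation
        mask = rotate(mask, angle=rng.uniform(0, 360), order=0, reshape=True)
    if rng.random() < 0.5:  # random scaling
        mask = zoom(mask, rng.uniform(0.8, 1.2), order=0)
    if rng.random() < 0.5:  # horizontal flip
        mask = mask[:, ::-1]
    if rng.random() < 0.5:  # vertical flip
        mask = mask[::-1, :]
    return mask.astype(np.uint8)
```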

4.1.2 Cell distribution sampling

The most straightforward way to place the sampled cell shapes in the synthetic segmentation mask is simply to iteratively sample coordinates at random and place the cell shape masks at the sampled locations if the place is not occupied. However, such an approach results in an unrealistic cell distribution. Specifically, it leads to a uniform random distribution of cells over the image, with many cells placed in isolation without touching other cells. In reality, due to cell adhesion [12], cells tend to stick to each other and consequently form clusters.

Therefore, in order to generate segmentation masks which incorporate the aforementioned aspect, our cell distribution algorithm sequentially samples the location of each cell from a probability density function defined on the 2D discrete image space. The location $x_t$ of the $t$-th cell is sampled from the probability map as:

$x_t \sim P_t$  (2)

where each pixel value of $P_t$ denotes the probability of that pixel being selected as the location of a cell's center at time step $t$ (one cell is placed per step). $P_t$ is initially uniform during the sampling of the first cells, but changes its landscape as $t$ increases in order to simulate inherent cell adhesion. We have modeled this evolving nature of $P_t$ as a Markov random process:

$P_{t+1} \propto P_t + \alpha_t f(x_t)$  (3)

Specifically, the probability map changes incrementally by accumulating the excitation function $f$ onto its previous state. The excitation function $f(x_t)$ is computed by applying a 2D Gaussian with standard deviation $\sigma$ around the sampled cell center and inverting the values within the boundary of the placed cell; this imposes a low probability inside the boundaries of already allocated cells and prevents a cell from being placed at an occupied location. The amount of increment is controlled by the normalizing coefficient $\alpha_t$, which follows a harmonic progression in $t$ (i.e. $\alpha_t \propto 1/t$). After every update, $P_{t+1}$ is renormalized so that it always maintains the 'sum to unity' property.
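A minimal numpy sketch of this probability-map update is given below; the grid handling, the way the occupied area is suppressed (via the cell's bounding box), and the exact form of the harmonic coefficient are our assumptions for illustration rather than the authors' exact implementation.

```python
import numpy as np


def place_cells(height, width, cell_masks, sigma, rng=np.random.default_rng()):
    """Sequentially sample cell centers from an evolving probability map P."""
    P = np.full((height, width), 1.0 / (height * width))  # initially uniform
    ys, xs = np.mgrid[0:height, 0:width]
    centers = []

    for t, mask in enumerate(cell_masks, start=1):
        # Sample a center location from the current probability map.
        idx = rng.choice(height * width, p=P.ravel())
        cy, cx = divmod(idx, width)
        centers.append((cy, cx))

        # Excitation: a Gaussian bump around the new center ...
        f = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
        # ... suppressed inside the footprint of the placed cell.
        h, w = mask.shape
        y0, x0 = max(cy - h // 2, 0), max(cx - w // 2, 0)
        f[y0:y0 + h, x0:x0 + w] *= 1e-6

        # Markov-style update with a harmonic coefficient, then renormalize.
        alpha = 1.0 / t
        P = P + alpha * f
        P /= P.sum()

    return centers
```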

In practice, whenever a cell is placed on the canvas during synthetic mask generation, a specific color is assigned to it so that no touching cells share the same color. If this constraint cannot be satisfied (e.g. too many touching cells, or all predefined colors are already in use), the coordinate sampling is repeated. Since every pair of touching cells has different colors, the produced synthetic mask can be treated as an instance segmentation mask from which every single cell can be extracted.

This synthetic mask generation process, as well as the comparison of randomly distributed cells against the proposed strategy, is visually described in Figure 4.

Figure 4: Synthetic mask generation. (Top) Intermediate segmentation and probability maps of the algorithm. (Bottom) Comparison of the final result of random placement against the proposed method.

4.2 Synthetic blood image generation

For the purpose of synthesizing a photorealistic blood cell image given the instance segmentation mask, we have utilized the recent pix2pixHD framework proposed by Wang et al. [30]. In our scenario, the pix2pixHD framework is composed of a generator $G$ that tries to translate a segmentation mask into a photorealistic blood cell image, while two multi-scale discriminators $D_1$ and $D_2$ try to distinguish real images from generated ones. The full training objective of the network is the following:

$\min_G \Big( \big( \max_{D_1, D_2} \sum_{k=1,2} \mathcal{L}_{GAN}(G, D_k) \big) + \lambda \sum_{k=1,2} \mathcal{L}_{FM}(G, D_k) + \lambda \mathcal{L}_{VGG}(G) \Big)$  (4)

where:

  • $\mathcal{L}_{GAN}(G, D_k)$ is the adversarial loss defined as:

    $\mathcal{L}_{GAN}(G, D_k) = \mathbb{E}_{(s, x)}[\log D_k(s, x)] + \mathbb{E}_{s}[\log(1 - D_k(s, G(s)))]$  (5)

    with $s$ the input segmentation mask and $x$ the corresponding real image;
  • $\mathcal{L}_{FM}(G, D_k)$ is the feature matching loss that aims to stabilize training and produce more visually appealing results at multiple scales (a short code sketch follows this list). It is defined as:

    $\mathcal{L}_{FM}(G, D_k) = \mathbb{E}_{(s, x)} \sum_{i=1}^{T} \frac{1}{N_i} \big\| D_k^{(i)}(s, x) - D_k^{(i)}(s, G(s)) \big\|_1$  (6)

    where $D_k^{(i)}$ denotes the $i$-th layer of discriminator $D_k$, $T$ is the total number of layers, and $N_i$ is the number of elements in layer $i$;
  • $\mathcal{L}_{VGG}(G)$ is the perceptual reconstruction loss aiming to further improve the performance of high-quality image generation:

    $\mathcal{L}_{VGG}(G) = \sum_{i=1}^{P} \frac{1}{M_i} \big\| F^{(i)}(x) - F^{(i)}(G(s)) \big\|_1$  (7)

    where $F^{(i)}$ denotes the $i$-th layer (with $M_i$ elements) of a pretrained VGG network.
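To make the feature matching term concrete, here is a minimal PyTorch-style sketch; the function name and the assumption that each discriminator returns a list of intermediate feature maps are ours, not part of the original implementation.

```python
import torch.nn.functional as F


def feature_matching_loss(real_feats, fake_feats):
    """L1 distance between discriminator feature maps computed on a real image
    and on the generated image, averaged over discriminator layers.

    real_feats / fake_feats: lists of tensors, one entry per layer of D_k.
    """
    loss = 0.0
    for rf, ff in zip(real_feats, fake_feats):
        loss = loss + F.l1_loss(ff, rf.detach())
    return loss / len(real_feats)
```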

As suggested in [30], we incorporate a feature encoder network $E$ and combine its output with the original input so that the image synthesis style can be manipulated easily. Since generating blood cells is a relatively easy task, we only use the global generator without the local enhancer.

During the inference stage, we feed a synthetically generated mask to the generator to obtain a synthetic image of blood cells. The style of the output image is influenced by randomly sampling features from one of the clusters created by running K-means clustering on the outputs of the encoder $E$ computed on the training images.
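The style sampling step could be sketched as follows; the cluster count of 10 matches Section 5.1, while the array shapes and function names are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans


def build_style_clusters(encoded_features, n_clusters=10):
    """Cluster encoder outputs of the training set; at inference time a random
    cluster center is used as the style vector supplied to the generator."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(encoded_features)
    return kmeans.cluster_centers_  # shape: (n_clusters, feature_dim)


def sample_style(cluster_centers, rng=np.random.default_rng()):
    return cluster_centers[rng.integers(len(cluster_centers))]
```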

For a detailed description of the losses, feature encoding, and clustering please refer to the original paper [30]. Implementation details, as well as training procedures, are described in Section 5.1.

5 Experiments and Results

In this section, we first cover specific details on the parameters and training strategy for blood cell image synthesis. The remainder of the section then describes various experiments on segmentation and object detection tasks, with a focus on how the use of synthetic images during training affects network performance.

5.1 Synthetic blood image generation

In order to decide the number of cells to be placed on the synthetic mask, we sample the number from a normal distribution with the mean and standard deviation taken from the dataset statistics (see the "RBC count per image" row of Table 1). For the probability map used during synthetic mask generation, the standard deviation $\sigma$ of the 2D Gaussian is chosen through the half width at half maximum (HWHM), i.e. half of the distance between the points on the curve at which the function reaches half of its maximum value. Specifically, we want the HWHM to match a characteristic cell radius derived from the cell shape statistics in Table 1; hence, we can derive that $\sigma = \mathrm{HWHM} / \sqrt{2 \ln 2}$. Initially, when the synthetic segmentation mask is empty, cells are placed at random without considering the probability map.
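For reference, this follows from the standard Gaussian half-maximum relation (our derivation):

$\exp\!\left(-\frac{\mathrm{HWHM}^2}{2\sigma^2}\right) = \frac{1}{2} \;\Rightarrow\; \mathrm{HWHM} = \sigma \sqrt{2 \ln 2} \;\Rightarrow\; \sigma = \frac{\mathrm{HWHM}}{\sqrt{2 \ln 2}} \approx 0.85\, \mathrm{HWHM}.$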

The training of pix2pixHD is performed in two stages due to GPU memory constraints. In the first stage, the images are downsized and the global generator $G$, the discriminators $D_1$ and $D_2$, and the feature encoder $E$ are trained simultaneously. Then, the feature encoder $E$ is used to precompute image features of the training data. In the second stage, the training of $G$ and $D$ is continued for additional epochs using full-resolution images. After training is completed, 10 clusters are created from the encoded features for each semantic category. These clusters are used to simulate different blood cell styles, leading to a diversification of the generated images.

In all experiments, we have utilized the original pix2pixHD network with several modifications. For instance, the number of filters in the first convolutional layers of $G$, $D$, and $E$ is reduced; the primary goal of this channel tuning is to fit the model within the GPU memory capacity. Horizontal and vertical flipping are utilized as data augmentation.

The pix2pixHD network is trained on the train set only. Figure 6 shows network outputs on test set images, so synthesized images can be compared with the corresponding real ones. The network is clearly able to learn the shapes, colors, and boundaries of the blood cells given their segmentation mask and generally produces realistic blood images. Noticeably, when synthetic and real images are compared one-to-one, they do not necessarily have the same color and intensity values. Furthermore, due to the feature encoding which influences the style of individual cells, synthetic images often have excessive noise embedded in the generated blood cells (see Figure 6, last column). While the style is selected at random during the synthesis stage, we can also manually define the cluster from which features are sampled. For example, Figure 7 shows different styles of the generated blood cell image given the segmentation mask from the first column of Figure 6.

Figure 5: Examples of generated synthetic masks (left) and corresponding synthesized blood cell images (right).
Figure 6: Generated samples on test set: segmentation masks (first row), ground truth (second row), and generated blood cells (third row).
Figure 7: Example of different styles of the synthesized blood cell image.

The output results of the whole pipeline, including synthetic mask generation and blood image synthesis, can be seen in Figure 5. Noticeably, the proposed method is able to generate synthetic segmentation masks which obey the cell adhesion rule (i.e. intercellular forces make cells group with each other). Furthermore, using these synthetic segmentation masks as an input to the blood image generator results in realistic blood images of different styles (e.g. color and intensity values, noise level).

Our unoptimized Python implementation of synthetic mask generation takes on the order of seconds per mask (on an Intel Xeon CPU) and depends heavily on the number of cells to be placed on the mask. Image generation also takes on the order of seconds per image (on a GeForce GTX 1080 Ti). Therefore, in its current form, the proposed augmentation method is better suited as an offline technique to increase the number of training samples rather than as a real-time data augmentation.

5.2 Segmentation

For this task, we have utilized the FCN-8s [18] model. The loss function is formulated as $L = 1 - D(p, g)$, where $D$ stands for the Dice score, defined as:

$D(p, g) = \frac{2\, p^{\top} g}{p^{\top} p + g^{\top} g}$  (8)

stated as a vector operation over the binary vectors $p$ and $g$ corresponding to the network prediction and the target (i.e. ground truth), respectively.
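A minimal numpy sketch of this Dice computation on flattened binary masks follows; the small epsilon for numerical stability is our addition.

```python
import numpy as np


def dice_score(pred, target, eps=1e-7):
    """Dice score between two binary masks given as numpy arrays."""
    p = pred.astype(np.float64).ravel()
    g = target.astype(np.float64).ravel()
    return (2.0 * np.dot(p, g) + eps) / (np.dot(p, p) + np.dot(g, g) + eps)


def dice_loss(pred, target):
    return 1.0 - dice_score(pred, target)
```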

The Dice score is used as the evaluation metric as well. The model is trained from scratch, without any pretraining or transfer learning, until the training loss converges and no improvement is observed on the test set. The best-scoring weights on the test set are selected for reporting results.

Methods \ Data           RD      SD      RD+aug  RD+SD   RD+SD+aug
Segmentation task (Section 5.2). Metric: Dice score.
FCN [18]                 0.961   0.848   0.962   0.948   0.964
Detection task (Section 5.3). Metric: AP.
Faster R-CNN [23]        0.781   0.852   0.985   0.986   0.993
Detection from segmentation (Section 5.4). Metric: AP.
DCAN [6]                 0.853   0.749   0.885   0.848   0.895
Table 2: Quantitative evaluation on various tasks. Column names represent different combinations of data used for training of the networks: real data (RD), synthetic data (SD), and augmentations (aug).

Table 2 demonstrates the quantitative results with different data used as training instances. For instance, the "RD" column represents results where the network is trained only with real data (RD). Similarly, in the "SD" case, exclusively synthetic data (SD) is used for training. In both of these cases, the number of training samples is intentionally kept equal. The remaining columns "RD+aug", "RD+SD", and "RD+SD+aug" represent results where the training set is composed of real data with augmentations, real data mixed with synthetic data, and augmented real and synthetic data, respectively. The data augmentation includes horizontal and vertical flips, Gaussian blur, sharpening, embossing, Gaussian noise, color channel inversion, brightness change (addition and multiplication), contrast normalization, and grayscale augmentation, applied randomly. For these scenarios, the number of training samples is also kept identical.
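One possible composition of the augmentations listed above, written with the imgaug library, is sketched below; the application probabilities and value ranges are our assumptions and not the paper's exact settings.

```python
import imgaug.augmenters as iaa

augmenter = iaa.Sequential([
    iaa.Fliplr(0.5),                                        # horizontal flip
    iaa.Flipud(0.5),                                        # vertical flip
    iaa.Sometimes(0.3, iaa.GaussianBlur(sigma=(0.0, 1.0))),
    iaa.Sometimes(0.3, iaa.Sharpen(alpha=(0.0, 0.5))),
    iaa.Sometimes(0.3, iaa.Emboss(alpha=(0.0, 0.5))),
    iaa.Sometimes(0.3, iaa.AdditiveGaussianNoise(scale=(0, 0.03 * 255))),
    iaa.Sometimes(0.1, iaa.Invert(1.0, per_channel=True)),  # channel inversion
    iaa.Sometimes(0.3, iaa.Add((-20, 20))),                 # brightness addition
    iaa.Sometimes(0.3, iaa.Multiply((0.8, 1.2))),           # brightness multiplication
    iaa.Sometimes(0.3, iaa.LinearContrast((0.8, 1.2))),     # contrast normalization
    iaa.Sometimes(0.2, iaa.Grayscale(alpha=(0.0, 1.0))),
], random_order=True)

# augmented = augmenter(image=image)  # apply to a single HxWx3 uint8 image
```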

FCN is a powerful model capable of reaching a high Dice score of 0.961 by utilizing real data exclusively, without any augmentations. However, when the real data is replaced with synthetic data, the network reaches a substantially lower score of 0.848. This could imply that the real data carries richer semantic information than the completely synthetic images, which only approximate the real data.

Real data augmentation slightly boosts performance, by about 0.1 percentage points. However, utilizing synthetic data as an augmentation method alongside real data actually harms the performance of the model, decreasing the score by roughly 1.3 percentage points. Lastly, if the real and synthetic data are combined with augmentations, the maximum score of 0.964 is reached.

To sum up, this experiment implies that the use of generated synthetic blood cell images can be beneficial to the performance of the network in these particular circumstances, although the gain is small. Similar conclusions have been reported in previous works [20, 26, 4, 21] as well.

5.3 Detection

For this task, we have utilized Faster R-CNN [23] with a ResNet-101 [13] feature extractor. The model is pretrained on Visual Genome [17], and we have fine-tuned it on the blood cell data configurations mentioned in Table 2. For this task, average precision (AP) is used as the evaluation metric. The implementation of the network is provided by [35].

Faster R-CNN is a heavy network which requires a lot of training data. Hence, utilizing only the limited real data for training results in a relatively low AP score of 0.781 (see Table 2). Surprisingly, the network performs better (roughly a 9% relative improvement) when synthetic data alone is used for training, with every other factor kept equal. While this is counter-intuitive, it could be explained by the fact that, in order to produce synthetic blood images, we produce synthetic segmentation masks composed of blood cell shapes drawn from the whole training set. This results in a new cell shape distribution for each synthetic image. Furthermore, when a cell shape is placed on the segmentation mask, we additionally apply augmentations to it, which leads to even more diverse samples compared to the real training set.

Contrary to the segmentation task, augmentation provides a drastic improvement (about 0.2 AP) to the generalization capabilities of the model for the cell detection task. Similarly, a noticeable improvement is observed if synthetic data is used alongside the real data. Lastly, using real and synthetic data together with augmentations provides a further improvement (0.993 vs. 0.985 AP) compared to training without synthetic images.

Overall, this experiment shows a marginal benefit of synthetic data on the detection task when used alongside real data. Utilizing synthetic data during training performs on par with traditional data augmentation techniques and can be combined with real data and augmentations to maximize model performance.

5.4 Detection from segmentation

While the Faster R-CNN model has shown great performance on the cell detection task, it has several significant drawbacks. First of all, it struggles with cluttered objects which highly overlap with each other; our data is particularly challenging in this respect because it involves a large number of small, highly-overlapping objects (i.e. a small object detection problem). Secondly, such complex object detection models are often quite computationally expensive, which can be a problem for point-of-care medical devices that require fast and efficient computation on the edge. Lastly, this model is relatively slow and hard to train stably, requiring numerous parameters to be tuned.

In order to implement a method which is lightweight and fast while maintaining a satisfactory detection score, we utilize the idea proposed by Chen et al. [6]. Specifically, the segmentation network is modified to predict objectness (i.e. the traditional segmentation mask) as well as the contours (i.e. boundaries) of the blood cells. These two predictions are further processed to isolate individual cells from each other, which simplifies the detection problem to blob detection on a binary image (a sketch of this post-processing is given below). Therefore, this method is suitable for the detection task and can be evaluated with the average precision (AP) score as in the previous experiment.
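A minimal sketch of the post-processing described above, assuming the network outputs per-pixel objectness and contour probability maps (thresholds and function names are our assumptions):

```python
from scipy import ndimage


def cells_from_predictions(objectness, contour, obj_thr=0.5, cnt_thr=0.5):
    """Subtract predicted contours from the objectness mask so that touching
    cells separate, then detect each remaining blob as an individual cell."""
    cell_mask = (objectness > obj_thr) & (contour < cnt_thr)
    labels, num_cells = ndimage.label(cell_mask)

    boxes = []
    for obj_slice in ndimage.find_objects(labels):
        ys, xs = obj_slice
        boxes.append((xs.start, ys.start, xs.stop, ys.stop))  # x1, y1, x2, y2
    return labels, boxes
```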

Similarly to the segmentation task described in Section 5.2, we have utilized the FCN network, but with a modified head that outputs objectness and contour predictions. The network is trained from scratch with a loss which is simply the sum of the objectness and contour losses. The best-performing model (defined by the AP score) on the test set is selected for the results stated in this paper.

The evaluation results are stated in Table 2. Since this method relies on segmentation, the detection performance generally follows the trend of the segmentation predictions from Section 5.2. For example, utilizing only synthetic data harms the performance of the network by about 0.1 AP. Also, the use of data augmentation marginally boosts the network performance. Finally, using real and synthetic data with augmentations helps to reach the highest AP score of 0.895. When this approach is compared to the Faster R-CNN model, it achieves better performance only in the case where real data alone is used for training. In every other scenario, Faster R-CNN outperforms this approach by a large margin.

While with this approach we have not succeeded in outperforming the powerful Faster R-CNN model, we confirm the possibility of achieving blood cell detection from the segmentation mask. The quality of detection is directly related to the quality of the predicted segmentation mask. Therefore, we believe that the detection performance could be greatly increased by utilizing more recent neural networks such as [14] for more accurate medical image segmentation.

6 Conclusion

In this paper, we have developed a method to synthesize photorealistic microscopy blood images by utilizing conditional generative adversarial networks. These synthetic images are used alongside real data to meaningfully increase small datasets. The effect of this data augmentation technique is studied through a number of experiments on several tasks. While the use of synthetic images is shown to be only marginally beneficial for the segmentation task, the performance on the detection task demonstrates a slightly stronger relative improvement.

To sum up, based on the experiments we have performed, for our specific strategy and algorithm design, the use of GANs as a synthetic data generator and the further utilization of the generated samples as an augmentation technique is usually beneficial for model performance. However, the additional overhead which comes from designing the GAN model, long and unstable training, heavy computational requirements, and other challenges might not justify the marginal improvement to the overall performance for a specific task.

In the current version, the proposed method is limited to generating microscopy red blood cell images. In future work, we plan to extend this method to synthesize other blood cells, such as white blood cells and platelets. Additionally, we would like to be able to synthesize parasite-infected cells (red blood cells with a visible parasite inside them), which would be beneficial for the identification of diseases such as malaria.

References

  • [1] K. Armanious, C. Yang, M. Fischer, T. Küstner, K. Nikolaou, S. Gatidis, and B. Yang. Medgan: Medical image translation using gans. arXiv, 2018.
  • [2] C. F. Baumgartner, L. M. Koch, K. C. Tezcan, J. X. Ang, and E. Konukoglu. Visual feature attribution using wasserstein gans. In CVPR, 2017.
  • [3] A. Beers, J. Brown, K. Chang, J. P. Campbell, S. Ostmo, M. F. Chiang, and J. Kalpathy-Cramer. High-resolution medical image synthesis using progressively grown generative adversarial networks. arXiv, 2018.
  • [4] C. Bowles, L. Chen, R. Guerrero, P. Bentley, R. Gunn, A. Hammers, D. A. Dickie, M. V. Hernández, J. Wardlaw, and D. Rueckert. Gan augmentation: Augmenting training data using generative adversarial networks. arXiv, 2018.
  • [5] F. Calimeri, A. Marzullo, C. Stamile, and G. Terracina. Biomedical data augmentation using generative adversarial neural networks. In ICANN, 2017.
  • [6] H. Chen, X. Qi, L. Yu, and P.-A. Heng. Dcan: deep contour-aware networks for accurate gland segmentation. In CVPR, 2016.
  • [7] X. Chen and E. Konukoglu. Unsupervised detection of lesions in brain mri using constrained adversarial auto-encoders. arXiv, 2018.
  • [8] M. Frid-Adar, I. Diamant, E. Klang, M. Amitai, J. Goldberger, and H. Greenspan. Gan-based synthetic medical image augmentation for increased cnn performance in liver lesion classification. arXiv, 2018.
  • [9] M. Frid-Adar, E. Klang, M. Amitai, J. Goldberger, and H. Greenspan. Synthetic data augmentation using gan for improved liver lesion classification. In ISBI, 2018.
  • [10] F. Galbusera, F. Niemeyer, M. Seyfried, T. Bassani, G. Casaroli, A. Kienle, and H.-J. Wilke. Exploring the potential of generative adversarial networks for synthesizing radiological images of the spine to be used in in silico trials. Frontiers in Bioengineering and Biotechnology, 6:53, 2018.
  • [11] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In NIPS, 2014.
  • [12] B. Gumbiner. Cell adhesion: the molecular basis of tissue architecture and morphogenesis. Cell, 84(3):345–357, 1996.
  • [13] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
  • [14] B. Kayalibay, G. Jensen, and P. van der Smagt. Cnn-based segmentation of medical imaging data. arXiv, 2017.
  • [15] S. Kazeminia, C. Baur, A. Kuijper, B. van Ginneken, N. Navab, S. Albarqouni, and A. Mukhopadhyay. Gans for medical image analysis. arXiv, 2018.
  • [16] S. Kohl, D. Bonekamp, H.-P. Schlemmer, K. Yaqubi, M. Hohenfellner, B. Hadaschik, J.-P. Radtke, and K. Maier-Hein. Adversarial networks for the detection of aggressive prostate cancer. arXiv, 2017.
  • [17] R. Krishna, Y. Zhu, O. Groth, J. Johnson, K. Hata, J. Kravitz, S. Chen, Y. Kalantidis, L.-J. Li, D. A. Shamma, M. Bernstein, and L. Fei-Fei. Visual genome: Connecting language and vision using crowdsourced dense image annotations. 2016.
  • [18] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, 2015.
  • [19] P. Moeskops, M. Veta, M. W. Lafarge, K. A. Eppenhof, and J. P. Pluim. Adversarial training and dilated convolutions for brain mri segmentation. In DLMIA. 2017.
  • [20] T. C. Mok and A. C. Chung. Learning data augmentation for brain tumor segmentation with coarse-to-fine generative adversarial networks. arXiv, 2018.
  • [21] T. Neff, C. Payer, D. Štern, and M. Urschler. Generative adversarial network based synthesis for supervised medical image segmentation. In OAGM & ARW, 2017.
  • [22] D. Nie, R. Trullo, J. Lian, C. Petitjean, S. Ruan, Q. Wang, and D. Shen. Medical image synthesis with context-aware generative adversarial networks. In MICCAI, 2017.
  • [23] S. Ren, K. He, R. Girshick, and J. Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In NIPS, 2015.
  • [24] T. Schlegl, P. Seeböck, S. M. Waldstein, U. Schmidt-Erfurth, and G. Langs. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In IPMI, 2017.
  • [25] A. Sekuboyina, M. Rempfler, J. Kukačka, G. Tetteh, A. Valentinitsch, J. S. Kirschke, and B. H. Menze. Btrfly net: Vertebrae labelling with energy-based adversarial learning of local spine prior. arXiv, 2018.
  • [26] H.-C. Shin, N. A. Tenenholtz, J. K. Rogers, C. G. Schwarz, M. L. Senjem, J. L. Gunter, K. P. Andriole, and M. Michalski. Medical image synthesis for data augmentation and anonymization using generative adversarial networks. In SASHIMI, 2018.
  • [27] K. Sirinukunwattana, J. P. Pluim, H. Chen, X. Qi, P.-A. Heng, Y. B. Guo, L. Y. Wang, B. J. Matuszewski, E. Bruni, U. Sanchez, et al. Gland segmentation in colon histology images: The glas challenge contest. Medical image analysis, 35:489–502, 2017.
  • [28] J. Son, S. J. Park, and K.-H. Jung. Retinal vessel segmentation in fundoscopic images with generative adversarial networks. arXiv, 2017.
  • [29] A. Tuysuzoglu, J. Tan, K. Eissa, A. P. Kiraly, M. Diallo, and A. Kamen. Deep adversarial context-aware landmark detection for ultrasound imaging. arXiv, 2018.
  • [30] T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz, and B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional gans. In CVPR, 2018.
  • [31] J. M. Wolterink, K. Kamnitsas, C. Ledig, and I. Išgum. Generative adversarial networks and adversarial methods in biomedical image analysis. arXiv, 2018.
  • [32] J. M. Wolterink, T. Leiner, and I. Isgum. Blood vessel geometry synthesis using generative adversarial networks. arXiv, 2018.
  • [33] Y. Xue, T. Xu, H. Zhang, L. R. Long, and X. Huang. Segan: Adversarial network with multi-scale l 1 loss for medical image segmentation. Neuroinformatics, 2018.
  • [34] D. Yang, D. Xu, S. K. Zhou, B. Georgescu, M. Chen, S. Grbic, D. Metaxas, and D. Comaniciu. Automatic liver segmentation using an adversarial image-to-image network. In MICCAI, 2017.
  • [35] J. Yang, J. Lu, D. Batra, and D. Parikh. A faster pytorch implementation of faster r-cnn. https://github.com/jwyang/faster-rcnn.pytorch, 2017.
  • [36] X. Yi, E. Walia, and P. Babyn. Generative adversarial network in medical imaging: A review. arXiv, 2018.
  • [37] Y. Zhang, L. Yang, J. Chen, M. Fredericksen, D. P. Hughes, and D. Z. Chen. Deep adversarial networks for biomedical image segmentation utilizing unannotated images. In MICCAI, 2017.