Visualizing and Understanding Generative Adversarial Networks (Extended Abstract)

01/29/2019 ∙ by David Bau, et al. ∙ IBM ∙ The Chinese University of Hong Kong

Generative Adversarial Networks (GANs) have achieved impressive results for many real-world applications. As an active research topic, many GAN variants have emerged with improvements in sample quality and training stability. However, visualization and understanding of GANs are largely missing. How does a GAN represent our visual world internally? What causes the artifacts in GAN results? How do architectural choices affect GAN learning? Answering such questions could enable us to develop new insights and better models. In this work, we present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level. We first identify a group of interpretable units that are closely related to concepts with a segmentation-based network dissection method. We quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output. Finally, we examine the contextual relationship between these units and their surroundings by inserting the discovered concepts into new images. We show several practical applications enabled by our framework, from comparing internal representations across different layers, models, and datasets, to improving GANs by locating and removing artifact-causing units, to interactively manipulating objects in the scene. We will open source our interactive tools to help researchers and practitioners better understand their models.





We analyze the internal GAN representations by decomposing the featuremap at a layer into positions and unit channels. To identify a unit with semantic behavior, we upsample and threshold the unit (Figure 1b) and measure how well it matches an object class in the image, as identified by a supervised semantic segmentation network (Xiao et al., 2018).
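As a minimal sketch of this matching step, the following computes an intersection-over-union score between a thresholded, upsampled unit and a segmentation mask for a single image. The function names `upsample_nearest` and `iou`, the nearest-neighbour upsampling, and the toy values are assumptions for illustration; the text above does not specify the upsampling method or how scores are aggregated over a dataset.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    return [
        [row[x // factor] for x in range(len(row) * factor)]
        for row in fmap
        for _ in range(factor)
    ]


def iou(unit_map, seg_mask, threshold):
    """IoU between the region where unit_map > threshold and seg_mask."""
    intersection = union = 0
    for unit_row, seg_row in zip(unit_map, seg_mask):
        for activation, label in zip(unit_row, seg_row):
            unit_on = activation > threshold
            class_on = bool(label)
            intersection += unit_on and class_on
            union += unit_on or class_on
    return intersection / union if union else 0.0


# Toy 2x2 unit activation, upsampled to the 4x4 resolution of the mask.
unit = upsample_nearest([[0.9, 0.1], [0.2, 0.8]], factor=2)
# Hypothetical segmentation mask: the class occupies the top-left quadrant.
mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
score = iou(unit, mask, threshold=0.5)
```

In this toy case the unit fires on two quadrants and the class occupies one of them, so half of the union is shared.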


This approach is inspired by the observation that many units in classification networks locate emergent object classes when upsampled and thresholded (Bau et al., 2017). Here, the threshold is chosen to maximize the information quality ratio, that is, the portion of the joint entropy H which is mutual information I (Wijaya, Sarno, and Zulaika, 2017).
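The threshold choice can be illustrated with a small sketch. It treats "unit is on" and "class is present" as two binary variables over pixels, computes the ratio of their mutual information to their joint entropy, and picks the candidate threshold that maximizes it. The names `information_quality_ratio` and `best_threshold`, and the use of a fixed candidate list, are assumptions made for this example.

```python
from collections import Counter
from math import log2


def information_quality_ratio(a, b):
    """Return I(A;B) / H(A,B) for two equal-length sequences of booleans."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    joint_entropy = -sum((c / n) * log2(c / n) for c in joint.values())
    mutual_information = sum(
        (c / n) * log2((c * n) / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )
    # A constant pair of variables has zero joint entropy; report 0 for it.
    return mutual_information / joint_entropy if joint_entropy > 0 else 0.0


def best_threshold(activations, labels, candidates):
    """Pick the candidate threshold whose on/off map best matches the labels."""
    labels = [bool(v) for v in labels]
    return max(
        candidates,
        key=lambda t: information_quality_ratio([v > t for v in activations], labels),
    )


# Toy per-pixel activations and class labels (flattened).
activations = [0.1, 0.2, 0.8, 0.9]
labels = [False, False, True, True]
chosen = best_threshold(activations, labels, candidates=[0.05, 0.5, 0.85])
```

Here the middle candidate separates the two groups perfectly, so the thresholded unit carries all of the joint entropy as mutual information.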

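To identify sets of units that cause semantic effects, we intervene in the network by decomposing the featuremap into two parts, the units U at pixels P and the rest of the featuremap, and forcing the first part on and off. This gives three images to compare: the original image, the image with U ablated at pixels P, and the image with U inserted at pixels P.

We measure the average causal effect (ACE) (Holland, 1988) of units U on an object class by comparing the presence of that class in the inserted and ablated images.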
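A rough sketch of such an intervention and effect estimate follows. It assumes a featuremap stored as `fmap[unit][y][x]`, a hypothetical `render` function standing in for the remaining generator layers, and a hypothetical `class_score` function standing in for the segmentation of one class. The constant `on_value` used for insertion is also an assumption of this example.

```python
def intervene(fmap, units, pixels, value):
    """Copy fmap[unit][y][x] and force the given units to `value` at `pixels`."""
    out = [[row[:] for row in channel] for channel in fmap]
    for u in units:
        for y, x in pixels:
            out[u][y][x] = value
    return out


def average_causal_effect(samples, units, render, class_score, on_value):
    """Mean of class_score(inserted image) - class_score(ablated image).

    samples: list of (fmap, pixels) pairs, one per sampled latent and location.
    render, class_score: caller-supplied stand-ins (hypothetical) for the rest
    of the generator and for the per-class segmentation measure.
    """
    total = 0.0
    for fmap, pixels in samples:
        inserted = render(intervene(fmap, units, pixels, on_value))
        ablated = render(intervene(fmap, units, pixels, 0.0))
        total += class_score(inserted) - class_score(ablated)
    return total / len(samples)
```

Because `intervene` works on a copy, the same original featuremap can be reused for both the ablated and the inserted image.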



Interpretable units for different scene categories

The set of all object classes matched by the units of a GAN provides a map of what a GAN has learned about the data. Figure 2 examines units from generators trained on four LSUN (Yu et al., 2015) scene categories. The units that emerge are object classes appropriate to the scene type: for example, when we examine a GAN trained on kitchen scenes, we find units that match stoves, cabinets, and the legs of tall kitchen stools. Another striking phenomenon is that many units represent parts of objects: for example, the conference room GAN contains separate units for the body and head of a person.

Interpretable units for different network layers

In classifier networks, the type of information explicitly represented changes from layer to layer (Zeiler and Fergus, 2014). We find a similar phenomenon in a GAN. Figure 3 compares early, middle, and late layers of a progressive GAN with internal convolutional layers. The output of the first convolutional layer, one step away from the input, remains entangled. Mid-level layers have a large number of units that match semantic objects and object parts. Units in later layers match local pixel patterns such as materials and shapes.

Interpretable units for different GAN models

Interpretable units can provide insight about how GAN architecture choices affect the structures learned inside a GAN. Figure 4 compares three models (Karras et al., 2018) that introduce two innovations on baseline Progressive GANs. By examining unit semantics, we confirm that providing minibatch stddev statistics to the discriminator improves not only the visible GAN output but also the diversity of concepts represented by units of a GAN: the number of types of objects, parts, and materials matching units increases. The second architecture applies pixelwise normalization to achieve better training stability. As applied to Progressive GANs, pixelwise normalization also increases the number of units that match semantic classes.

Diagnosing and Improving GANs

Our framework can also analyze the causes of failures in GAN results. Figure 5a shows several annotated units that are responsible for typical artifacts consistently appearing across different images. Such units can be identified by visualizing the ten top-activating images for each unit and labeling units for which many visible artifacts appear in these images. Human annotation is efficient: it typically takes only minutes to locate the artifact-causing units among the units in layer4.
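The ranking step that precedes human annotation might look like the sketch below. It assumes each generated image has already been reduced to one peak activation per unit; the dictionary layout and the name `top_activating_images` are hypothetical.

```python
import heapq


def top_activating_images(peak_activation, k=10):
    """Map each unit to the k image ids with the highest peak activation.

    peak_activation: {image_id: [peak activation of unit 0, unit 1, ...]}
    """
    num_units = len(next(iter(peak_activation.values())))
    return {
        unit: heapq.nlargest(
            k, peak_activation, key=lambda image: peak_activation[image][unit]
        )
        for unit in range(num_units)
    }


# Toy example with three images and two units.
peaks = {"a": [0.1, 0.9], "b": [0.7, 0.2], "c": [0.4, 0.5]}
ranked = top_activating_images(peaks, k=2)
```

An annotator would then inspect the images listed for each unit and flag the units whose top images consistently show artifacts.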

More importantly, we can fix these errors by ablating the artifact-causing units. Figure 5b shows that artifacts are successfully removed and the artifact-free pixels stay the same, improving the generated results. To further quantify the improvement, we compute the Fréchet Inception Distance (Heusel et al., 2017) between the generated images and real images using real images and generated images with high activations on these units. We also ask human participants on Amazon MTurk to identify the more realistic image given two images produced by different methods: we collected annotations for images per method. As summarized in Table 1, our framework significantly improves fidelity based on these two metrics.

Locating causal units with ablation

Errors are not the only type of output that can be affected by directly intervening in a GAN. A variety of specific object types can also be removed from GAN output by ablating a set of units in a GAN. In Figure 6 we intervene in sets of 20 units that have causal effects on common object classes in conference room scenes. We find that, by turning off small sets of units, most of the output of people, curtains, and windows can be removed from the generated scenes. However, not every object has a simple causal encoding: tables and chairs cannot be removed. Ablating those units will reduce the size and density of these objects, but will rarely eliminate them.
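One simple way to picture this experiment is sketched below: rank units by an individual effect score, keep the strongest 20, and report how much of an object class disappears after ablation. Ranking by a per-unit score is an assumption of this example, since the text does not say how the sets were chosen, and both function names are hypothetical.

```python
def top_units(effect_per_unit, n=20):
    """Indices of the n units with the largest effect scores."""
    order = sorted(
        range(len(effect_per_unit)),
        key=lambda u: effect_per_unit[u],
        reverse=True,
    )
    return order[:n]


def removed_fraction(pixels_before, pixels_after):
    """Fraction of a class's segmented pixels that vanish after ablation."""
    if pixels_before == 0:
        return 0.0
    return 1.0 - pixels_after / pixels_before
```

A value near 1 would correspond to classes such as people or curtains that are mostly removed, while a low value would correspond to classes such as tables and chairs that only shrink.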

The ease of object removal depends on the scene type. Figure 7 shows that, while windows can be removed well from conference rooms, they are more difficult to remove from other scenes. In particular, windows are as difficult to remove from a bedroom as tables and chairs from a conference room. We hypothesize that the difficulty of removal reflects the level of choice that a GAN has learned for a concept: a conference room is defined by the presence of chairs, so they cannot be removed. And modern building codes mandate that bedrooms must have windows; the GAN seems to have noticed.

Characterizing contextual relationships using insertion

We can also learn about the operation of a GAN by forcing units on and inserting these features into specific locations in scenes. Figure 8 shows the effect of inserting layer4 causal door units in church scenes. In this experiment, we insert units by setting their activation to the mean activation level at locations at which doors are present. Although this intervention is the same in each case, the effects vary widely depending on the context. For example, the doors added to the five buildings in Figure 8 appear with a diversity of visual attributes, each with an orientation, size, material, and style that matches the building.
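The insertion rule described here can be sketched for a single unit as follows. The activation level is the unit's mean over all positions where a door mask is set, and insertion overwrites the unit at the chosen pixels with that level. The names `mean_level_at_class` and `insert_at`, and the nested-list layout, are assumptions for illustration.

```python
def mean_level_at_class(activations, class_masks):
    """Mean activation of one unit over positions where the class is present.

    activations, class_masks: parallel lists of 2-D lists, one pair per image.
    """
    total = 0.0
    count = 0
    for activation, mask in zip(activations, class_masks):
        for act_row, mask_row in zip(activation, mask):
            for value, present in zip(act_row, mask_row):
                if present:
                    total += value
                    count += 1
    if count == 0:
        raise ValueError("the class never appears in the given masks")
    return total / count


def insert_at(activation, pixels, level):
    """Return a copy of a 2-D activation with `level` written at `pixels`."""
    out = [row[:] for row in activation]
    for y, x in pixels:
        out[y][x] = level
    return out


# Toy data: the class covers the top row of a single 2x2 featuremap.
level = mean_level_at_class([[[1.0, 3.0], [0.0, 0.0]]], [[[1, 1], [0, 0]]])
edited = insert_at([[0.0, 0.0], [0.0, 0.0]], [(1, 0)], level)
```

The same `level` is written regardless of location, which is what makes the differing visual outcomes attributable to context rather than to the intervention itself.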

We also observe that doors cannot be added in most locations. The locations where a door can be added are highlighted by a yellow box. The bar chart in Figure 8 shows average causal effects of insertions of door units, conditioned on the object class at the location of the intervention. Doors can be created in buildings, but not in trees or in the sky. A particularly good location for inserting a door is one where there is already a window.
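The conditioning behind such a bar chart amounts to grouping per-intervention effects by the object class found at the intervention location and averaging within each group. The sketch below assumes the effects have already been measured; the name `mean_effect_by_class` and the toy numbers are hypothetical and are not results from the paper.

```python
from collections import defaultdict


def mean_effect_by_class(trials):
    """Average effect per location class.

    trials: iterable of (class_at_location, measured_effect) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for location_class, effect in trials:
        totals[location_class] += effect
        counts[location_class] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Made-up effects purely to exercise the function.
summary = mean_effect_by_class(
    [("building", 0.4), ("building", 0.6), ("sky", 0.0), ("tree", 0.1)]
)
```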

Tracing the causal effects of an intervention

To investigate the mechanism for suppressing the visible effects of some interventions, we perform an insertion of 20 door-causal units on a sample of locations and measure the changes in later layer featuremaps caused by interventions at layer4. To quantify effects on downstream features, the change in each feature channel is normalized by its mean L1 magnitude, and we examine the mean change in these normalized featuremaps at each layer. In Figure 9, the effects that propagate to layer14 are visualized as a heatmap: brighter colors indicate a stronger effect on the final feature layer, which occurs when the door intervention is in the neighborhood of a building rather than trees or sky. Furthermore, we graph the average effect on every layer at right in Figure 9, separating interventions that have a visible effect from those that do not. A small identical intervention at layer4 is amplified to larger changes up to a peak at layer12.
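The per-layer measurement can be sketched as below for one layer and one sample. Each channel's mean absolute change is divided by that channel's mean absolute value before the intervention, and the results are averaged over channels. Computing the normalizer from the single unmodified sample is an assumption of this example (it could equally be estimated over a dataset), and the name `mean_normalized_change` is hypothetical.

```python
def mean_normalized_change(original, intervened):
    """Mean over channels of (mean |change|) / (mean |original|).

    original, intervened: lists of channels, each channel a 2-D list.
    Channels that are identically zero before the intervention contribute 0.
    """
    per_channel = []
    for before, after in zip(original, intervened):
        flat_before = [v for row in before for v in row]
        flat_after = [v for row in after for v in row]
        scale = sum(abs(v) for v in flat_before) / len(flat_before)
        change = sum(abs(a - b) for a, b in zip(flat_after, flat_before))
        change /= len(flat_before)
        per_channel.append(change / scale if scale else 0.0)
    return sum(per_channel) / len(per_channel)


# Toy layer with two channels of shape 1x2.
layer_before = [[[1.0, 1.0]], [[2.0, 2.0]]]
layer_after = [[[1.5, 1.0]], [[2.0, 4.0]]]
effect = mean_normalized_change(layer_before, layer_after)
```

Applying the same function to every downstream layer gives one number per layer, which is the kind of curve plotted at right in Figure 9.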

Interventions provide insight on how a GAN enforces relationships between objects. We find that even if we try to add a door in layer4, that choice can be vetoed by later layers if the object is not appropriate for the context.

Figure 9: Tracing the effect of inserting door units on downstream layers. An identical "door" intervention at each pixel of the layer4 featuremap has a different effect on the final convolutional feature layer, depending on the location of the intervention. In the heatmap, brighter colors indicate a stronger effect on the layer14 feature. A request for a door has a larger effect in locations of a building, and a smaller effect near trees and sky. At right, the magnitude of feature effects at every layer is shown, measured by mean normalized feature changes. In the line plot, feature changes for interventions that result in human-visible changes are separated from interventions that do not result in noticeable changes in the output.


By carefully examining representation units, we have found that many parts of GAN representations can be interpreted, not only as signals that correlate with object concepts but as variables that have a causal effect on the synthesis of semantic objects in the output. These interpretable effects can be used to compare, debug, modify, and reason about a GAN model.

Prior visualization methods (Zeiler and Fergus, 2014; Bau et al., 2017; Karpathy, Johnson, and Fei-Fei, 2016) have brought many new insights to CNN and RNN research. Motivated by that, in this work we have taken a small step towards understanding the internal representations of a GAN, and we have uncovered many questions that we cannot yet answer with the current method. For example: why can’t a door be inserted in the sky? How does the GAN suppress the signal in the later layers? Further work will be needed to understand the relationships between layers of a GAN. Nevertheless, we hope that our work can help researchers and practitioners better analyze and develop their own GANs.


  • Bau et al. (2017) Bau, D.; Zhou, B.; Khosla, A.; Oliva, A.; and Torralba, A. 2017. Network dissection: Quantifying interpretability of deep visual representations. In CVPR.
  • Heusel et al. (2017) Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; and Hochreiter, S. 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In NIPS.
  • Holland (1988) Holland, P. W. 1988. Causal inference, path analysis and recursive structural equations models. ETS Research Report Series 1988(1):i–50.
  • Karpathy, Johnson, and Fei-Fei (2016) Karpathy, A.; Johnson, J.; and Fei-Fei, L. 2016. Visualizing and understanding recurrent networks. In ICLR.
  • Karras et al. (2018) Karras, T.; Aila, T.; Laine, S.; and Lehtinen, J. 2018. Progressive growing of GANs for improved quality, stability, and variation. In ICLR.
  • Wijaya, Sarno, and Zulaika (2017) Wijaya, D. R.; Sarno, R.; and Zulaika, E. 2017. Information quality ratio as a novel metric for mother wavelet selection. Chemometrics and Intelligent Laboratory Systems 160:59–71.
  • Xiao et al. (2018) Xiao, T.; Liu, Y.; Zhou, B.; Jiang, Y.; and Sun, J. 2018. Unified perceptual parsing for scene understanding. In ECCV.
  • Yu et al. (2015) Yu, F.; Seff, A.; Zhang, Y.; Song, S.; Funkhouser, T.; and Xiao, J. 2015. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365.
  • Zeiler and Fergus (2014) Zeiler, M. D., and Fergus, R. 2014. Visualizing and understanding convolutional networks. In ECCV.