Conditional Image Generation with PixelCNN Decoders

by Aaron van den Oord, et al.

This work explores conditional image generation with a new image density model based on the PixelCNN architecture. The model can be conditioned on any vector, including descriptive labels or tags, or latent embeddings created by other networks. When conditioned on class labels from the ImageNet database, the model is able to generate diverse, realistic scenes representing distinct animals, objects, landscapes and structures. When conditioned on an embedding produced by a convolutional network given a single image of an unseen face, it generates a variety of new portraits of the same person with different facial expressions, poses and lighting conditions. We also show that conditional PixelCNN can serve as a powerful decoder in an image autoencoder. Additionally, the gated convolutional layers in the proposed model improve the log-likelihood of PixelCNN to match the state-of-the-art performance of PixelRNN on ImageNet, with greatly reduced computational cost.
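
As a rough illustration of the conditioning mechanism summarized above, the paper's gated activation unit multiplies a tanh branch by a sigmoid gate, with each branch shifted by a learned projection of the conditioning vector h: y = tanh(W_f * x + V_f^T h) ⊙ σ(W_g * x + V_g^T h). The sketch below evaluates this at a single pixel in plain Python. It assumes the masked-convolution outputs (`conv_f`, `conv_g`) are already computed; the function names and the list-of-rows weight layout are illustrative choices, not the authors' code.

```python
import math


def sigmoid(x):
    """Logistic function used for the gate branch."""
    return 1.0 / (1.0 + math.exp(-x))


def project(V, h):
    """Return V^T h, where V is a list of len(h) rows with p columns each."""
    p = len(V[0])
    return [sum(V[i][k] * h[i] for i in range(len(h))) for k in range(p)]


def gated_activation(conv_f, conv_g, h, V_f, V_g):
    """Conditional gated unit at one pixel (hypothetical helper).

    conv_f, conv_g: masked-convolution outputs for the tanh and gate
                    branches at this pixel, each of length p (assumed given).
    h:              conditioning vector, e.g. a one-hot class label.
    V_f, V_g:       projection weights, len(h) rows by p columns.
    """
    bias_f = project(V_f, h)  # location-independent shift for the tanh branch
    bias_g = project(V_g, h)  # location-independent shift for the gate branch
    return [
        math.tanh(f + bf) * sigmoid(g + bg)
        for f, g, bf, bg in zip(conv_f, conv_g, bias_f, bias_g)
    ]


# Usage: a two-class one-hot label shifting a single feature channel.
y = gated_activation(
    conv_f=[0.0], conv_g=[0.0],
    h=[1.0, 0.0],
    V_f=[[0.5], [-0.5]], V_g=[[0.0], [0.0]],
)
```

Because the projected term does not depend on pixel position, the same shift is applied at every location, which is how a label or embedding can influence the whole image in this formulation.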



Keep Drawing It: Iterative language-based image generation and editing

Conditional text-to-image generation approaches commonly focus on genera...

Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data

Conditional image generation is effective for diverse tasks including tr...

cFineGAN: Unsupervised multi-conditional fine-grained image generation

We propose an unsupervised multi-conditional image generation pipeline: ...

Diverse Single Image Generation with Controllable Global Structure through Self-Attention

Image generation from a single image using generative adversarial networ...

Hierarchical Text-Conditional Image Generation with CLIP Latents

Contrastive models like CLIP have been shown to learn robust representat...

Conditional Spoken Digit Generation with StyleGAN

This paper adapts a StyleGAN model for speech generation with minimal or...

Diverse Image Generation via Self-Conditioned GANs

We introduce a simple but effective unsupervised method for generating r...

Code Repositories


Kaggle Competitions - TensorFlow Speech Recognition Challenge


A TensorFlow implementation of the gated variant of PixelCNN (Gated PixelCNN) from "Conditional Image Generation with PixelCNN Decoders"


Extension of the Gated PixelCNN architecture to the removal of clouds from aerial imagery
