Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis

09/07/2023
by Jiapeng Zhu, et al.

Because they are difficult to scale up, generative adversarial networks (GANs) seem to be falling out of favor for text-conditioned image synthesis. Sparsely-activated mixture-of-experts (MoE) has recently been shown to be a viable way to train large-scale models with limited computational resources. Inspired by this idea, we present Aurora, a GAN-based text-to-image generator that employs a collection of experts to learn feature processing, together with a sparse router that selects the most suitable expert for each feature point. To faithfully decode the sampling stochasticity and the text condition into the final synthesis, the router makes its decision adaptively, taking into account the text-integrated global latent code. At 64x64 image resolution, our model trained on LAION2B-en and COYO-700M achieves a zero-shot FID of 6.2 on MS COCO. We release the code and checkpoints to facilitate further development by the community.
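
The routing idea in the abstract can be illustrated with a minimal sketch in plain Python. This is not the Aurora implementation: the linear experts, the top-1 selection, the additive combination of feature and latent logits, and every name below (`sparse_moe`, `router_feat_w`, `router_latent_w`, `experts`) are assumptions made for illustration only. It shows a router that scores experts for each feature point, shifts those scores using a global latent code, and runs only the selected expert.

```python
import math

def matvec(vec, mat):
    """Multiply a row vector (length C) by a matrix given as C rows of length K."""
    return [sum(v * row[k] for v, row in zip(vec, mat)) for k in range(len(mat[0]))]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_moe(features, latent, router_feat_w, router_latent_w, experts):
    """Route every feature point to one expert (top-1) and apply it.

    features:        N feature points, each a list of C floats
    latent:          global latent code of length D (assumed text-integrated)
    router_feat_w:   C x E matrix, router weights on the feature point
    router_latent_w: D x E matrix, router weights on the latent code
    experts:         E matrices of shape C x C (hypothetical linear experts)
    """
    # The latent contribution is the same for all feature points.
    latent_logits = matvec(latent, router_latent_w)
    outputs, choices = [], []
    for point in features:
        feat_logits = matvec(point, router_feat_w)
        probs = softmax([a + b for a, b in zip(feat_logits, latent_logits)])
        best = max(range(len(probs)), key=probs.__getitem__)
        # Only the selected expert runs; its output is scaled by the gate value.
        expert_out = matvec(point, experts[best])
        outputs.append([probs[best] * x for x in expert_out])
        choices.append(best)
    return outputs, choices

# Two feature points, two experts (identity and doubling), a 1-D latent code.
features = [[1.0, 0.0], [0.0, 1.0]]
router_feat_w = [[5.0, 0.0], [0.0, 5.0]]
experts = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]

# With a zero latent, each point is routed by its own features.
_, choices_neutral = sparse_moe(features, [0.0], router_feat_w, [[0.0, 0.0]], experts)
# A latent that strongly favors expert 1 changes the decision for point 0.
_, choices_shifted = sparse_moe(features, [1.0], router_feat_w, [[0.0, 10.0]], experts)
```

Because only one expert runs per feature point, the compute per point stays roughly constant as experts are added, which is the property that makes sparse MoE attractive for scaling.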


research
07/06/2022

Text to Image Synthesis using Stacked Conditional Variational Autoencoders and Conditional Generative Adversarial Networks

Synthesizing a realistic image from textual description is a major chall...
research
01/23/2023

StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

Text-to-image synthesis has recently seen significant progress thanks to...
research
09/02/2021

FA-GAN: Feature-Aware GAN for Text to Image Synthesis

Text-to-image synthesis aims to generate a photo-realistic image from a ...
research
05/16/2019

On Conditioning GANs to Hierarchical Ontologies

The recent success of Generative Adversarial Networks (GAN) is a result ...
research
12/09/2021

Multimodal Conditional Image Synthesis with Product-of-Experts GANs

Existing conditional image synthesis frameworks generate images based on...
research
03/04/2021

Anycost GANs for Interactive Image Synthesis and Editing

Generative adversarial networks (GANs) have enabled photorealistic image...
research
04/22/2022

Recurrent Affine Transformation for Text-to-image Synthesis

Text-to-image synthesis aims to generate natural images conditioned on t...
