Text-to-image Synthesis via Symmetrical Distillation Networks

08/21/2018
by   Mingkuan Yuan, et al.

Text-to-image synthesis aims to automatically generate images from text descriptions given by users, which is a highly challenging task. The main difficulties of text-to-image synthesis lie in two gaps: a heterogeneous gap and a homogeneous gap. The heterogeneous gap lies between the high-level concepts of text descriptions and the pixel-level content of images, while the homogeneous gap lies between the distributions of synthetic and real images. To address these problems, we exploit the strong capability of generic discriminative models (e.g., VGG19), which can guide the training of a new generative model at multiple levels to bridge both gaps. High-level representations teach the generative model to extract the necessary visual information from text descriptions, bridging the heterogeneous gap. Mid-level and low-level representations lead it to learn the structures and details of images respectively, narrowing the homogeneous gap. We therefore propose Symmetrical Distillation Networks (SDN), composed of a source discriminative model as "teacher" and a target generative model as "student". The target generative model has a structure symmetrical to that of the source discriminative model, so that hierarchical knowledge can be transferred accessibly. Moreover, we decompose the training process into two stages with different distillation paradigms to further improve the performance of the target generative model. Experiments on two widely used datasets verify the effectiveness of the proposed SDN.
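The multi-level guidance described above can be sketched as a weighted sum of feature-matching losses between teacher and student representations at several depths. This is a minimal NumPy illustration of the general idea, not the paper's actual loss: the feature shapes, the L2 (mean-squared) matching term, and the equal level weights are all assumptions for demonstration.

```python
import numpy as np

def distillation_loss(teacher_feats, student_feats, weights):
    """Weighted sum of mean-squared feature-matching losses across levels.

    Each pair (t, s) holds teacher/student feature maps at one depth
    (low-, mid-, or high-level); matching all levels lets the teacher
    guide both image structure/detail and high-level semantics.
    """
    total = 0.0
    for t, s, w in zip(teacher_feats, student_feats, weights):
        total += w * np.mean((t - s) ** 2)
    return total

rng = np.random.default_rng(0)
# Hypothetical (C, H, W) feature maps at three levels of a VGG-like teacher
teacher = [rng.normal(size=(64, 56, 56)),   # low-level: edges, textures
           rng.normal(size=(256, 14, 14)),  # mid-level: structures
           rng.normal(size=(512, 7, 7))]    # high-level: concepts
# A student whose features are close, but not equal, to the teacher's
student = [t + 0.1 * rng.normal(size=t.shape) for t in teacher]

loss = distillation_loss(teacher, student, weights=[1.0, 1.0, 1.0])
```

A perfect student (identical features at every level) drives this loss to zero; in practice such a term would be minimized alongside the generative model's other objectives.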


