Simultaneous Multiple-Prompt Guided Generation Using Differentiable Optimal Transport

04/18/2022
by Yingtao Tian, et al.

Recent advances in deep learning, such as powerful generative models and joint text-image embeddings, have given the computational creativity community new tools and opened new perspectives for artistic pursuits. Text-to-image synthesis approaches, which generate images from text cues, are a case in point. In these approaches, an image is generated from a latent vector that is progressively refined to agree with the text cues. To do so, patches are sampled within the generated image and compared with the text prompts in the common text-image embedding space. The latent vector is then updated by gradient descent to reduce the mean distance between these patches and the text cues. While this approach gives artists ample freedom to customize the overall appearance of images through their choice of generative model, its reliance on a simple criterion (the mean of distances) often causes mode collapse: the entire image is drawn toward the average of all text cues, and their diversity is lost. To address this issue, we propose using matching techniques from the optimal transport (OT) literature, which yield images that faithfully reflect a wide diversity of prompts. We provide numerous illustrations showing that OT avoids some of the pitfalls that arise when estimating vectors with mean distances, and we show in experiments that the proposed method performs better both qualitatively and quantitatively.
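
The contrast between the two criteria described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the cosine cost, uniform weights on patches and prompts, the entropy-regularized (Sinkhorn) solver, and all function names and parameter values (`epsilon`, `n_iters`) are choices made here for illustration. The embeddings are random placeholders standing in for the output of a joint text-image encoder.

```python
import numpy as np

def cosine_cost(patches, prompts):
    """Pairwise cosine distance between patch and prompt embeddings."""
    p = patches / np.linalg.norm(patches, axis=1, keepdims=True)
    t = prompts / np.linalg.norm(prompts, axis=1, keepdims=True)
    return 1.0 - p @ t.T  # shape (n_patches, n_prompts)

def mean_distance_loss(cost):
    """Baseline criterion: every patch is pulled toward every prompt equally."""
    return float(cost.mean())

def sinkhorn_plan(cost, epsilon=0.1, n_iters=200):
    """Entropy-regularized OT plan with uniform marginals (assumed here)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # mass on each patch
    b = np.full(m, 1.0 / m)  # mass on each prompt
    K = np.exp(-cost / epsilon)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)      # rescale rows toward the patch marginal
        v = b / (K.T @ u)    # rescale columns to the prompt marginal
    return u[:, None] * K * v[None, :]

def ot_loss(cost, epsilon=0.1, n_iters=200):
    """Transport cost under the Sinkhorn plan: patches are matched to prompts."""
    plan = sinkhorn_plan(cost, epsilon, n_iters)
    return float((plan * cost).sum())

# Placeholder embeddings: 8 image patches and 3 text prompts in a 16-d space.
rng = np.random.default_rng(0)
patches = rng.normal(size=(8, 16))
prompts = rng.normal(size=(3, 16))
cost = cosine_cost(patches, prompts)
print("mean-distance loss:", mean_distance_loss(cost))
print("OT loss:", ot_loss(cost))
```

Because the transport plan assigns each prompt a fixed share of the total mass, every prompt must be matched by some patches, whereas the mean criterion can be lowered by moving all patches toward a single average direction. In a setting like the one the abstract describes, the same iterations would be written in an automatic-differentiation framework so that the loss can be backpropagated to the latent vector; the NumPy version here only evaluates the two losses.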

research
11/16/2018

Entropy-regularized Optimal Transport Generative Models

We investigate the use of entropy-regularized optimal transport (EOT) co...
research
02/10/2021

On the Existence of Optimal Transport Gradient for Learning Generative Models

The use of optimal transport cost for learning generative models has bec...
research
12/01/2020

Refining Deep Generative Models via Wasserstein Gradient Flows

Deep generative modeling has seen impressive advances in recent years, t...
research
09/09/2023

Comparing Morse Complexes Using Optimal Transport: An Experimental Study

Morse complexes and Morse-Smale complexes are topological descriptors po...
research
10/13/2020

Random Network Distillation as a Diversity Metric for Both Image and Text Generation

Generative models are increasingly able to produce remarkably high quali...
research
04/15/2022

Unconditional Image-Text Pair Generation with Multimodal Cross Quantizer

Though deep generative models have gained a lot of attention, most of th...
research
11/17/2022

Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models

Natural language often contains ambiguities that can lead to misinterpre...
