Semantically Invariant Text-to-Image Generation

09/27/2018
by Shagan Sah, et al.

Image captioning work has produced models that generate plausible text from input images or videos. Recent work in image generation has also shown significant improvements in image quality when text is used as a prior. Our work ties these concepts together in an architecture that enables bidirectional generation of images and text. We call this network Multi-Modal Vector Representation (MMVR). Along with MMVR, we propose two improvements to text-conditioned image generation. First, we introduce a cost function based on an n-gram metric that generalizes the caption with respect to the image. Second, we show that conditioning on multiple semantically similar sentences helps generate better images. Qualitative and quantitative evaluations demonstrate that MMVR improves upon existing text-conditioned image generation results by over 20
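
The abstract does not say which n-gram metric the cost function uses. As a rough illustration of the general idea only, the sketch below computes a BLEU-style clipped n-gram precision between a candidate caption and one or more reference captions, then turns it into a cost. The function names (`ngrams`, `modified_ngram_precision`, `ngram_cost`), the whitespace tokenization, and the `1 - precision` form are all assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_ngram_precision(candidate, references, n=2):
    """BLEU-style clipped n-gram precision of a candidate against references.

    Each candidate n-gram is credited at most as many times as it appears
    in the single reference that contains it most often.
    """
    cand_counts = ngrams(candidate.lower().split(), n)
    if not cand_counts:
        return 0.0  # candidate is shorter than n tokens

    # For each n-gram, the maximum count observed in any one reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.lower().split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)

    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def ngram_cost(candidate, references, n=2):
    """Hypothetical cost: 0 when every candidate n-gram is supported, 1 when none is."""
    return 1.0 - modified_ngram_precision(candidate, references, n)
```

Scoring against several semantically similar references lowers the cost for captions that paraphrase any one of them, which is one simple way to see why multiple reference sentences can act as a softer training signal than a single caption.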


research
03/28/2023

Variational Distribution Learning for Unsupervised Text-to-Image Generation

We propose a text-to-image generation algorithm based on deep neural net...
research
02/20/2023

Affect-Conditioned Image Generation

In creativity support and computational co-creativity contexts, the task...
research
07/29/2022

Testing Relational Understanding in Text-Guided Image Generation

Relations are basic building blocks of human cognition. Classic and rece...
research
09/07/2023

T2IW: Joint Text to Image Watermark Generation

Recent developments in text-conditioned image generative models have rev...
research
12/01/2022

Weakly Supervised Annotations for Multi-modal Greeting Cards Dataset

In recent years, there is a growing number of pre-trained models trained...
research
10/19/2022

OCR-VQGAN: Taming Text-within-Image Generation

Synthetic image generation has recently experienced significant improvem...
research
10/16/2022

Large-scale Text-to-Image Generation Models for Visual Artists' Creative Works

Large-scale Text-to-image Generation Models (LTGMs) (e.g., DALL-E), self...
