Visual Conceptual Blending with Large-scale Language and Vision Models

06/27/2021
by Songwei Ge, et al.

We ask the question: to what extent can recent large-scale language and image generation models blend visual concepts? Given an arbitrary object, we identify a relevant object and generate a single-sentence description of the blend of the two using a language model. We then generate a visual depiction of the blend using a text-based image generation model. Quantitative and qualitative evaluations demonstrate the superiority of language models over classical methods for conceptual blending, and of recent large-scale image generation models over prior models for the visual depiction.
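
As a rough illustration of the two-stage pipeline described in the abstract, the sketch below pairs an off-the-shelf language model with a text-to-image model. The specific models (GPT-2 via transformers, Stable Diffusion via diffusers), the prompt wording, and the snail/harp object pair are illustrative assumptions, not the paper's actual setup; the step that identifies a relevant second object is taken as given here.

```python
# Minimal sketch of the two-stage blend pipeline: a language model writes
# a one-sentence blend description, then a text-to-image model depicts it.
# Model choices below are illustrative stand-ins, not the paper's models.
import torch
from transformers import pipeline
from diffusers import StableDiffusionPipeline

def describe_blend(obj_a: str, obj_b: str) -> str:
    """Stage 1: produce a one-sentence description of the blend of two objects."""
    generator = pipeline("text-generation", model="gpt2")
    prompt = f"A {obj_a} blended with a {obj_b} is"
    out = generator(prompt, max_new_tokens=30, num_return_sequences=1)
    # Keep only the first sentence of the generated continuation.
    text = out[0]["generated_text"]
    return text.split(".")[0] + "."

def depict_blend(description: str):
    """Stage 2: render the blend description with a text-to-image model."""
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    return pipe(description).images[0]

if __name__ == "__main__":
    sentence = describe_blend("snail", "harp")  # hypothetical object pair
    image = depict_blend(sentence)
    image.save("blend.png")
```

In practice, the language-model stage would be prompted or filtered more carefully so that the sentence actually names visual properties of both objects, since the image model renders only what the sentence states.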

Related research

07/14/2023 · GenAssist: Making Image Generation Accessible
Blind and low vision (BLV) creators use images to communicate with sight...

11/07/2022 · Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale
Machine learning models are now able to convert user-written text descri...

08/04/2022 · Adversarial Attacks on Image Generation With Made-Up Words
Text-guided image generation models can be prompted to generate images u...

10/16/2022 · Large-scale Text-to-Image Generation Models for Visual Artists' Creative Works
Large-scale Text-to-image Generation Models (LTGMs) (e.g., DALL-E), self...

11/28/2022 · Hand-Object Interaction Image Generation
In this work, we are dedicated to a new task, i.e., hand-object interact...

06/02/2021 · Metaphor Generation with Conceptual Mappings
Generating metaphors is a difficult task as it requires understanding nu...

05/24/2023 · Transferring Visual Attributes from Natural Language to Verified Image Generation
Text to image generation methods (T2I) are widely popular in generating ...