Testing Relational Understanding in Text-Guided Image Generation

07/29/2022
by Colin Conwell, et al.

Relations are basic building blocks of human cognition. Classic and recent work suggests that many relations are early developing and quickly perceived. Machine models that aspire to human-level perception and reasoning should reflect the ability to recognize and reason generatively about relations. We report a systematic empirical examination of a recent text-guided image generation model (DALL-E 2), using a set of 15 basic physical and social relations studied or proposed in the literature, and judgments from human participants (N = 169). Overall, we find that only ~22% of images matched basic relation prompts. Based on a quantitative examination of people's judgments, we suggest that current image generation models do not yet have a grasp of even basic relations involving simple objects and agents. We examine reasons for model successes and failures, and suggest possible improvements based on computations observed in biological intelligence.


