Judge, Localize, and Edit: Ensuring Visual Commonsense Morality for Text-to-Image Generation

12/07/2022
by Seongbeom Park, et al.
Text-to-image generation methods produce high-resolution, high-quality images, but they should not produce immoral images, that is, images containing content that is inappropriate from a commonsense-morality perspective. Conventional approaches often neglect these ethical concerns, and existing solutions are limited in their ability to avoid immoral image generation. In this paper, we aim to automatically judge the immorality of synthesized images and manipulate them into moral alternatives. To this end, we build a model with three main primitives: (1) it recognizes the visual commonsense immorality of a given image, (2) it localizes or highlights the immoral visual (and textual) attributes that make the image immoral, and (3) it manipulates a given immoral image into a morally qualifying alternative. We experiment with the state-of-the-art Stable Diffusion text-to-image generation model and show the effectiveness of our ethical image manipulation. Our human study confirms that our method is indeed able to generate morally satisfying images from immoral ones. Our implementation will be made publicly available upon publication so that it can be widely used as a new safety checker for text-to-image generation models.
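
The three primitives in the abstract (judge, localize, edit) can be read as a simple control flow. The sketch below is only an illustration of that flow, not the authors' implementation: the names `judge_localize_edit`, `Outcome`, `toy_judge`, `toy_localize`, and `toy_edit`, the 0.5 threshold, and the nested-list image format are all assumptions made for this example. The toy components stand in for the learned immorality classifier, the attribute localizer, and the Stable Diffusion based editor, none of which are specified in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Image = List[List[float]]   # toy stand-in for a generated image (hypothetical format)
Mask = List[List[bool]]     # True marks a pixel attributed to the immoral content


@dataclass
class Outcome:
    score: float            # immorality score returned by the judge
    flagged: bool           # True if the score reached the threshold
    mask: Optional[Mask]    # localization mask, or None when not flagged
    image: Image            # the original image, or the edited alternative


def judge_localize_edit(
    image: Image,
    judge: Callable[[Image], float],
    localize: Callable[[Image], Mask],
    edit: Callable[[Image, Mask], Image],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Outcome:
    """Run the three primitives in order; skip steps 2 and 3 for images judged moral."""
    score = judge(image)                      # (1) judge
    if score < threshold:
        return Outcome(score, False, None, image)
    mask = localize(image)                    # (2) localize
    edited = edit(image, mask)                # (3) edit
    return Outcome(score, True, mask, edited)


# Toy components so the sketch runs end to end; real ones would be learned models.
def toy_judge(image: Image) -> float:
    # Treat the brightest pixel as the "immorality" score.
    return max(max(row) for row in image)


def toy_localize(image: Image) -> Mask:
    # Mark every pixel above 0.8 as responsible for the score.
    return [[pixel > 0.8 for pixel in row] for row in image]


def toy_edit(image: Image, mask: Mask) -> Image:
    # Replace the marked pixels with 0.0 and leave the rest untouched.
    return [
        [0.0 if marked else pixel for pixel, marked in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


if __name__ == "__main__":
    result = judge_localize_edit(
        [[0.1, 0.9], [0.3, 0.4]], toy_judge, toy_localize, toy_edit
    )
    print(result.flagged, result.mask, result.image)
```

Passing the three components as callables keeps the control flow independent of any particular model, which mirrors the abstract's framing of the method as a safety checker that can sit behind a text-to-image generator.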

research
05/09/2023

SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models

Diffusion models, which have emerged to become popular text-to-image gen...
research
03/10/2023

New Benchmarks for Accountable Text-based Visual Re-creation

Given a command, humans can directly execute the action after thinking o...
research
08/04/2022

Adversarial Attacks on Image Generation With Made-Up Words

Text-guided image generation models can be prompted to generate images u...
research
03/22/2023

VecFontSDF: Learning to Reconstruct and Synthesize High-quality Vector Fonts via Signed Distance Functions

Font design is of vital importance in the digital content design and mod...
research
01/01/2021

Biologically Inspired Hexagonal Deep Learning for Hexagonal Image Generation

Whereas conventional state-of-the-art image processing systems of record...
research
05/24/2023

Transferring Visual Attributes from Natural Language to Verified Image Generation

Text to image generation methods (T2I) are widely popular in generating ...
research
06/15/2023

Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment

Text-conditioned image generation models often generate incorrect associ...
