ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation

04/09/2022
by Jianan Wang, et al.

Existing text-guided image manipulation methods aim to modify the overall appearance of an image or to edit a few objects in virtual or simple scenes, which is far from practical application. In this work, we study a novel task: entity-level text-guided image manipulation in the real world. The task imposes three basic requirements: (1) edit the entity consistently with the text description, (2) preserve the text-irrelevant regions, and (3) merge the manipulated entity into the image naturally. To this end, we propose ManiTrans, a new transformer-based framework built on the two-stage image synthesis method, which can not only edit the appearance of entities but also generate new entities corresponding to the text guidance. Our framework incorporates a semantic alignment module to locate the image regions to be manipulated, and a semantic loss to help align the relationship between vision and language. We conduct extensive experiments on the real-world datasets CUB, Oxford, and COCO to verify that our method can distinguish relevant from irrelevant regions and achieve more precise and flexible manipulation than baseline methods. The project homepage is <https://jawang19.github.io/manitrans>.
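As a rough illustration of requirements (1) and (2), the sketch below shows one way token-wise editing could work in principle: score each image token against the text, mark the relevant ones, and replace only those. This is not the ManiTrans implementation. The abstract does not describe how the semantic alignment module computes relevance, so the cosine-similarity scoring, the threshold value, and every function name here (`cosine`, `relevance_mask`, `merge_tokens`) are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relevance_mask(token_embeddings, text_embedding, threshold=0.5):
    """Hypothetical alignment step: a token is 'text-relevant' when its
    embedding is similar enough to the text embedding (assumed criterion)."""
    return [cosine(t, text_embedding) >= threshold for t in token_embeddings]


def merge_tokens(original_tokens, generated_tokens, mask):
    """Keep the original token where the mask is False (text-irrelevant region),
    take the newly generated token where it is True."""
    return [g if m else o for o, g, m in zip(original_tokens, generated_tokens, mask)]


# Toy data: four image tokens with 2-D embeddings and discrete codebook ids.
token_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
text_embedding = [1.0, 0.0]
original_tokens = [11, 12, 13, 14]
generated_tokens = [21, 22, 23, 24]

mask = relevance_mask(token_embeddings, text_embedding)
edited = merge_tokens(original_tokens, generated_tokens, mask)
print(mask)
print(edited)
```

In this toy setup only the tokens whose embeddings point in roughly the same direction as the text embedding are replaced, and the remaining tokens pass through unchanged, which is the behavior requirement (2) asks for.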


