CLIP2StyleGAN: Unsupervised Extraction of StyleGAN Edit Directions

12/09/2021
by Rameen Abdal, et al.

The success of StyleGAN has enabled unprecedented semantic editing capabilities, on both synthesized and real images. However, such editing operations are either trained with semantic supervision or described using human guidance. Separately, the CLIP architecture has been trained with internet-scale image and text pairings and has been shown to be useful in several zero-shot learning settings. In this work, we investigate how to effectively link the pretrained latent spaces of StyleGAN and CLIP, which in turn allows us to automatically extract semantically labeled edit directions from StyleGAN, finding and naming meaningful edit operations without any additional human guidance. Technically, we propose two novel building blocks: one for finding interesting CLIP directions and one for labeling arbitrary directions in CLIP latent space. The setup does not assume any pre-determined labels, and hence we do not require any additional supervised text or attribute data to build the editing framework. We evaluate the effectiveness of the proposed method and demonstrate that extracting disentangled, labeled StyleGAN edit directions is indeed possible, revealing interesting and non-trivial edit operations.
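As a rough illustration of the two building blocks, one plausible reading is: (1) run a PCA-style decomposition over CLIP image embeddings of random StyleGAN samples to surface dominant semantic axes, and (2) label a direction by comparing it against text embeddings. The sketch below mocks all embeddings with random numpy vectors (no pretrained StyleGAN or CLIP model is loaded), and the fixed caption bank used for labeling is a deliberate simplification of the paper's label-free setup; function names and shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def principal_clip_directions(embeddings, k=5):
    """Top-k principal axes of a set of (mocked) CLIP image embeddings,
    used here as a proxy for salient semantic edit directions.
    embeddings: (n, d) array."""
    # Center the embeddings, then take the right singular vectors:
    # these are the unit-norm principal directions of variation.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]

def label_direction(direction, text_embeddings, captions):
    """Name a CLIP-space direction by cosine similarity against a
    (hypothetical) bank of caption embeddings -- a simplification, since
    the paper does not assume a pre-determined label set."""
    direction = direction / np.linalg.norm(direction)
    norms = np.linalg.norm(text_embeddings, axis=1)
    sims = text_embeddings @ direction / norms
    return captions[int(np.argmax(sims))]

# Mock stand-ins for CLIP embeddings of random StyleGAN samples.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=(256, 64))   # 256 samples, 64-dim embeddings
dirs = principal_clip_directions(image_emb, k=3)

# Mock caption bank and its (random) text embeddings.
captions = ["smile", "age", "lighting"]
text_emb = rng.normal(size=(3, 64))
label = label_direction(dirs[0], text_emb, captions)
print(dirs.shape, label)
```

In this toy setting the returned label is arbitrary, since the embeddings are random; with real CLIP embeddings the top principal directions would correspond to the most visually dominant attributes in the sampled images.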


