CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields

12/09/2021
by Can Wang, et al.

We present CLIP-NeRF, a multi-modal 3D object manipulation method for neural radiance fields (NeRF). By leveraging the joint language-image embedding space of the recent Contrastive Language-Image Pre-Training (CLIP) model, we propose a unified framework that allows a user to manipulate NeRF with either a short text prompt or an exemplar image. Specifically, to combine the novel view synthesis capability of NeRF with the controllable manipulation ability of latent representations from generative models, we introduce a disentangled conditional NeRF architecture that allows individual control over both shape and appearance. We achieve this by conditioning shape through a learned deformation field applied to the positional encoding and by deferring color conditioning to the volumetric rendering stage. To bridge this disentangled latent representation to the CLIP embedding, we design two code mappers that take a CLIP embedding as input and update the latent codes to reflect the targeted edit. The mappers are trained with a CLIP-based matching loss to ensure manipulation accuracy. To enable editing of real images, we further propose an inverse optimization method that accurately projects an input image onto the latent codes. We evaluate our approach through extensive experiments on a variety of text prompts and exemplar images, and we also provide an intuitive interface for interactive editing. Our implementation is available at https://cassiepython.github.io/clipnerf/
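
The two mechanisms named in the abstract, shape conditioning through a deformation applied to the positional encoding and a CLIP-based matching loss, can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the additive tanh-bounded offset, the single linear layer standing in for the learned deformation network (`weights`), and the use of cosine distance as the matching loss are all assumptions made for illustration. The function names are hypothetical, and plain Python lists stand in for real tensors and real CLIP embeddings.

```python
import math

def positional_encoding(p, num_freqs=4):
    """NeRF-style encoding: sin/cos of each coordinate at frequencies 2^k * pi."""
    out = []
    for k in range(num_freqs):
        for x in p:
            out.append(math.sin((2 ** k) * math.pi * x))
            out.append(math.cos((2 ** k) * math.pi * x))
    return out

def deformed_encoding(p, shape_code, weights, num_freqs=4):
    """Shape conditioning (assumed form): add a bounded, code-dependent
    offset to the positional encoding.

    `weights` is a hypothetical stand-in for a learned deformation network:
    one row per encoding dimension, one column per entry of p + shape_code.
    """
    enc = positional_encoding(p, num_freqs)
    inputs = list(p) + list(shape_code)
    # tanh keeps every offset in (-1, 1), so the shape code perturbs the
    # encoding without overwhelming it.
    offsets = [math.tanh(sum(w * v for w, v in zip(row, inputs)))
               for row in weights]
    return [e + o for e, o in zip(enc, offsets)]

def clip_matching_loss(edited_embedding, target_embedding):
    """Matching loss (assumed form): cosine distance between two embeddings.

    In a real pipeline the inputs would be the CLIP embedding of a rendered
    view and the CLIP embedding of the text prompt or exemplar image.
    """
    dot = sum(a * b for a, b in zip(edited_embedding, target_embedding))
    norm_a = math.sqrt(sum(a * a for a in edited_embedding))
    norm_b = math.sqrt(sum(b * b for b in target_embedding))
    return 1.0 - dot / (norm_a * norm_b)

if __name__ == "__main__":
    point = (0.1, -0.2, 0.3)          # a 3D sample position
    shape_code = (0.5, -0.5)          # toy 2-dimensional shape latent
    dim = 2 * len(point) * 4          # encoding size for num_freqs=4
    toy_weights = [[0.1] * 5 for _ in range(dim)]
    print(len(deformed_encoding(point, shape_code, toy_weights)))
    print(clip_matching_loss([1.0, 0.0], [0.0, 1.0]))
```

With all-zero `weights` the deformed encoding reduces to the plain positional encoding, which matches the intuition that a shape code should only deform, not replace, the underlying coordinate representation. Appearance conditioning is not sketched here because, per the abstract, it is deferred to the volumetric rendering stage.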

research
12/09/2021

HairCLIP: Design Your Hair by Text and Reference Image

Hair editing is an interesting and challenging problem in computer visio...
research
11/26/2021

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

To achieve disentangled image manipulation, previous works depend heavil...
research
04/28/2022

AE-NeRF: Auto-Encoding Neural Radiance Fields for 3D-Aware Object Manipulation

We propose a novel framework for 3D-aware object manipulation, called Au...
research
08/21/2023

Patternshop: Editing Point Patterns by Image Manipulation

Point patterns are characterized by their density and correlation. While...
research
07/06/2022

Towards Counterfactual Image Manipulation via CLIP

Leveraging StyleGAN's expressivity and its disentangled latent codes, ex...
research
07/24/2022

Cross-Modal 3D Shape Generation and Manipulation

Creating and editing the shape and color of 3D objects require tremendou...