LaTeRF: Label and Text Driven Object Radiance Fields

07/04/2022
by Ashkan Mirzaei, et al.

Obtaining 3D object representations is important for creating photo-realistic simulators and collecting assets for AR/VR applications. Neural fields have proven effective at learning a continuous volumetric representation of a scene from 2D images, but acquiring object representations from these models with weak supervision remains an open challenge. In this paper, we introduce LaTeRF, a method for extracting an object of interest from a scene given 2D images of the entire scene with known camera poses, a natural language description of the object, and a small number of point labels marking object and non-object points in the input images. To faithfully extract the object from the scene, LaTeRF extends the NeRF formulation with an additional 'objectness' probability at each 3D point. Additionally, we leverage the rich latent space of a pre-trained CLIP model, combined with our differentiable object renderer, to inpaint the occluded parts of the object. We demonstrate high-fidelity object extraction on both synthetic and real datasets and justify our design choices through an extensive ablation study.
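
As a rough illustration of what a per-point objectness probability could look like inside NeRF-style volume rendering, here is a minimal NumPy sketch. It is not taken from the paper: the function names, the choice to composite objectness with the same weights as color, and the choice to scale density by objectness when rendering only the object are all assumptions made for this example.

```python
import numpy as np

def render_ray(sigma, rgb, objectness, deltas):
    """Composite samples along one ray with standard NeRF-style weights.

    sigma:      (N,) non-negative densities at the samples
    rgb:        (N, 3) colors at the samples
    objectness: (N,) per-point object probabilities in [0, 1] (assumed extra output)
    deltas:     (N,) distances between consecutive samples
    """
    alpha = 1.0 - np.exp(-sigma * deltas)  # opacity of each sample
    # Transmittance: probability that the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))
    weights = trans * alpha
    color = weights @ rgb
    # Assumption: objectness is composited like a color channel, giving a
    # per-pixel value that sparse point labels could supervise.
    obj_prob = float(weights @ objectness)
    return color, obj_prob

def render_object_only(sigma, rgb, objectness, deltas):
    # Assumption: non-object points are suppressed by scaling their density,
    # so anything occluding the object becomes transparent.
    return render_ray(sigma * objectness, rgb, objectness, deltas)
```

In this sketch, a ray that hits an opaque non-object point in front of the object returns the occluder's color from `render_ray`, while `render_object_only` sees through it to the object behind. A text-driven term, such as the CLIP-based inpainting mentioned in the abstract, would operate on images produced by a renderer of this kind; that part is omitted here.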


