NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation

04/22/2023
by Baao Xie, et al.

3D representation disentanglement aims to identify, decompose, and manipulate the underlying explanatory factors of 3D data, which helps AI fundamentally understand our 3D world. This task is currently under-explored and poses great challenges: (i) 3D representations are complex and in general contain much more information than 2D images; (ii) many 3D representations are not well suited for gradient-based optimization, let alone disentanglement. To address these challenges, we use NeRF as a differentiable 3D representation and introduce a self-supervised Navigation to identify interpretable semantic directions in the latent space. To the best of our knowledge, this novel method, dubbed NaviNeRF, is the first work to achieve fine-grained 3D disentanglement without any priors or supervision. Specifically, NaviNeRF is built upon the generative NeRF pipeline and equipped with an Outer Navigation Branch and an Inner Refinement Branch. The two branches are complementary: the outer navigation identifies global-view semantic directions, and the inner refinement is dedicated to fine-grained attributes. A synergistic loss is further devised to coordinate the two branches. Extensive experiments demonstrate that NaviNeRF has a stronger fine-grained 3D disentanglement ability than previous 3D-aware models. Its performance is also comparable to that of editing-oriented models relying on semantic or geometry priors.
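
To make the idea of navigating a latent space along semantic directions more concrete, here is a minimal toy sketch in plain Python. It is not the NaviNeRF implementation: the abstract does not specify how directions are parameterized or learned, so the random unit directions, the function names (`unit_directions`, `shift_latent`, `recover_shift`), and the projection-based recovery step are all assumptions made for illustration. In an actual pipeline the shift would presumably be recovered from rendered outputs of the generator rather than from the latent codes themselves.

```python
import math
import random


def unit_directions(num_directions, latent_dim, seed=0):
    """Return random unit vectors (hypothetical stand-ins for learned directions)."""
    rng = random.Random(seed)
    directions = []
    for _ in range(num_directions):
        v = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        norm = math.sqrt(sum(x * x for x in v))
        directions.append([x / norm for x in v])
    return directions


def shift_latent(z, directions, k, eps):
    """Move latent code z by magnitude eps along the k-th direction."""
    return [zi + eps * di for zi, di in zip(z, directions[k])]


def recover_shift(z, z_shifted, directions):
    """Toy self-supervision target: guess which direction was applied, and by how much.

    Assumption: we compare latent codes directly and pick the direction with the
    largest absolute projection of the difference. This only illustrates the
    (index, magnitude) prediction task, not a learned predictor.
    """
    delta = [b - a for a, b in zip(z, z_shifted)]
    projections = [
        sum(d * x for d, x in zip(direction, delta)) for direction in directions
    ]
    k_hat = max(range(len(directions)), key=lambda i: abs(projections[i]))
    return k_hat, projections[k_hat]


if __name__ == "__main__":
    directions = unit_directions(num_directions=4, latent_dim=16)
    z = [0.0] * 16
    z_shifted = shift_latent(z, directions, k=2, eps=1.5)
    print(recover_shift(z, z_shifted, directions))
```

The point of the sketch is the training signal: because the applied direction index and shift magnitude are known when the shifted code is created, they can serve as labels without any external annotation.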


