Factored Neural Representation for Scene Understanding

04/21/2023
by Yu-Shiang Wong, et al.

A long-standing goal in scene understanding is to obtain interpretable and editable representations that can be constructed directly from a raw monocular RGB-D video, without requiring specialized hardware or priors. The problem is significantly more challenging in the presence of multiple moving and/or deforming objects. Traditional methods have approached this setup with a mix of simplifications, scene priors, pretrained templates, or known deformation models. The advent of neural representations, especially neural implicit representations and radiance fields, opens the possibility of end-to-end optimization to collectively capture geometry, appearance, and object motion. However, current approaches produce a global scene encoding, assume multiview capture with limited or no motion in the scene, and do not facilitate easy manipulation beyond novel-view synthesis. In this work, we introduce a factored neural scene representation that can be learned directly from a monocular RGB-D video to produce object-level neural representations with an explicit encoding of object movement (e.g., rigid trajectory) and/or deformations (e.g., nonrigid motion). We evaluate our representation against a set of neural approaches on both synthetic and real data to demonstrate that it is efficient, interpretable, and editable (e.g., changing an object's trajectory). The project webpage is available at: https://yushiangw.github.io/factorednerf/
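To make the idea of a factored representation concrete, the following is a minimal PyTorch sketch, not the authors' implementation: each object gets its own small neural field together with an explicit per-frame rigid pose, and the scene composites the per-object fields at each sample point. All names (`ObjectField`, `FactoredScene`, `world_to_object`) and architectural choices here are hypothetical assumptions; the actual method also handles nonrigid deformations, a background field, and full volume rendering, which are omitted for brevity.

```python
# Hypothetical sketch of a factored neural scene representation;
# names and architecture are illustrative, not the paper's code.
import torch
import torch.nn as nn


class ObjectField(nn.Module):
    """One object: a small implicit field plus an explicit, editable
    per-frame rigid trajectory (axis-angle rotation + translation)."""

    def __init__(self, num_frames: int, hidden: int = 64):
        super().__init__()
        # Implicit field: object-space 3D point -> (density, RGB).
        self.field = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),
        )
        # Explicit motion code: one 6-DoF pose per frame. Editing an
        # object's trajectory means overwriting rows of this tensor.
        self.poses = nn.Parameter(torch.zeros(num_frames, 6))

    def world_to_object(self, x: torch.Tensor, t: int) -> torch.Tensor:
        """Map world-space points x of shape (N, 3) into the object frame
        at time t by inverting the pose: x_obj = R^T (x - translation)."""
        w, trans = self.poses[t, :3], self.poses[t, 3:]
        theta = w.norm().clamp(min=1e-8)
        k = w / theta
        x = x - trans
        cos, sin = torch.cos(theta), torch.sin(theta)
        # Rodrigues' formula for the transposed (inverse) rotation.
        return (cos * x
                - sin * torch.cross(k.expand_as(x), x, dim=-1)
                + (1.0 - cos) * (x @ k).unsqueeze(-1) * k)

    def forward(self, x_world: torch.Tensor, t: int):
        out = self.field(self.world_to_object(x_world, t))
        sigma = torch.relu(out[..., :1])      # non-negative density
        rgb = torch.sigmoid(out[..., 1:])     # color in [0, 1]
        return sigma, rgb


class FactoredScene(nn.Module):
    """Scene = a set of per-object fields; at each sample point the
    object densities add and colors blend density-weighted, after which
    standard volume rendering (omitted here) integrates along rays."""

    def __init__(self, num_objects: int, num_frames: int):
        super().__init__()
        self.objects = nn.ModuleList(
            ObjectField(num_frames) for _ in range(num_objects))

    def forward(self, x_world: torch.Tensor, t: int):
        sigmas, rgbs = zip(*(obj(x_world, t) for obj in self.objects))
        sigma = torch.stack(sigmas).sum(dim=0)
        rgb = ((torch.stack(rgbs) * torch.stack(sigmas)).sum(dim=0)
               / sigma.clamp(min=1e-8))
        return sigma, rgb
```

Because the trajectory lives in an explicit `poses` tensor rather than inside the network weights, an edit such as changing an object's path reduces to overwriting a few rows of that tensor; a nonrigid object could be handled analogously by inserting a learned deformation field that warps points before the field query.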


