Geometry-biased Transformers for Novel View Synthesis

01/11/2023
by Naveen Venkat, et al.

We tackle the task of synthesizing novel views of an object given a few input images and associated camera viewpoints. Our work is inspired by recent 'geometry-free' approaches where multi-view images are encoded as a (global) set-latent representation, which is then used to predict the color for arbitrary query rays. While this representation yields (coarsely) accurate images corresponding to novel viewpoints, the lack of geometric reasoning limits the quality of these outputs. To overcome this limitation, we propose 'Geometry-biased Transformers' (GBTs) that incorporate geometric inductive biases in the set-latent representation-based inference to encourage multi-view geometric consistency. We induce the geometric bias by augmenting the dot-product attention mechanism to also incorporate 3D distances between rays associated with tokens as a learnable bias. We find that this, along with camera-aware embeddings as input, allows our models to generate significantly more accurate outputs. We validate our approach on the real-world CO3D dataset, where we train our system over 10 categories and evaluate its view-synthesis ability for novel objects as well as unseen categories. We empirically validate the benefits of the proposed geometric biases and show that our approach significantly improves over prior works.
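The geometric bias described above can be illustrated with a small sketch. This is not the authors' implementation. It assumes each token carries a ray (origin, direction), treats rays as infinite 3D lines when measuring distance, and subtracts the squared distance, scaled by a hypothetical learnable scalar `gamma`, from the usual scaled dot-product logits. The function names and the exact form of the bias are illustrative assumptions.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def line_distance(o1, d1, o2, d2, eps=1e-9):
    """Shortest distance between two 3D lines given as (origin, direction).

    Assumption: rays are treated as infinite lines.
    """
    w = [q - p for p, q in zip(o1, o2)]
    n = cross(d1, d2)
    n_len = norm(n)
    if n_len < eps:
        # Parallel lines: distance from o2 to the line through o1.
        return norm(cross(w, d1)) / norm(d1)
    return abs(dot(w, n)) / n_len

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def geometry_biased_attention(queries, keys, values, q_rays, k_rays, gamma):
    """Single-head attention with a ray-distance penalty on the logits.

    Assumed logit: (q . k) / sqrt(d) - gamma^2 * dist(ray_q, ray_k)^2,
    where gamma stands in for a learnable parameter.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs = []
    for q, (qo, qd) in zip(queries, q_rays):
        logits = [
            dot(q, k) * scale - (gamma ** 2) * line_distance(qo, qd, ko, kd) ** 2
            for k, (ko, kd) in zip(keys, k_rays)
        ]
        weights = softmax(logits)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

With `gamma = 0` this reduces to ordinary dot-product attention. As `gamma` grows, tokens whose rays pass far from the query ray receive less weight, which is one way to encourage the multi-view consistency the abstract refers to.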


