Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction

by   Jeremy Reizenstein, et al.

Traditional approaches for learning 3D object categories have been predominantly trained and evaluated on synthetic datasets due to the unavailability of real 3D-annotated category-centric data. Our main goal is to facilitate advances in this field by collecting real-world data in a magnitude similar to the existing synthetic counterparts. The principal contribution of this work is thus a large-scale dataset, called Common Objects in 3D, with real multi-view images of object categories annotated with camera poses and ground truth 3D point clouds. The dataset contains a total of 1.5 million frames from nearly 19,000 videos capturing objects from 50 MS-COCO categories and, as such, it is significantly larger than alternatives both in terms of the number of categories and objects. We exploit this new dataset to conduct one of the first large-scale "in-the-wild" evaluations of several new-view-synthesis and category-centric 3D reconstruction methods. Finally, we contribute NerFormer, a novel neural rendering method that leverages the powerful Transformer to reconstruct an object given a small number of its views. The CO3D dataset is available at .




ABO: Dataset and Benchmarks for Real-World 3D Object Understanding

We introduce Amazon-Berkeley Objects (ABO), a new large-scale dataset of...

ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer

Objects play a crucial role in our everyday activities. Though multisens...

Common Pets in 3D: Dynamic New-View Synthesis of Real-Life Deformable Categories

Obtaining photorealistic reconstructions of objects from sparse views is...

Unsupervised Learning of 3D Object Categories from Videos in the Wild

Our goal is to learn a deep network that, given a small number of images...

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

Visual tasks vary a lot in their output formats and concerned contents, ...

FSS-1000: A 1000-Class Dataset for Few-Shot Segmentation

Over the past few years, we have witnessed the success of deep learning ...

RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild

We describe a data-driven method for inferring the camera viewpoints giv...

Code Repositories


Tooling for the Common Objects In 3D dataset.
