A Large-Scale Outdoor Multi-modal Dataset and Benchmark for Novel View Synthesis and Implicit Scene Reconstruction

01/17/2023
by Chongshan Lu, et al.

Neural Radiance Fields (NeRF) have achieved impressive results in single-object scene reconstruction and novel view synthesis, as demonstrated on many single-modality, single-object-focused indoor scene datasets such as DTU, BMVS, and NeRF Synthetic. However, the study of NeRF on large-scale outdoor scene reconstruction is still limited, as there is no unified outdoor scene dataset for large-scale NeRF evaluation due to expensive data acquisition and calibration costs. In this paper, we propose a large-scale outdoor multi-modal dataset, the OMMO dataset, containing complex land objects and scenes with calibrated images, point clouds, and prompt annotations. We also establish a new benchmark for several outdoor NeRF-based tasks, such as novel view synthesis, surface reconstruction, and multi-modal NeRF. To create the dataset, we capture and collect a large number of real fly-view videos and select high-quality, high-resolution clips from them. We then design a quality review module that refines images and removes low-quality frames and fail-to-calibrate scenes through learning-based automatic evaluation plus manual review. Finally, a number of volunteers add text descriptions for each scene and key frame to meet potential future multi-modal requirements. Compared with existing NeRF datasets, our dataset contains abundant real-world urban and natural scenes with various scales, camera trajectories, and lighting conditions. Experiments show that our dataset can benchmark most state-of-the-art NeRF methods on different tasks. We will release the dataset and model weights very soon.

research
07/22/2023

Replay: Multi-modal Multi-view Acted Videos for Casual Holography

We introduce Replay, a collection of multi-view, multi-modal videos of h...
research
12/30/2022

X-MAS: Extremely Large-Scale Multi-Modal Sensor Dataset for Outdoor Surveillance in Real Environments

In robotics and computer vision communities, extensive studies have been...
research
04/18/2020

Learning to Dehaze From Realistic Scene with A Fast Physics Based Dehazing Network

Dehaze is one of the popular computer vision research topics for long. A...
research
02/25/2015

Building with Drones: Accurate 3D Facade Reconstruction using MAVs

Automatic reconstruction of 3D models from images using multi-view Struc...
research
07/01/2020

Future Urban Scenes Generation Through Vehicles Synthesis

In this work we propose a deep learning pipeline to predict the visual f...
research
03/11/2022

Multi-sensor large-scale dataset for multi-view 3D reconstruction

We present a new multi-sensor dataset for 3D surface reconstruction. It ...
research
07/10/2023

Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement

We propose a system for rearranging objects in a scene to achieve a desi...
