UniG3D: A Unified 3D Object Generation Dataset

by Qinghong Sun, et al.

Generative AI is having a transformative impact on many areas, including virtual reality, autonomous driving, the metaverse, gaming, and robotics. Among these applications, 3D object generation is especially important, as it opens new ways to create, customize, and explore 3D objects. However, the quality and diversity of existing 3D object generation methods are constrained by the shortcomings of existing 3D object datasets: text quality, incomplete multi-modal data representation (2D rendered images and 3D assets), and dataset size. To address these issues, we present UniG3D, a unified 3D object generation dataset constructed by applying a universal data transformation pipeline to the Objaverse and ShapeNet datasets. The pipeline converts each raw 3D model into a comprehensive multi-modal data representation <text, image, point cloud, mesh> using rendering engines and multi-modal models. These modules ensure the richness of the textual information and the comprehensiveness of the data representation. The pipeline is universal in the sense that it can be applied to any 3D dataset, since it requires only raw 3D data. The data sources for our dataset were selected based on their scale and quality. We then assess the effectiveness of our dataset using Point-E and SDFusion, two widely recognized object generation methods, tailored to the prevalent 3D representations of point clouds and signed distance functions, respectively. Our dataset is available at: https://unig3d.github.io.
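To make the <text, image, point cloud, mesh> representation concrete, the following is a minimal Python sketch of what one record and one conversion step could look like. It is not the UniG3D pipeline itself: the `MultiModalSample` container, its field names, and the `sample_point_cloud` helper are hypothetical, and area-weighted surface sampling is only one common way to derive a point cloud from a mesh, assumed here for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Face = Tuple[int, int, int]


@dataclass
class MultiModalSample:
    """Hypothetical container for one <text, image, point cloud, mesh> record."""
    text: str                 # caption describing the object
    image_paths: List[str]    # rendered 2D views (paths are placeholders)
    point_cloud: List[Vec3]   # points sampled from the surface
    vertices: List[Vec3]      # mesh vertices
    faces: List[Face]         # mesh triangles as vertex indices


def triangle_area(a: Vec3, b: Vec3, c: Vec3) -> float:
    """Area of triangle abc: half the norm of the cross product of two edges."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def sample_point_cloud(vertices: List[Vec3], faces: List[Face],
                       n_points: int, seed: int = 0) -> List[Vec3]:
    """Sample points uniformly over the mesh surface (assumed technique)."""
    rng = random.Random(seed)
    # Pick triangles with probability proportional to their area.
    areas = [triangle_area(vertices[i], vertices[j], vertices[k])
             for i, j, k in faces]
    chosen = rng.choices(faces, weights=areas, k=n_points)
    points = []
    for i, j, k in chosen:
        a, b, c = vertices[i], vertices[j], vertices[k]
        # Uniform barycentric coordinates via the square-root trick.
        s = math.sqrt(rng.random())
        r = rng.random()
        w0, w1, w2 = 1.0 - s, s * (1.0 - r), s * r
        points.append(tuple(w0 * a[d] + w1 * b[d] + w2 * c[d]
                            for d in range(3)))
    return points


# Usage: a unit square in the z = 0 plane, split into two triangles.
square_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                   (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
square_faces = [(0, 1, 2), (0, 2, 3)]
sample = MultiModalSample(
    text="a flat square plate",          # placeholder caption
    image_paths=["render_000.png"],      # placeholder path
    point_cloud=sample_point_cloud(square_vertices, square_faces, 100),
    vertices=square_vertices,
    faces=square_faces,
)
```

Weighting triangles by area keeps the sampled density independent of how finely a region of the mesh happens to be tessellated.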




