Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models

12/28/2022
by Jiale Xu, et al.

Recent CLIP-guided 3D optimization methods, e.g., DreamFields and PureCLIPNeRF, have achieved great success in zero-shot text-guided 3D synthesis. However, because they are trained from scratch with random initialization and no prior knowledge, these methods often fail to generate accurate and faithful 3D structures that conform to the input text. In this paper, we make the first attempt to introduce an explicit 3D shape prior into CLIP-guided 3D optimization methods. Specifically, we first generate a high-quality 3D shape from the input text in a text-to-shape stage to serve as the 3D shape prior. We then use it to initialize a neural radiance field, which we optimize with the full prompt. For text-to-shape generation, we present a simple yet effective approach that directly bridges the text and image modalities with a powerful text-to-image diffusion model. To narrow the style domain gap between images synthesized by the text-to-image model and the shape renderings used to train the image-to-shape generator, we further propose to jointly optimize a learnable text prompt and fine-tune the text-to-image diffusion model for rendering-style image generation. Our method, Dream3D, generates imaginative 3D content with better visual quality and shape accuracy than state-of-the-art methods.
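The pipeline described above reduces to two stages: a text-to-shape stage that produces a 3D shape prior, followed by CLIP-guided optimization of a neural radiance field initialized from that prior. The sketch below is a minimal PyTorch-style illustration of that flow, not the authors' implementation; every component name (text_to_image, image_to_shape, init_nerf_from_shape, render_views, clip_similarity) is a hypothetical placeholder passed in by the caller.

```python
# Minimal sketch of the two-stage Dream3D pipeline, assuming hypothetical
# component callables supplied by the caller. Not the authors' actual code.

import torch


def dream3d_sketch(
    prompt: str,
    text_to_image,         # hypothetical: fine-tuned diffusion model with the
                           # learned rendering-style prompt, str -> image tensor
    image_to_shape,        # hypothetical: image -> 3D shape prior (e.g. voxels/SDF)
    init_nerf_from_shape,  # hypothetical: shape prior -> differentiable NeRF module
    render_views,          # hypothetical: (nerf, n_views) -> batch of rendered views
    clip_similarity,       # hypothetical: (images, text) -> scalar CLIP score
    steps: int = 1000,
    lr: float = 1e-3,
):
    # Stage 1: text-to-shape. Generate a rendering-style image from the text,
    # then lift it to a 3D shape that serves as the explicit shape prior.
    style_image = text_to_image(prompt)
    shape_prior = image_to_shape(style_image)

    # Stage 2: initialize a neural radiance field from the shape prior and
    # optimize it with CLIP guidance on the full prompt.
    nerf = init_nerf_from_shape(shape_prior)
    optimizer = torch.optim.Adam(nerf.parameters(), lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        views = render_views(nerf, n_views=4)   # renderings from random cameras
        loss = -clip_similarity(views, prompt)  # maximize text-image similarity
        loss.backward()
        optimizer.step()

    return nerf
```

In practice, converting the shape prior (e.g., a voxel grid or SDF) into an initial density field for the radiance field is delegated here to the hypothetical init_nerf_from_shape; the sketch only conveys how the prior replaces random initialization before CLIP-guided optimization.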


