HiFA: High-fidelity Text-to-3D with Advanced Diffusion Guidance

05/30/2023
by   Junzhe Zhu, et al.
Automatic text-to-3D synthesis has advanced remarkably through the optimization of 3D models. Existing methods commonly rely on pre-trained text-to-image generative models, such as diffusion models, which score 2D renderings of Neural Radiance Fields (NeRFs) and thereby guide NeRF optimization. However, these methods often produce artifacts and inconsistencies across multiple views because of their limited understanding of 3D geometry. To address these limitations, we propose a reformulation of the optimization loss using the diffusion prior. Furthermore, we introduce a novel training approach that unlocks the potential of the diffusion prior. To improve 3D geometry representation, we apply auxiliary depth supervision to NeRF-rendered images and regularize the density field of NeRFs. Extensive experiments demonstrate the superiority of our method over prior works, with improved photo-realism and multi-view consistency.

research
05/19/2023

Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields

Text-driven 3D scene generation is widely applicable to video gaming, fi...
research
03/24/2023

Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior

In this work, we investigate the problem of creating high-fidelity 3D co...
research
08/31/2023

MVDream: Multi-view Diffusion for 3D Generation

We propose MVDream, a multi-view diffusion model that is able to generat...
research
06/06/2023

DreamSparse: Escaping from Plato's Cave with 2D Frozen Diffusion Model Given Sparse Views

Synthesizing novel view images from a few views is a challenging but pra...
research
03/16/2023

Diffusion-HPC: Generating Synthetic Images with Realistic Humans

Recent text-to-image generative models have exhibited remarkable abiliti...
research
05/04/2023

Multimodal-driven Talking Face Generation, Face Swapping, Diffusion Model

Multimodal-driven talking face generation refers to animating a portrait...
research
05/25/2023

DiffusionShield: A Watermark for Copyright Protection against Generative Diffusion Models

Recently, Generative Diffusion Models (GDMs) have showcased their remark...
