Text2Light: Zero-Shot Text-Driven HDR Panorama Generation

09/20/2022
by   Zhaoxi Chen, et al.

High-quality HDRIs (high dynamic range images), typically HDR panoramas, are one of the most popular ways to create photorealistic lighting and 360-degree reflections of 3D scenes in graphics. Because HDRIs are difficult to capture, a versatile and controllable generative model is highly desirable, one that lets lay users intuitively control the generation process. However, existing state-of-the-art methods still struggle to synthesize high-quality panoramas for complex scenes. In this work, we propose a zero-shot text-driven framework, Text2Light, to generate 4K+ resolution HDRIs without paired training data. Given free-form text describing the scene, we synthesize the corresponding HDRI in two dedicated steps: 1) text-driven panorama generation in low dynamic range (LDR) and low resolution, and 2) super-resolution inverse tone mapping to scale up the LDR panorama in both resolution and dynamic range. Specifically, to achieve zero-shot text-driven panorama generation, we first build dual codebooks as the discrete representation for diverse environmental textures. Then, driven by the pre-trained CLIP model, a text-conditioned global sampler learns to sample holistic semantics from the global codebook according to the input text. Furthermore, a structure-aware local sampler learns to synthesize LDR panoramas patch by patch, guided by the holistic semantics. To achieve super-resolution inverse tone mapping, we derive a continuous representation of 360-degree imaging from the LDR panorama as a set of structured latent codes anchored to the sphere. This continuous representation enables a versatile module to upscale the resolution and dynamic range simultaneously. Extensive experiments demonstrate the superior capability of Text2Light in generating high-quality HDR panoramas. In addition, we show the feasibility of our work in realistic rendering and immersive VR.
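The abstract describes latent codes "anchored to the sphere". To make the underlying geometry concrete, here is a minimal sketch of the standard mapping between equirectangular panorama pixels and unit directions on the sphere. It is not taken from the paper: the function names `pixel_to_direction` and `direction_to_pixel`, and the axis convention (y up, longitude increasing left to right), are assumptions chosen for illustration only.

```python
import math


def pixel_to_direction(u, v, width, height):
    """Map the centre of equirectangular pixel (u, v) to a unit direction.

    Assumed convention (hypothetical, not from the paper): longitude runs
    from -pi to pi left to right, latitude from pi/2 to -pi/2 top to
    bottom, and the y axis points up.
    """
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def direction_to_pixel(x, y, z, width, height):
    """Inverse mapping: unit direction back to continuous pixel coordinates."""
    lon = math.atan2(x, z)
    # Clamp to guard against values slightly outside [-1, 1] from rounding.
    lat = math.asin(max(-1.0, min(1.0, y)))
    u = (lon + math.pi) / (2.0 * math.pi) * width - 0.5
    v = (math.pi / 2.0 - lat) / math.pi * height - 0.5
    return (u, v)
```

Because the inverse mapping returns continuous coordinates, any direction on the sphere can be queried, not only pixel centres. That is the general property a continuous spherical representation relies on when resampling a panorama at a higher resolution; how Text2Light actually builds and decodes its latent codes is described in the full paper, not here.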

research
05/31/2022

Text2Human: Text-Driven Controllable Human Image Generation

Generating high-quality and diverse human images is an important yet cha...
research
09/16/2019

TextSR: Content-Aware Text Super-Resolution Guided by Recognition

Scene text recognition has witnessed rapid development with the advance ...
research
06/06/2023

Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis

We are interested in a novel task, namely low-resource text-to-talking a...
research
03/28/2023

CuNeRF: Cube-Based Neural Radiance Field for Zero-Shot Medical Image Arbitrary-Scale Super Resolution

Medical image arbitrary-scale super-resolution (MIASSR) has recently gai...
research
08/14/2022

Global Priors Guided Modulation Network for Joint Super-Resolution and Inverse Tone-Mapping

Joint super-resolution and inverse tone-mapping (SR-ITM) aims to enhance...
research
03/02/2023

Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation

Recent popular Role-Playing Games (RPGs) saw the great success of charac...
research
02/11/2022

Unsupervised HDR Imaging: What Can Be Learned from a Single 8-bit Video?

Recently, Deep Learning-based methods for inverse tone-mapping standard ...
