LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models

09/18/2023
by   Zecheng Tang, et al.

Graphic layout generation, a growing research field, plays a significant role in user engagement and information perception. Existing methods primarily treat layout generation as a numerical optimization task, focusing on quantitative aspects while overlooking the semantic information of a layout, such as the relationships between layout elements. In this paper, we propose LayoutNUWA, the first model that treats layout generation as a code generation task to enhance semantic information and harness the hidden layout expertise of large language models (LLMs). More concretely, we develop a Code Instruct Tuning (CIT) approach comprising three interconnected modules: 1) the Code Initialization (CI) module quantifies the numerical conditions and initializes them as HTML code with strategically placed masks; 2) the Code Completion (CC) module employs the formatting knowledge of LLMs to fill in the masked portions within the HTML code; 3) the Code Rendering (CR) module transforms the completed code into the final layout output, ensuring a highly interpretable and transparent layout generation procedure that directly maps code to a visualized layout. We attain state-of-the-art performance on multiple datasets, with some improvements exceeding 50%, showcasing the strong capabilities of LayoutNUWA. Our code is available at https://github.com/ProjectNUWA/LayoutNUWA.
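
The three CIT modules can be pictured with a toy sketch. This is only an illustration under stated assumptions, not the authors' implementation: the mask token `[M]`, the `<div data-category=...>` template, the canvas size, and the function names `code_init`, `code_complete`, and `code_render` are all hypothetical, and a plain callable stands in for the LLM that would fill the masks.

```python
import re

MASK = "[M]"  # hypothetical mask token; the abstract does not name one
KEYS = ("left", "top", "width", "height")

def code_init(elements, canvas=(120, 160)):
    """Code Initialization (sketch): write known conditions as HTML, mask the rest."""
    rows = []
    for el in elements:
        style = "; ".join(
            f"{k}: {el[k]}px" if el.get(k) is not None else f"{k}: {MASK}"
            for k in KEYS
        )
        rows.append(f'<div data-category="{el["category"]}" style="{style}"></div>')
    w, h = canvas
    body = "\n".join(rows)
    return f'<body style="width: {w}px; height: {h}px">\n{body}\n</body>'

def code_complete(html, predict):
    """Code Completion (sketch): fill masks left to right.

    `predict` is a stand-in for an LLM; it sees the partial code and
    returns an integer pixel value for the next mask.
    """
    while MASK in html:
        html = html.replace(MASK, f"{predict(html)}px", 1)
    return html

DIV = re.compile(r'<div data-category="([^"]+)" style="([^"]*)"></div>')

def code_render(html):
    """Code Rendering (sketch): map completed code to (category, x, y, w, h) boxes."""
    boxes = []
    for category, style in DIV.findall(html):
        vals = dict(item.split(": ") for item in style.split("; "))
        boxes.append((category, *(int(vals[k].replace("px", "")) for k in KEYS)))
    return boxes

# Two elements with partially specified geometry (None = to be generated).
elements = [
    {"category": "title", "left": 10, "top": 5, "width": 100, "height": None},
    {"category": "image", "left": None, "top": 40, "width": 100, "height": 60},
]
masked = code_init(elements)
completed = code_complete(masked, lambda partial_code: 20)  # dummy "model"
layout = code_render(completed)
```

In this sketch the dummy predictor always answers 20, so `layout` holds one box per element with the masked fields set to 20. The point is only that every intermediate state is readable code, which is the property the abstract credits for the interpretability of the procedure.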

research
06/15/2023

Relation-Aware Diffusion Model for Controllable Poster Layout Generation

Poster layout is a crucial aspect of poster design. Prior methods primar...
research
08/02/2021

Constrained Graphic Layout Generation via Latent Optimization

It is common in graphic design humans visually arrange various elements ...
research
04/09/2020

Spatial Priming for Detecting Human-Object Interactions

The relative spatial layout of a human and an object is an important cue...
research
05/21/2022

NS3: Neuro-Symbolic Semantic Code Search

Semantic code search is the task of retrieving a code snippet given a te...
research
12/12/2019

Automatic Layout Generation with Applications in Machine Learning Engine Evaluation

Machine learning-based lithography hotspot detection has been deeply stu...
research
07/14/2023

TALL: Thumbnail Layout for Deepfake Video Detection

The growing threats of deepfakes to society and cybersecurity have raise...
research
08/29/2021

Layout-to-Image Translation with Double Pooling Generative Adversarial Networks

In this paper, we address the task of layout-to-image translation, which...
