UniLayout: Taming Unified Sequence-to-Sequence Transformers for Graphic Layout Generation

08/17/2022
by   Zhaoyun Jiang, et al.

To satisfy diverse user needs, many subtasks of graphic layout generation have been explored intensively in recent years. Existing studies usually propose task-specific methods with diverse input-output formats, dedicated model architectures, and different learning methods. However, these specialized approaches make adaptation to unseen subtasks difficult, hinder knowledge sharing between subtasks, and run counter to the trend of devising general-purpose models. In this work, we propose UniLayout, which handles different subtasks of graphic layout generation in a unified manner. First, we uniformly represent the diverse inputs and outputs of the subtasks as sequences of tokens. Then, based on this unified sequence format, we naturally leverage an identical Transformer-based encoder-decoder architecture for all subtasks. Moreover, building on these two kinds of unification, we further develop a single model that supports all subtasks concurrently. Experiments on two public datasets demonstrate that, despite its simplicity, UniLayout significantly outperforms previous task-specific methods.
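The abstract's first step, representing a layout as a flat sequence of tokens, can be sketched as follows. This is a minimal illustrative example, not the paper's actual vocabulary: the special tokens, the `(category, x, y, w, h)` element format, and the number of quantization bins are all assumptions made for the sketch.

```python
# Illustrative sketch of layout-to-token serialization: each element's
# category label and quantized bounding box become discrete tokens in one
# flat sequence, which a sequence-to-sequence Transformer can then consume.
# Token names and bin count are assumptions, not UniLayout's exact scheme.

def serialize_layout(elements, num_bins=128):
    """Flatten a layout into a token sequence.

    elements: list of (category, x, y, w, h) tuples with coordinates
    normalized to [0, 1].
    """
    tokens = ["<bos>"]
    for category, x, y, w, h in elements:
        tokens.append(category)
        # Quantize each continuous coordinate into one of num_bins bins,
        # so geometry shares the same discrete vocabulary as categories.
        for value in (x, y, w, h):
            bin_id = min(int(value * num_bins), num_bins - 1)
            tokens.append(f"<{bin_id}>")
    tokens.append("<eos>")
    return tokens

layout = [("title", 0.1, 0.05, 0.8, 0.1), ("image", 0.1, 0.2, 0.8, 0.5)]
print(serialize_layout(layout))
```

Because every subtask's inputs and outputs reduce to sequences like this, the same encoder-decoder model can serve all of them by changing only what is fed in and what is predicted.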


