READ: Recursive Autoencoders for Document Layout Generation

09/01/2019
by   Akshay Gadi Patil, et al.

Layout is a fundamental component of any graphic design. Creating large varieties of plausible document layouts can be a tedious task, requiring numerous constraints to be satisfied, including local ones relating different semantic elements and global constraints on the general appearance and spacing. In this paper, we present a novel framework, coined READ, for REcursive Autoencoders for Document layout generation, to generate plausible 2D layouts of documents in large quantities and varieties. First, we devise an exploratory recursive method to extract a structural decomposition of a single document. Leveraging a dataset of documents annotated with labeled bounding boxes, our recursive neural network learns to map the structural representation, given in the form of a simple hierarchy, to a compact code, the space of which is approximated by a Gaussian distribution. Novel hierarchies can be sampled from this space, obtaining new document layouts. Moreover, we introduce a combinatorial metric to measure structural similarity among document layouts. We deploy it to show that our method is able to generate highly variable and realistic layouts. We further demonstrate the utility of our generated layouts in the context of standard detection tasks on documents, showing that detection performance improves when the training data is augmented with generated documents whose layouts are produced by READ.
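The core idea of mapping a layout hierarchy to a compact code can be illustrated with a minimal sketch. This is not the authors' implementation: the binary-tree representation, the label set, the code size, and the class and function names below are all assumptions made for illustration, and the weights are random rather than trained. The sketch covers only the bottom-up encoding step and omits the decoder and the Gaussian approximation of the code space described above.

```python
import math
import random
from dataclasses import dataclass
from typing import Tuple, Union

CODE_DIM = 8  # assumed code size, not taken from the paper
LABELS = ["title", "text", "figure", "table", "list"]  # hypothetical label set


@dataclass
class Leaf:
    """A labeled bounding box: (x, y, width, height), normalized to [0, 1]."""
    label: str
    box: Tuple[float, float, float, float]


@dataclass
class Node:
    """An internal node grouping two sub-layouts (binary hierarchy is assumed)."""
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def make_matrix(rows, cols, rng):
    """Random weight matrix; a real model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def linear_tanh(weights, vector):
    """One bias-free linear layer followed by tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, vector))) for row in weights]


class RecursiveEncoder:
    """Folds a layout hierarchy bottom-up into one fixed-size code."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_leaf = make_matrix(CODE_DIM, len(LABELS) + 4, rng)
        self.w_merge = make_matrix(CODE_DIM, 2 * CODE_DIM, rng)

    def encode(self, tree):
        if isinstance(tree, Leaf):
            one_hot = [1.0 if tree.label == name else 0.0 for name in LABELS]
            return linear_tanh(self.w_leaf, one_hot + list(tree.box))
        # Encode both children, then merge their codes into a parent code.
        children = self.encode(tree.left) + self.encode(tree.right)
        return linear_tanh(self.w_merge, children)


# A toy page: a title above a text block and a figure.
layout = Node(
    Leaf("title", (0.1, 0.05, 0.8, 0.1)),
    Node(
        Leaf("text", (0.1, 0.2, 0.8, 0.4)),
        Leaf("figure", (0.1, 0.65, 0.8, 0.3)),
    ),
)
encoder = RecursiveEncoder(seed=0)
code = encoder.encode(layout)
print(len(code))
```

Because the same merge function is applied at every internal node, hierarchies of any depth reduce to a code of the same length, which is the property that makes it possible to fit a single distribution over the codes of many documents.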


