Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis

08/22/2022
by Siwen Luo, et al.

Recognizing the layout of unstructured digital documents is crucial when parsing them into a structured, machine-readable format for downstream applications. Recent studies in Document Layout Analysis (DLA) usually rely on computer vision models alone and ignore other information that is vital to capture, such as contextual information or the relations between document components. Our Doc-GCN presents an effective way to harmonize and integrate heterogeneous aspects for Document Layout Analysis. We first construct graphs that explicitly describe four main aspects: syntactic, semantic, density, and appearance/visual information. Then, we apply graph convolutional networks to represent each aspect of information and use pooling to integrate them. Finally, we aggregate the aspects and feed them into 2-layer MLPs for document layout component classification. Our Doc-GCN achieves new state-of-the-art results on three widely used DLA datasets.
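The pipeline described in the abstract (one graph per aspect, a graph convolution per graph, pooling, then a 2-layer MLP) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the layer sizes, the normalization, or the pooling operator, so the sketch assumes a single Kipf-Welling-style GCN layer per aspect, element-wise max pooling across aspects, and a softmax output. All function and parameter names (normalize_adjacency, gcn_layer, mlp_2layer, classify_components, params) are hypothetical.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a non-negative adjacency matrix."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops, so every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm, features, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


def mlp_2layer(x, w1, w2):
    """Two-layer MLP with a ReLU hidden layer and a row-wise softmax output."""
    hidden = np.maximum(x @ w1, 0.0)
    logits = hidden @ w2
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def classify_components(aspect_graphs, params):
    """aspect_graphs maps aspect name -> (adjacency, node features).

    Every aspect graph is assumed to share the same node set
    (one node per document layout component).
    """
    embeddings = []
    for name, (adj, feats) in aspect_graphs.items():
        h = gcn_layer(normalize_adjacency(adj), feats, params["gcn"][name])
        embeddings.append(h)
    # Assumption: integrate aspects with element-wise max pooling.
    pooled = np.max(np.stack(embeddings, axis=0), axis=0)
    return mlp_2layer(pooled, params["w1"], params["w2"])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_nodes, hidden, n_classes = 5, 8, 3
    feat_dims = {"syntactic": 4, "semantic": 6, "density": 2, "visual": 10}

    graphs, gcn_weights = {}, {}
    for name, dim in feat_dims.items():
        upper = np.triu(rng.integers(0, 2, size=(n_nodes, n_nodes)), k=1)
        adj = (upper + upper.T).astype(float)   # symmetric, no self-loops
        graphs[name] = (adj, rng.normal(size=(n_nodes, dim)))
        gcn_weights[name] = rng.normal(size=(dim, hidden))

    params = {
        "gcn": gcn_weights,
        "w1": rng.normal(size=(hidden, hidden)),
        "w2": rng.normal(size=(hidden, n_classes)),
    }
    probs = classify_components(graphs, params)
    print(probs.shape)  # one class distribution per document component
```

The weights here are random, so the output only demonstrates the data flow: each aspect may have its own feature dimension, but every GCN projects into the same hidden size so that the per-aspect embeddings can be pooled element-wise before classification.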

research
08/16/2019

PubLayNet: largest dataset ever for document layout analysis

Recognizing the layout of unstructured digital documents is an important...
research
06/10/2023

Modeling Structural Similarities between Documents for Coherence Assessment with Graph Convolutional Networks

Coherence is an important aspect of text quality, and various approaches...
research
08/03/2023

A Graphical Approach to Document Layout Analysis

Document layout analysis (DLA) is the task of detecting the distinct, se...
research
05/13/2021

VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations

Document layout analysis is crucial for understanding document structure...
research
02/01/2019

Dating Documents using Graph Convolution Networks

Document date is essential for many important tasks, such as document re...
research
01/29/2021

General-Purpose OCR Paragraph Identification by Graph Convolutional Neural Networks

Paragraphs are an important class of document entities. We propose a new...
research
08/29/2018

Question Answering by Reasoning Across Documents with Graph Convolutional Networks

Most research in reading comprehension has focused on answering question...
