Document Layout Analysis with Aesthetic-Guided Image Augmentation

11/27/2021
by Tianlong Ma, et al.

Document layout analysis (DLA) plays an important role in information extraction and document understanding. DLA has reached milestone achievements, but the analysis of non-Manhattan layouts remains a challenge. In this paper, we propose an image layer modeling method to tackle this challenge. To evaluate the proposed method, we introduce FPD, a manually labeled dataset for fine-grained segmentation of non-Manhattan layouts. As far as we know, FPD is the first dataset of this kind. To effectively extract fine-grained features of documents, we propose an edge embedding network named L-E^3Net. Experimental results show that the proposed image layer modeling method can better handle fine-grained segmentation of documents with non-Manhattan layouts.

research
10/15/2021

Accurate Fine-grained Layout Analysis for the Historical Tibetan Document Based on the Instance Segmentation

Accurate layout analysis without subsequent text-line segmentation remai...
research
08/04/2021

Human-In-The-Loop Document Layout Analysis

Document layout analysis (DLA) aims to divide a document image into diff...
research
05/13/2021

VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations

Document layout analysis is crucial for understanding document structure...
research
05/20/2021

Document Domain Randomization for Deep Learning Document Layout Extraction

We present document domain randomization (DDR), the first successful tra...
research
11/16/2021

Multi-Vector Models with Textual Guidance for Fine-Grained Scientific Document Similarity

We present Aspire, a new scientific document similarity model based on m...
research
07/06/2020

Detile: Fine-Grained Information Leak Detection in Script Engines

Memory disclosure attacks play an important role in the exploitation of ...
research
04/07/2021

Document Layout Analysis via Dynamic Residual Feature Fusion

The document layout analysis (DLA) aims to split the document image into...
