General-Purpose OCR Paragraph Identification by Graph Convolutional Neural Networks

01/29/2021
by Renshen Wang, et al.

Paragraphs are an important class of document entities. We propose a new approach to paragraph identification that applies spatial graph convolutional neural networks (GCNs) to OCR text boxes. Two steps, line splitting and line clustering, extract paragraphs from the lines in OCR results. Each step uses a β-skeleton graph constructed from the bounding boxes, whose edges provide efficient support for graph convolution operations. With only pure layout input features, the GCN models are 3 to 4 orders of magnitude smaller than R-CNN based models, while achieving comparable or better accuracy on PubLayNet and other datasets. Furthermore, the GCN models generalize well from synthetic training data to real-world images and adapt well to variable document styles.
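
To make the graph construction more concrete, here is a minimal sketch of a β-skeleton over OCR line boxes. It is an illustration under stated assumptions, not the authors' implementation: it fixes β = 1 (the Gabriel graph), uses box centers as nodes as a simplification, and the names `box_center` and `gabriel_edges` and the sample boxes are hypothetical. The abstract does not say which β value or which box points the paper uses.

```python
from itertools import combinations


def box_center(box):
    """Center of an axis-aligned box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def gabriel_edges(points):
    """Edges (i, j) of the beta = 1 skeleton (Gabriel graph), brute force.

    Two points p and q are connected if no third point r lies strictly
    inside the disk whose diameter is the segment pq. That holds exactly
    when the angle p-r-q is not obtuse, i.e. (p - r) . (q - r) >= 0.
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        (px, py), (qx, qy) = points[i], points[j]
        blocked = False
        for k, (rx, ry) in enumerate(points):
            if k == i or k == j:
                continue
            if (px - rx) * (qx - rx) + (py - ry) * (qy - ry) < 0:
                blocked = True  # r falls inside the disk, so no edge
                break
        if not blocked:
            edges.append((i, j))
    return edges


# Hypothetical OCR line boxes: three stacked text lines (assumed data).
boxes = [
    (0, 0, 100, 10),
    (0, 15, 100, 25),
    (0, 30, 100, 40),
]
centers = [box_center(b) for b in boxes]
edges = gabriel_edges(centers)
print(edges)
```

In this toy layout only vertically adjacent lines are connected, because the middle line blocks the edge between the first and third. This kind of sparse neighborhood is what a graph convolution would aggregate over. The brute-force check is cubic in the number of boxes; a practical version would likely restrict candidates to Delaunay triangulation edges, of which the Gabriel graph is a subgraph.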


