PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

04/16/2020
by   Wenwen Yu, et al.

Computer vision with state-of-the-art deep learning models has recently achieved great success in Optical Character Recognition (OCR), including text detection and text recognition. However, Key Information Extraction (KIE) from documents, a downstream task of OCR with many real-world use cases, remains a challenge because documents carry not only the textual features extracted by OCR systems but also semantic visual features that play a critical role in KIE and are not yet fully exploited. Little work has been devoted to making efficient, full use of both the textual and the visual features of documents. In this paper, we introduce PICK, a framework that handles complex document layouts for KIE effectively and robustly by combining graph learning with graph convolution, yielding a richer semantic representation that contains the textual and visual features and the global layout without ambiguity. Extensive experiments on real-world datasets show that our method outperforms baseline methods by significant margins.

