Graph Convolution for Multimodal Information Extraction from Visually Rich Documents

03/27/2019
by Xiaojing Liu, et al.

Visually rich documents (VRDs) are ubiquitous in daily business and life. Examples include purchase receipts, insurance policy documents, and customs declaration forms. In VRDs, visual and layout information is critical for document understanding, and the text in such documents cannot be serialized into a one-dimensional sequence without losing information. Classic information extraction models such as BiLSTM-CRF typically operate on text sequences and do not incorporate visual features. In this paper, we introduce a graph convolution based model that combines the textual and visual information present in VRDs. Graph embeddings are trained to summarize the context of a text segment in the document, and are then combined with text embeddings for entity extraction. Extensive experiments on two real-world datasets show that our method outperforms BiLSTM-CRF baselines by significant margins. Ablation studies are also performed to evaluate the effectiveness of each component of our model.
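To make the idea of combining a graph embedding with a text embedding more concrete, the following is a minimal sketch and not the authors' model. It assumes each text segment has a small text embedding and a bounding-box center, uses a hypothetical distance-based edge weight, and replaces the trained graph convolution with a single round of weighted neighbor averaging. All names (`segments`, `edge_weight`, `graph_embedding`, `combined_features`) and all values are illustrative assumptions.

```python
import math

# Hypothetical text segments: a toy text embedding plus a bounding-box center.
segments = [
    {"text": "Total", "emb": [1.0, 0.0], "xy": (0.0, 0.0)},
    {"text": "42.00", "emb": [0.0, 1.0], "xy": (1.0, 0.0)},
    {"text": "Thank you", "emb": [0.5, 0.5], "xy": (0.0, 5.0)},
]


def edge_weight(a, b):
    """Assumed layout affinity: decays with the distance between centers."""
    return math.exp(-math.dist(a["xy"], b["xy"]))


def graph_embedding(i, segs):
    """Summarize the context of segment i by weighted neighbor averaging.

    This is a simplified stand-in for a trained graph convolution layer.
    """
    dim = len(segs[i]["emb"])
    total = [0.0] * dim
    norm = 0.0
    for j, other in enumerate(segs):
        if j == i:
            continue  # a segment is not its own neighbor in this sketch
        w = edge_weight(segs[i], other)
        norm += w
        for k in range(dim):
            total[k] += w * other["emb"][k]
    if norm == 0.0:
        return total  # isolated segment: context stays all zeros
    return [t / norm for t in total]


def combined_features(segs):
    """Concatenate each text embedding with its graph (context) embedding."""
    return [s["emb"] + graph_embedding(i, s_list) for i, s, s_list in
            ((i, s, segs) for i, s in enumerate(segs))]


features = combined_features(segments)
```

In this sketch, each row of `features` would be passed to a downstream tagger for entity extraction. Nearby segments dominate the context vector, which is one simple way layout can influence the representation of a segment.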

research
04/16/2020

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

Computer vision with state-of-the-art deep learning models has achieved ...
research
05/22/2020

Robust Layout-aware IE for Visually Rich Documents with Pre-trained Language Models

Many business documents processed in modern NLP and IR pipelines are vis...
research
09/12/2022

One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text

Active consumption of digital documents has yielded scope for research i...
research
08/23/2021

Using Neighborhood Context to Improve Information Extraction from Visual Documents Captured on Mobile Phones

Information Extraction from visual documents enables convenient and inte...
research
07/01/2021

Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations

Electronic Theses and Dissertations (ETDs) contain domain knowledge that...
research
07/16/2023

DocTr: Document Transformer for Structured Information Extraction in Documents

We present a new formulation for structured information extraction (SIE)...
research
11/07/2021

Information Extraction from Visually Rich Documents with Font Style Embeddings

Information extraction (IE) from documents is an intensive area of resea...
