Revising FUNSD dataset for key-value detection in document images

10/11/2020
by Hieu M. Vu, et al.

FUNSD is one of the few publicly available datasets for information extraction from document images. The information in the FUNSD dataset is defined by text areas in four categories ("key", "value", "header", and "other"), plus "background", and by connectivity between areas as key-value relations. Inspecting FUNSD, we found several labeling inconsistencies that impede its applicability to the key-value extraction problem. In this report, we describe some labeling issues in FUNSD and the revisions we made to the dataset. We also report our implementation of key-value detection on FUNSD, using a UNet model for baseline results and an improved UNet model with Channel-Invariant Deformable Convolution.

03/26/2021

Spatial Dual-Modality Graph Reasoning for Key Information Extraction

Key information extraction from document images is of paramount importan...
09/09/2020

One-shot Text Field Labeling using Attention and Belief Propagation for Structure Information Extraction

Structured information extraction from document images usually consists ...
05/23/2022

Document Intelligence Metrics for Visually Rich Document Evaluation

The processing of Visually-Rich Documents (VRDs) is highly important in ...
06/24/2021

MatchVIE: Exploiting Match Relevancy between Entities for Visual Information Extraction

Visual Information Extraction (VIE) task aims to extract key information...
04/23/2015

x.ent: R Package for Entities and Relations Extraction based on Unsupervised Learning and Document Structure

Relation extraction with accurate precision is still a challenge when pr...
10/25/2016

How Document Pre-processing affects Keyphrase Extraction Performance

The SemEval-2010 benchmark dataset has brought renewed attention to the ...
10/14/2021

Making Document-Level Information Extraction Right for the Right Reasons

Document-level information extraction is a flexible framework compatible...