One-Shot Template Matching for Automatic Document Data Capture

10/22/2019
by Pranjal Dhakal, et al.

In this paper, we propose a novel one-shot template-matching algorithm that automatically captures data from business documents, with the aim of minimizing manual data entry. Given a single annotated document, our algorithm can automatically extract the corresponding data from other documents that share the same format. Because it relies on a set of engineered visual and textual features, our method is invariant to changes in the position and value of the fields. Experiments on a dataset of 595 real invoices demonstrate 86.4


