Tiny CNN for feature point description for document analysis: approach and dataset

09/09/2021
by A. Sheshkus, et al.

In this paper, we study the problem of feature point description in the context of document analysis and template matching. Our study shows that task-specific training data is required, especially when training a lightweight neural network intended for devices with limited computational resources. We construct and provide a dataset together with a method for retrieving training patches. We demonstrate the effectiveness of this data by training a lightweight neural network and show how it performs in both document and general patch matching. The network was trained on the provided dataset and, for comparison, on the HPatches training dataset. For testing, we use the HPatches testing framework and two publicly available datasets with various documents pictured on complex backgrounds: MIDV-500 and MIDV-2019.
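As a rough illustration of what patch matching with descriptors involves, the sketch below pairs descriptors from a template and an image by mutual nearest neighbour under Euclidean distance. This is a generic baseline and not necessarily the matching procedure used in the paper. The function names and the toy 3-dimensional vectors are hypothetical, and in practice the descriptors would be produced by the trained network.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query` (first one on ties)."""
    return min(range(len(candidates)),
               key=lambda i: l2_distance(query, candidates[i]))


def match_descriptors(template_desc, image_desc):
    """Mutual nearest-neighbour matching (a generic baseline, assumed here).

    Returns (template_index, image_index) pairs in which each descriptor
    is the other's closest neighbour.
    """
    if not template_desc or not image_desc:
        return []
    matches = []
    for i, d in enumerate(template_desc):
        j = nearest(d, image_desc)
        # Keep the pair only if the match also holds in the reverse direction.
        if nearest(image_desc[j], template_desc) == i:
            matches.append((i, j))
    return matches


# Toy descriptors (hypothetical values, not taken from any dataset).
template = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.0], [0.6, 0.6, 0.1]]
matches = match_descriptors(template, image)
print(matches)
```

The third template descriptor has no counterpart in the image, so the mutual check drops it. Rejecting such one-sided matches is a common way to reduce false correspondences before estimating a template-to-image transform.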

