MIDV-2020: A Comprehensive Benchmark Dataset for Identity Document Analysis

07/01/2021
by Konstantin Bulatov, et al.

Identity document recognition is an important sub-field of document analysis. It deals with robust document detection, type identification, and text field recognition, as well as identity fraud prevention and document authenticity validation, given photos, scans, or video frames of a captured identity document. A significant amount of research has been published on this topic in recent years; however, a chief difficulty for such research is the scarcity of datasets, because the subject matter is protected by security requirements. The few available datasets of identity documents lack diversity in document types, capturing conditions, or variability of document field values. In addition, the published datasets were typically designed for only a subset of document recognition problems, not for complex identity document analysis. In this paper, we present the MIDV-2020 dataset, which consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and unique artificially generated faces, with rich annotation. For the presented benchmark dataset, baselines are provided for tasks such as document location and identification, text field recognition, and face detection. With 72409 annotated images in total, the proposed dataset is, as of the date of publication, the largest publicly available identity document dataset with variable artificially generated data, and we believe that it will prove invaluable for the advancement of the field of document analysis and recognition. The dataset is available for download at ftp://smartengines.com/midv-2020 and http://l3i-share.univ-lr.fr .

