Digital Peter: Dataset, Competition and Handwriting Recognition Methods

03/16/2021
by Mark Potanin, et al.

This paper presents a new dataset of Peter the Great's manuscripts and describes a segmentation procedure that splits the original document images into individual text lines. The dataset can be used to train handwritten text recognition models and as a benchmark for comparing them. It consists of 9,694 images and text files corresponding to lines in historical documents. The open machine learning competition Digital Peter was held on this dataset. The article describes the baseline solution for the competition as well as more advanced handwritten text recognition methods. The full dataset and all code are publicly available.
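
Since the dataset is described as line images with matching text files, a loader might pair them by file name. The sketch below is only an illustration under stated assumptions: the directory names `images` and `words`, the `.jpg` and `.txt` extensions, and the helper `pair_lines` are hypothetical and are not taken from the paper.

```python
from pathlib import Path


def pair_lines(image_dir, text_dir, image_ext=".jpg", text_ext=".txt"):
    """Pair each line image with the transcription that shares its file stem.

    Assumption: an image `x.jpg` is transcribed in `x.txt`. This layout is
    hypothetical; adjust it to the actual dataset structure.
    """
    image_dir, text_dir = Path(image_dir), Path(text_dir)
    pairs = []
    for image_path in sorted(image_dir.glob(f"*{image_ext}")):
        text_path = text_dir / (image_path.stem + text_ext)
        if not text_path.exists():
            continue  # skip images that have no transcription file
        transcription = text_path.read_text(encoding="utf-8").strip()
        pairs.append((image_path, transcription))
    return pairs
```

Calling `pair_lines("images", "words")` would return a list of `(image_path, transcription)` tuples that can be fed to a recognition model's data pipeline.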


Code Repositories

digital_peter_aij2020

Materials of the AI Journey 2020 competition dedicated to the recognition of Peter the Great's manuscripts, https://ai-journey.ru/contest/task01

