Can You Read Me Now? Content Aware Rectification using Angle Supervision

08/05/2020
by Amir Markovitz, et al.

The ubiquity of smartphone cameras has led to more and more documents being captured by cameras rather than scanned. Unlike flatbed scans, photographed documents are often folded and crumpled, resulting in large local variance in text structure. Document rectification is fundamental to the Optical Character Recognition (OCR) process, and how well it overcomes geometric distortions significantly affects recognition accuracy. Despite the great progress in recent OCR systems, most still rely on a pre-processing step that ensures the text lines are straight and axis-aligned. Recent works have tackled the problem of rectifying document images taken in the wild using various supervision signals and alignment means. However, they focused on global features that can be extracted from the document's boundaries, ignoring signals that could be obtained from the document's content. We present CREASE: Content Aware Rectification using Angle Supervision, the first learned method for document rectification that relies on the document's content, namely the location of the words and specifically their orientation, as hints to assist in the rectification process. We utilize a novel pixel-wise angle regression approach and a curvature estimation side-task for optimizing our rectification model. Our method surpasses previous approaches in terms of OCR accuracy, geometric error, and visual similarity.
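The abstract does not specify how the pixel-wise angle regression is formulated. As a rough illustration only, the sketch below shows one common way to penalize per-pixel angle errors: a periodic loss, 1 - cos(predicted - target), averaged over pixels marked as text. The function name `masked_angle_loss`, the text mask, and the choice of loss are all assumptions made for this example, not the loss used in CREASE.

```python
import math


def masked_angle_loss(pred, target, mask):
    """Hypothetical periodic angle loss; NOT taken from the CREASE paper.

    pred, target: 2D lists of angles in radians, one value per pixel.
    mask: 2D list of truthy/falsy values marking pixels that carry text.
    Returns the mean of 1 - cos(pred - target) over masked pixels,
    or 0.0 when no pixel is masked.
    """
    total = 0.0
    count = 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if m:
                # 1 - cos(d) is zero when the angles agree (mod 2*pi)
                # and largest (2.0) when they are opposite.
                total += 1.0 - math.cos(p - t)
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    pred = [[0.1, 0.0], [math.pi, 0.3]]
    target = [[0.1, 0.5], [0.0, 0.3]]
    mask = [[1, 0], [1, 1]]  # the top-right pixel is ignored
    print(masked_angle_loss(pred, target, mask))
```

A periodic penalty is used here because a plain squared difference would treat angles that differ by a full turn as far apart, even though they describe the same orientation.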


