Learning Deep Representations for Word Spotting Under Weak Supervision

12/01/2017
by   Sebastian Sudholt, et al.

Convolutional Neural Networks (CNNs) have made their mark in various fields of computer vision in recent years and have achieved state-of-the-art performance in document analysis as well. However, CNNs require a large amount of annotated training data and, hence, great manual effort. We introduce a method that drastically reduces the manual annotation effort while retaining the high performance of a CNN for word spotting in handwritten documents. The model is learned with weak supervision using a combination of synthetically generated training data and a small subset of the training partition of the handwritten data set. We show that the network achieves results highly competitive with the state of the art in word spotting, with shorter training times and a fraction of the annotation effort.

