Deep Structured Output Learning for Unconstrained Text Recognition

12/18/2014
by   Max Jaderberg, et al.
0

We develop a representation suitable for the unconstrained recognition of words in natural images: the general case of no fixed lexicon and unknown length. To this end we propose a convolutional neural network (CNN) based architecture which incorporates a Conditional Random Field (CRF) graphical model, taking the whole word image as a single input. The unaries of the CRF are provided by a CNN that predicts characters at each position of the output, while higher order terms are provided by another CNN that detects the presence of N-grams. We show that this entire model (CRF, character predictor, N-gram predictor) can be jointly optimised by back-propagating the structured output loss, essentially requiring the system to perform multi-task learning, and training uses purely synthetically generated data. The resulting model is a more accurate system on standard real-world text recognition benchmarks than character prediction alone, setting a benchmark for systems that have not been trained on a particular lexicon. In addition, our model achieves state-of-the-art accuracy in lexicon-constrained scenarios, without being specifically modelled for constrained recognition. To test the generalisation of our model, we also perform experiments with random alpha-numeric strings to evaluate the method when no visual language model is applicable.

READ FULL TEXT
research
11/13/2017

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Image segmentation is considered to be one of the critical tasks in hype...
research
07/20/2016

Sequence to sequence learning for unconstrained scene text recognition

In this work we present a state-of-the-art approach for unconstrained na...
research
12/31/2018

Accurate, Data-Efficient, Unconstrained Text Recognition with Convolutional Neural Networks

Unconstrained text recognition is an important computer vision task, fea...
research
06/18/2019

A Conditional Random Field Model for Context Aware Cloud Detection in Sky Images

A conditional random field (CRF) model for cloud detection in ground bas...
research
01/29/2018

End-to-End Fine-Grained Action Segmentation and Recognition Using Conditional Random Field Models and Discriminative Sparse Coding

Fine-grained action segmentation and recognition is an important yet cha...
research
08/27/2018

Fast and Accurate Recognition of Chinese Clinical Named Entities with Residual Dilated Convolutions

Clinical Named Entity Recognition (CNER) aims to identify and classify c...
research
05/06/2019

Simulating CRF with CNN for CNN

Combining CNN with CRF for modeling dependencies between pixel labels is...

Please sign up or login with your details

Forgot password? Click here to reset