Towards Pure End-to-End Learning for Recognizing Multiple Text Sequences from an Image

07/30/2019
by   Xu Zhenlong, et al.
11

Here we address a challenging problem: recognizing multiple text sequences from an image by pure end-to-end learning. It is twofold: 1) Multiple text sequences recognition. Each image may contain multiple text sequences of different content, location and orientation, and we try to recognize all the text sequences contained in the image. 2) Pure end-to-end (PEE) learning.We solve the problem in a pure end-to-end learning way where each training image is labeled by only text transcripts of all contained sequences, without any geometric annotations. Most existing works recognize multiple text sequences from an image in a non-end-to-end (NEE) or quasi-end-to-end (QEE) way, in which each image is trained with both text transcripts and text locations.Only recently, a PEE method was proposed to recognize text sequences from an image where the text sequence was split to several lines in the image. However, it cannot be directly applied to recognizing multiple text sequences from an image. So in this paper, we propose a pure end-to-end learning method to recognize multiple text sequences from an image. Our method directly learns multiple sequences of probability distribution conditioned on each input image, and outputs multiple text transcripts with a well-designed decoding strategy.To evaluate the proposed method, we constructed several datasets mainly based on an existing public dataset andtwo real application scenarios. Experimental results show that the proposed method can effectively recognize multiple text sequences from images, and outperforms CTC-based and attention-based baseline methods.

READ FULL TEXT

page 2

page 4

page 6

page 8

research
07/27/2017

STN-OCR: A single Neural Network for Text Detection and Text Recognition

Detecting and recognizing text in natural scene images is a challenging,...
research
12/14/2017

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition

Detecting and recognizing text in natural scene images is a challenging,...
research
03/10/2022

DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting

Recent end-to-end scene text spotters have achieved great improvement in...
research
12/24/2014

Multiple Object Recognition with Visual Attention

We present an attention-based model for recognizing multiple objects in ...
research
11/17/2020

Digging Deeper into CRNN Model in Chinese Text Images Recognition

Automatic text image recognition is a prevalent application in computer ...
research
03/21/2023

Automatic evaluation of herding behavior in towed fishing gear using end-to-end training of CNN and attention-based networks

This paper considers the automatic classification of herding behavior in...
research
05/11/2022

TextMatcher: Cross-Attentional Neural Network to Compare Image and Text

We study a novel multimodal-learning problem, which we call text matchin...

Please sign up or login with your details

Forgot password? Click here to reset