Attention networks for image-to-text

12/11/2017
by Jason Poulos, et al.

We approach the problem of image-to-text conversion with attention-based encoder-decoder networks trained on sequences of characters rather than words. We experiment on lines of text from a popular handwriting database, comparing different attention mechanisms for the decoder. The model trained with softmax attention achieves the lowest test error, outperforming several other RNN-based models. Our results show that softmax attention learns a precise linear alignment between input and output, whereas sigmoid attention produces an alignment that is also linear but much less precise.
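The core difference between the two mechanisms is how alignment scores are normalized into attention weights: softmax forces the weights to compete and sum to one, which encourages sharp focus on a single input position, while sigmoid gates each position independently. The sketch below illustrates this contrast in NumPy; the function names and shapes are illustrative assumptions, not the paper's code.

```python
import numpy as np

def attention_weights(scores, mode="softmax"):
    """Turn alignment scores (T,) into attention weights (T,).

    softmax: weights compete and sum to 1 -> sharp, peaked attention.
    sigmoid: each weight is an independent gate in (0, 1).
    """
    if mode == "softmax":
        e = np.exp(scores - scores.max())  # subtract max for numerical stability
        return e / e.sum()
    elif mode == "sigmoid":
        return 1.0 / (1.0 + np.exp(-scores))
    raise ValueError(f"unknown mode: {mode}")

def context_vector(H, weights):
    """Weighted sum of encoder states H (T, d) -> context (d,) for the decoder."""
    return weights @ H

# Toy example: 5 encoder timesteps with 4-dimensional states.
rng = np.random.default_rng(0)
H = rng.standard_normal((5, 4))
scores = rng.standard_normal(5)

c_soft = context_vector(H, attention_weights(scores, "softmax"))
c_sig = context_vector(H, attention_weights(scores, "sigmoid"))
```

Because sigmoid weights need not sum to one, several encoder positions can be strongly attended at once, which is one plausible reason its alignments are less precise than softmax's.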


Related research

04/14/2021 · Sparse Attention with Linear Units
Recently, it has been argued that encoder-decoder models can be made mor...

08/12/2020 · Attention-based Fully Gated CNN-BGRU for Russian Handwritten Text
This research approaches the task of handwritten text with attention enc...

04/16/2016 · Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction
We demonstrate that an attention-based encoder-decoder model can be used...

08/26/2019 · Adaptive Embedding Gate for Attention-Based Scene Text Recognition
Scene text recognition has attracted particular research interest becaus...

03/07/2022 · A Glyph-driven Topology Enhancement Network for Scene Text Recognition
Attention-based methods by establishing one-dimensional (1D) and two-dim...

10/07/2019 · MASTER: Multi-Aspect Non-local Network for Scene Text Recognition
Attention based scene text recognizers have gained huge success, which l...

09/16/2016 · Image-to-Markup Generation with Coarse-to-Fine Attention
We present a neural encoder-decoder model to convert images into present...
