Image-to-Markup Generation with Coarse-to-Fine Attention

09/16/2016
by Yuntian Deng, et al.

We present a neural encoder-decoder model that converts images into presentational markup using a scalable coarse-to-fine attention mechanism. We evaluate the method on image-to-LaTeX generation and introduce a new dataset of real-world rendered mathematical expressions paired with LaTeX markup. We show that, unlike neural OCR techniques using CTC-based models, attention-based approaches can tackle this non-standard OCR task. Our approach outperforms classical mathematical OCR systems by a large margin on in-domain rendered data and, with pretraining, also performs well on out-of-domain handwritten data. To reduce the inference complexity associated with attention-based approaches, we introduce a new coarse-to-fine attention layer that selects a support region before applying attention.
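The idea of selecting a support region before applying attention can be illustrated with a small sketch. This is not the authors' implementation: the function name `coarse_to_fine_attention`, the mean-pooled coarse features, the dot-product scoring, and the hard argmax selection of a single region are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def coarse_to_fine_attention(query, fine, block):
    """Hypothetical two-stage attention over a grid of feature vectors.

    query: decoder state, a list of floats
    fine:  H x W grid (nested lists) of feature vectors
    block: side length of one coarse cell, in fine-grid units
    """
    height, width, dim = len(fine), len(fine[0]), len(query)

    def region(r0, c0):
        # Fine-grid coordinates covered by the coarse cell starting at (r0, c0).
        return [(r, c)
                for r in range(r0, min(r0 + block, height))
                for c in range(c0, min(c0 + block, width))]

    def pooled(coords):
        # Assumption: a coarse feature is the mean of its fine features.
        return [sum(fine[r][c][d] for r, c in coords) / len(coords)
                for d in range(dim)]

    # Stage 1 (coarse): score each coarse cell, keep the best as support region.
    corners = [(r0, c0)
               for r0 in range(0, height, block)
               for c0 in range(0, width, block)]
    best = max(corners, key=lambda rc: dot(query, pooled(region(*rc))))
    support = region(*best)

    # Stage 2 (fine): ordinary softmax attention, restricted to the support region.
    weights = softmax([dot(query, fine[r][c]) for r, c in support])
    context = [sum(w * fine[r][c][d] for w, (r, c) in zip(weights, support))
               for d in range(dim)]
    return context, support, weights
```

In this sketch, a decoder step on an H x W grid with block size b scores (H/b)(W/b) coarse cells plus at most b^2 fine cells, rather than all H*W fine cells, which is the kind of saving the abstract attributes to selecting a support region first.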


research
03/29/2018

Attention-based End-to-End Models for Small-Footprint Keyword Spotting

In this paper, we propose an attention-based end-to-end neural approach ...
research
07/20/2020

Improving Attention-Based Handwritten Mathematical Expression Recognition with Scale Augmentation and Drop Attention

Handwritten mathematical expression recognition (HMER) is an important r...
research
06/17/2021

Unsupervised Training Data Generation of Handwritten Formulas using Generative Adversarial Networks with Self-Attention

The recognition of handwritten mathematical expressions in images and vi...
research
01/05/2018

Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition

Handwritten mathematical expression recognition is a challenging problem...
research
04/26/2022

Coarse-to-fine Q-attention with Tree Expansion

Coarse-to-fine Q-attention enables sample-efficient robot manipulation b...
research
12/11/2017

Attention networks for image-to-text

The paper approaches the problem of image-to-text with attention-based e...
research
06/17/2023

GlyphNet: Homoglyph domains dataset and detection using attention-based Convolutional Neural Networks

Cyber attacks deceive machines into believing something that does not ex...
