Transcribing Content from Structural Images with Spotlight Mechanism

05/27/2019
by Yu Yin, et al.

Transcribing content from structural images, e.g., writing notes from music scores, is a challenging task because the content objects must be recognized and the internal structure must also be preserved. Existing image recognition methods mainly work on images with simple content (e.g., text lines with characters), but cannot handle images with more complex content (e.g., structured symbols), which often follows a fine-grained grammar. To this end, in this paper, we propose a hierarchical Spotlight Transcribing Network (STN) framework that follows a two-stage "where-to-what" solution. Specifically, we first decide "where-to-look" through a novel spotlight mechanism that focuses on different areas of the original image following its structure. Then, we decide "what-to-write" by developing a GRU-based network that uses the spotlight areas to transcribe the content accordingly. Moreover, we propose two implementations on the basis of STN, i.e., STNM and STNR, where the spotlight movement follows the Markov property and recurrent modeling, respectively. We also design a reinforcement method to refine the framework by self-improving the spotlight mechanism. We conduct extensive experiments on many structural image datasets, where the results clearly demonstrate the effectiveness of the STN framework.
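
The "where-to-look" step can be illustrated with a small sketch. The abstract does not give the spotlight's exact form, so everything here is an assumption made for illustration: the spotlight is modeled as a Gaussian-shaped weighting over a feature grid, controlled by a hypothetical handle `(center_x, center_y, radius)`, and `markov_move` stands in for an STNM-style update in which the next handle depends only on the current one. In the actual framework the movement would be predicted by a learned network rather than passed in by hand.

```python
import math


def spotlight_weights(height, width, center_x, center_y, radius):
    """Gaussian-shaped weights over an H x W grid, normalised to sum to 1.

    Assumption: the Gaussian form and the (center, radius) parameters are
    illustrative choices, not taken from the abstract.
    """
    raw = [
        [
            math.exp(-((x - center_x) ** 2 + (y - center_y) ** 2) / (2.0 * radius ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]
    total = sum(sum(row) for row in raw)
    return [[w / total for w in row] for row in raw]


def spotlight_context(feature_map, weights):
    """Weighted sum of the per-position feature vectors under the spotlight."""
    channels = len(feature_map[0][0])
    context = [0.0] * channels
    for feat_row, w_row in zip(feature_map, weights):
        for vec, w in zip(feat_row, w_row):
            for c in range(channels):
                context[c] += w * vec[c]
    return context


def markov_move(handle, step):
    """Hypothetical Markov-style update: the next handle depends only on the
    current handle and a step (here supplied by the caller, not learned)."""
    x, y, r = handle
    dx, dy, dr = step
    return (x + dx, y + dy, max(r + dr, 1e-3))  # keep the radius positive


# Toy 3 x 5 feature map with two channels: a constant and the x coordinate.
features = [[[1.0, float(x)] for x in range(5)] for _ in range(3)]
handle = (2.0, 1.0, 1.0)
for _ in range(2):
    weights = spotlight_weights(3, 5, *handle)
    print(handle, spotlight_context(features, weights))
    handle = markov_move(handle, (1.0, 0.0, 0.0))
```

The resulting context vector is what a "what-to-write" decoder (a GRU-based network in the paper) would consume at each step. A recurrent variant in the spirit of STNR would additionally carry a hidden state summarizing earlier handles, which this sketch omits.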

research
06/14/2018

Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition

Attention-based learning for fine-grained image recognition remains a ch...
research
10/23/2021

Attend and Guide (AG-Net): A Keypoints-driven Attention-based Deep Network for Image Recognition

This paper presents a novel keypoints-based attention mechanism for visu...
research
05/18/2023

RMSSinger: Realistic-Music-Score based Singing Voice Synthesis

We are interested in a challenging task, Realistic-Music-Score based Sin...
research
10/31/2022

Tables to LaTeX: structure and content extraction from scientific tables

Scientific documents contain tables that list important information in a...
research
12/06/2021

SelectAugment: Hierarchical Deterministic Sample Selection for Data Augmentation

Data augmentation (DA) has been widely investigated to facilitate model ...
research
08/08/2019

Editing Text in the Wild

In this paper, we are interested in editing text in natural images, whic...
