EATEN: Entity-aware Attention for Single Shot Visual Text Extraction

09/20/2019
by He Guo, et al.

Extracting entities from images is a crucial part of many OCR applications, such as entity recognition for cards, invoices, and receipts. Most existing works employ the classical detection-and-recognition paradigm. This paper proposes an Entity-aware Attention Text Extraction Network called EATEN, an end-to-end trainable system that extracts entities without any post-processing. In the proposed framework, each entity is parsed by its corresponding entity-aware decoder. Moreover, we introduce a state transition mechanism that further improves the robustness of entity extraction. Given the absence of public benchmarks, we construct a dataset of almost 0.6 million images in three real-world scenarios (train ticket, passport, and business card), which is publicly available at https://github.com/beacandler/EATEN. To the best of our knowledge, EATEN is the first single-shot method to extract entities from images. Extensive experiments on these benchmarks demonstrate the state-of-the-art performance of EATEN.
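
The idea of one decoder per entity reading from a shared image representation can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the dot-product attention, the `tanh` state update, and the entity names `name` and `date` are all hypothetical, and the sketch assumes that "state transition" means each decoder starts from the final state of the previous one. It omits the CNN backbone, learned weights, and character prediction entirely.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, features):
    # Dot-product attention: weight each feature vector by its similarity to the query.
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * feat[i] for w, feat in zip(weights, features)) for i in range(dim)]
    return context, weights

def run_entity_decoder(entity_bias, init_state, features, steps=2):
    # Hypothetical entity-aware decoder: its own bias vector steers where it attends.
    state = init_state
    contexts = []
    for _ in range(steps):
        query = [s + b for s, b in zip(state, entity_bias)]
        context, _ = attend(query, features)
        contexts.append(context)
        # Toy state update standing in for a recurrent cell (assumption).
        state = [math.tanh(q + c) for q, c in zip(query, context)]
    return contexts, state

def extract_all(features, entity_biases):
    # Run the decoders in order over the same features, handing the final
    # state of one decoder to the next (assumed reading of "state transition").
    state = [0.0] * len(features[0])
    outputs = {}
    for name, bias in entity_biases.items():
        contexts, state = run_entity_decoder(bias, state, features)
        outputs[name] = contexts
    return outputs, state

if __name__ == "__main__":
    shared_features = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for image features
    decoders = {"name": [1.0, 0.0], "date": [0.0, 1.0]}  # hypothetical entities
    outputs, final_state = extract_all(shared_features, decoders)
    print(sorted(outputs), len(final_state))
```

Because every decoder reads the same features in a single pass, no separate text detection or post-processing step appears in the sketch, which is the sense of "single shot" used in the abstract.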


