Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling

03/13/2023
by Yongshuai Huang, et al.

Table structure recognition aims to extract the logical and physical structure of unstructured table images into a machine-readable format. The latest end-to-end image-to-text approaches predict the two structures simultaneously with two decoders, where the prediction of the physical structure (the bounding boxes of the cells) is based on the representation of the logical structure. However, previous methods produce imprecise bounding boxes because the logical representation lacks local visual information. To address this issue, we propose VAST, an end-to-end sequential modeling framework for table structure recognition. It contains a novel coordinate sequence decoder that is triggered by the representation of each non-empty cell from the logical structure decoder. In the coordinate sequence decoder, we model the bounding box coordinates as a language sequence, in which the left, top, right and bottom coordinates are decoded sequentially to leverage the dependency between coordinates. Furthermore, we propose an auxiliary visual-alignment loss that forces the logical representation of the non-empty cells to contain more local visual details, which helps produce better cell bounding boxes. Extensive experiments demonstrate that our proposed method achieves state-of-the-art results in both logical and physical structure recognition. The ablation study also validates that the proposed coordinate sequence decoder and the visual-alignment loss are key to the success of our method.
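
To make the idea of treating bounding box coordinates as a language sequence more concrete, the following is a minimal sketch and not the authors' implementation. It assumes coordinates are quantized into a fixed number of integer bins (the bin count is an arbitrary choice here) and decoded greedily in left, top, right, bottom order. The names box_to_tokens, tokens_to_box, decode_box and toy_score_fn are hypothetical, and toy_score_fn is only a stand-in for a learned decoder step that would condition on the cell representation and the coordinates emitted so far.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels


def box_to_tokens(box: Box, width: float, height: float, num_bins: int) -> List[int]:
    """Quantize a pixel box into four integer tokens in [0, num_bins)."""
    def quantize(value: float, extent: float) -> int:
        ratio = min(max(value / extent, 0.0), 1.0)  # clamp to the image
        return min(int(ratio * num_bins), num_bins - 1)

    left, top, right, bottom = box
    return [quantize(left, width), quantize(top, height),
            quantize(right, width), quantize(bottom, height)]


def tokens_to_box(tokens: Sequence[int], width: float, height: float,
                  num_bins: int) -> Box:
    """Map four tokens back to pixel coordinates at the bin centres."""
    left, top, right, bottom = tokens
    return ((left + 0.5) / num_bins * width,
            (top + 0.5) / num_bins * height,
            (right + 0.5) / num_bins * width,
            (bottom + 0.5) / num_bins * height)


# A scoring function takes the cell representation and the tokens decoded
# so far, and returns one score per coordinate bin (assumed interface).
ScoreFn = Callable[[Sequence[float], List[int]], List[float]]


def decode_box(cell_repr: Sequence[float], score_fn: ScoreFn) -> List[int]:
    """Greedily decode left, top, right, bottom, one token at a time.

    Each step sees the previously decoded tokens, which is how the
    dependency between coordinates can be exploited.
    """
    tokens: List[int] = []
    for _ in range(4):
        scores = score_fn(cell_repr, tokens)
        tokens.append(max(range(len(scores)), key=scores.__getitem__))
    return tokens


def toy_score_fn(cell_repr: Sequence[float], prefix: List[int],
                 num_bins: int = 10) -> List[float]:
    """Hypothetical stand-in for a learned decoder step (not from the paper).

    It prefers the bin nearest to cell_repr[step], and forbids a right
    (or bottom) token smaller than the already decoded left (or top).
    """
    step = len(prefix)
    scores = [-abs(i - cell_repr[step]) for i in range(num_bins)]
    if step >= 2:
        floor = prefix[step - 2]
        scores = [s if i >= floor else float("-inf")
                  for i, s in enumerate(scores)]
    return scores
```

In this sketch, calling decode_box with a cell representation whose preferred right coordinate lies to the left of the decoded left coordinate yields a right token that is pushed up to at least the left token, which is the kind of consistency that independent per-coordinate prediction cannot guarantee.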

research
03/14/2023

Rethinking Image-based Table Recognition Using Weakly Supervised Methods

Most of the previous methods for table recognition rely on training data...
research
03/08/2022

Table Structure Recognition with Conditional Attention

Tabular data in digital documents is widely used to express compact and ...
research
07/31/2022

Evaluating Table Structure Recognition: A New Perspective

Existing metrics used to evaluate table structure recognition algorithms...
research
05/13/2021

LGPMA: Complicated Table Structure Recognition with Local and Global Pyramid Mask Alignment

Table structure recognition is a challenging task due to the various str...
research
03/07/2023

LORE: Logical Location Regression Network for Table Structure Recognition

Table structure recognition (TSR) aims at extracting tables in images in...
research
11/04/2018

DeepKey: Towards End-to-End Physical Key Replication From a Single Photograph

This paper describes DeepKey, an end-to-end deep neural architecture cap...
research
11/28/2021

CHARTER: heatmap-based multi-type chart data extraction

The digital conversion of information stored in documents is a great sou...
