Robust Table Detection and Structure Recognition from Heterogeneous Document Images

03/17/2022
by Chixiang Ma, et al.

We introduce RobusTabNet, a new table detection and structure recognition approach that detects table boundaries and reconstructs the cellular structure of each table from heterogeneous document images. For table detection, we propose to use CornerNet as a new region proposal network to generate higher-quality table proposals for Faster R-CNN, which significantly improves the localization accuracy of Faster R-CNN for table detection. Consequently, our table detection approach achieves state-of-the-art performance on three public table detection benchmarks, namely cTDaR TrackA, PubLayNet and IIIT-AR-13K, using only a lightweight ResNet-18 backbone network. Furthermore, we propose a new split-and-merge based table structure recognition approach, in which a novel spatial CNN based separation line prediction module splits each detected table into a grid of cells, and a Grid CNN based cell merging module recovers the spanning cells. Because the spatial CNN module can effectively propagate contextual information across the whole table image, our table structure recognizer can robustly recognize tables with large blank spaces as well as geometrically distorted (even curved) tables. Thanks to these two techniques, our table structure recognition approach achieves state-of-the-art performance on three public benchmarks, namely SciTSR, PubTabNet and cTDaR TrackB. Moreover, on a more challenging in-house dataset, we further demonstrate the advantages of our approach in recognizing tables with complex structures, large blank spaces, and empty or spanning cells, as well as geometrically distorted or even curved tables.
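
The split-and-merge idea described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes the separation lines are already available as straight, axis-aligned coordinates and that the merge decisions are given as pairs of grid cells, whereas the paper predicts both with learned modules (a spatial CNN and a Grid CNN). The function names `split_into_grid` and `merge_cells` are hypothetical and exist only for this illustration.

```python
from itertools import product


def split_into_grid(row_lines, col_lines):
    """Split step: turn sorted separator coordinates into cell boxes.

    Each box is (x0, y0, x1, y1). Straight, axis-aligned separators are an
    assumption of this sketch, not a property of the paper's method.
    """
    return [
        [(col_lines[c], row_lines[r], col_lines[c + 1], row_lines[r + 1])
         for c in range(len(col_lines) - 1)]
        for r in range(len(row_lines) - 1)
    ]


def merge_cells(n_rows, n_cols, merge_pairs):
    """Merge step: group grid cells linked by merge_pairs (union-find).

    Returns one (row0, col0, row1, col1) span per resulting cell, taken as
    the bounding box of each group of grid cells.
    """
    parent = {cell: cell for cell in product(range(n_rows), range(n_cols))}

    def find(cell):
        # Follow parent links to the group representative, halving paths.
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]
            cell = parent[cell]
        return cell

    for a, b in merge_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for cell in parent:
        groups.setdefault(find(cell), []).append(cell)

    spans = []
    for cells in groups.values():
        rows = [r for r, _ in cells]
        cols = [c for _, c in cells]
        spans.append((min(rows), min(cols), max(rows), max(cols)))
    return sorted(spans)


# A 2 x 3 grid whose first two header cells form one spanning cell.
grid = split_into_grid([0, 10, 20], [0, 50, 100, 150])
spans = merge_cells(2, 3, [((0, 0), (0, 1))])
```

Here `grid` holds the six basic cell boxes produced by the split step, and `spans` lists five logical cells, the first of which covers columns 0 to 1 of row 0. Keeping the two steps separate mirrors the division of labor in the abstract: the grid is fixed first, and spanning cells are recovered afterwards.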
