SEMv2: Table Separation Line Detection Based on Conditional Convolution

03/08/2023
by Zhenrong Zhang, et al.

Table structure recognition is an indispensable element for enabling machines to comprehend tables. Its primary purpose is to identify the internal structure of a table. Because tables vary widely in structure and style, however, parsing tabular data into a structured format that machines can comprehend remains highly challenging. In this work, we adhere to the principle of split-and-merge based methods and propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge). Unlike previous work on the “split” stage, we address the problem of instance-level discrimination of table separation lines and introduce a table separation line detection strategy based on conditional convolution. Specifically, we design the “split” stage in a top-down manner: it first detects each table separation line instance and then dynamically predicts a table separation line mask for that instance. The final shape of each table separation line can be accurately obtained by processing its mask in a row-wise/column-wise manner. To comprehensively evaluate SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB, which encompasses tables in multiple styles from various scenarios such as photos and scanned documents. Extensive experiments on publicly available datasets (e.g., SciTSR, PubTabNet and iFLYTAB) demonstrate the efficacy of our proposed approach. The code and the iFLYTAB dataset will be made publicly available upon acceptance of this paper.
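To make the top-down “split” idea more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that conditional convolution means applying an instance-specific 1x1 kernel to a shared feature map, and that the row-wise/column-wise processing step can be approximated by picking the most confident pixel in each column of a row-separator mask. The function names `dynamic_mask` and `row_line_from_mask`, the toy feature map, and the hand-set kernels are all hypothetical; in a trained model the kernels would be predicted per detected instance.

```python
import math

def dynamic_mask(features, weights, bias):
    """Apply one instance-specific 1x1 convolution to shared features.

    features: C x H x W nested lists; weights: length-C list; bias: float.
    Returns an H x W map of sigmoid probabilities for that instance.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            logit = bias + sum(weights[c] * features[c][y][x] for c in range(channels))
            row.append(1.0 / (1.0 + math.exp(-logit)))
        mask.append(row)
    return mask

def row_line_from_mask(mask):
    """Assumed post-processing: for a row separator, take the most
    confident y position in every column, giving one y per x."""
    height, width = len(mask), len(mask[0])
    return [max(range(height), key=lambda y: mask[y][x]) for x in range(width)]

# Toy shared feature map with 2 channels on a 4 x 5 grid (hypothetical data).
H, W = 4, 5
channel_0 = [[1.0 if y == 1 else 0.0 for x in range(W)] for y in range(H)]
# Channel 1 responds along a line that steps down from y=2 to y=3 at x=3.
channel_1 = [[1.0 if y == (2 if x < 3 else 3) else 0.0 for x in range(W)]
             for y in range(H)]
features = [channel_0, channel_1]

# One (weights, bias) pair per detected separator instance; hand-set here,
# whereas a real model would predict them from each instance.
instance_kernels = [([4.0, 0.0], -2.0), ([0.0, 4.0], -2.0)]

lines = [row_line_from_mask(dynamic_mask(features, w, b))
         for w, b in instance_kernels]
print(lines)
```

Because every instance has its own kernel, the two separators are recovered as distinct lines from the same feature map, including the one that is not perfectly horizontal.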

