TableFormer: Table Structure Understanding with Transformers

03/02/2022
by Ahmed Nassar, et al.

Tables organize valuable content in a concise and compact representation. This content is extremely valuable for systems such as search engines, knowledge graphs, etc., since it enhances their predictive capabilities. Unfortunately, tables come in a large variety of shapes and sizes. Furthermore, they can have complex column/row-header configurations, multiline rows, a variety of separation lines, missing entries, etc. As such, the correct identification of the table structure from an image is a non-trivial task. In this paper, we present a new table-structure identification model. It improves on the latest end-to-end deep learning model (i.e. the encoder-dual-decoder from PubTabNet) in two significant ways. First, we introduce a new object detection decoder for table cells. In this way, we can obtain the content of the table cells from programmatic PDFs directly from the PDF source and avoid training custom OCR decoders. This architectural change leads to more accurate table-content extraction and allows us to tackle non-English tables. Second, we replace the LSTM decoders with transformer-based decoders. This upgrade significantly improves the previous state-of-the-art tree-editing-distance score (TEDS) from 91 ... tables and from 88.7 ...
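
To illustrate how predicted cell bounding boxes could be combined with text taken directly from a programmatic PDF, here is a minimal Python sketch. It is not the paper's implementation. It assumes the PDF words and their boxes have already been extracted by some PDF library, and the names `Word`, `center`, `contains`, and `fill_cells` are hypothetical helpers introduced only for this example. Each word is assigned to the first cell box that contains the word's center point.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x0, y0, x1, y1) in page coordinates; an assumed convention for this sketch.
Box = Tuple[float, float, float, float]


@dataclass
class Word:
    """A text token with its bounding box, as read from the PDF source (assumed input)."""
    text: str
    box: Box


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def contains(box: Box, point: Tuple[float, float]) -> bool:
    """True if the point lies inside the box (borders included)."""
    x0, y0, x1, y1 = box
    px, py = point
    return x0 <= px <= x1 and y0 <= py <= y1


def fill_cells(cell_boxes: List[Box], words: List[Word]) -> List[str]:
    """Assign each word to the first cell whose box contains the word's center."""
    buckets: List[List[Word]] = [[] for _ in cell_boxes]
    for word in words:
        point = center(word.box)
        for index, cell in enumerate(cell_boxes):
            if contains(cell, point):
                buckets[index].append(word)
                break  # a word belongs to at most one cell
    # Crude reading order: sort by top edge, then by left edge.
    return [
        " ".join(w.text for w in sorted(ws, key=lambda w: (w.box[1], w.box[0])))
        for ws in buckets
    ]


if __name__ == "__main__":
    cells = [(0, 0, 50, 10), (50, 0, 100, 10)]
    words = [
        Word("Net", (5, 2, 20, 8)),
        Word("income", (22, 2, 45, 8)),
        Word("42", (60, 2, 70, 8)),
    ]
    print(fill_cells(cells, words))
```

Because the text comes from the PDF itself rather than from recognized pixels, a scheme like this does not depend on the language of the table, which is in line with the abstract's remark about non-English tables.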

research
11/13/2021

Visual Understanding of Complex Table Structures from Document Images

Table structure recognition is necessary for a comprehensive understandi...
research
06/08/2022

STable: Table Generation Framework for Encoder-Decoder Models

The output structure of database-like tables, consisting of values struc...
research
01/13/2020

Identifying Table Structure in Documents using Conditional Generative Adversarial Networks

In many industries, as well as in academic research, information is prim...
research
08/09/2022

TSRFormer: Table Structure Recognition with Transformers

We present a new table structure recognition (TSR) approach, called TSRF...
research
04/03/2019

Extracting Tables from Documents using Conditional Generative Adversarial Networks and Genetic Algorithms

Extracting information from tables in documents presents a significant c...
research
10/05/2020

TabEAno: Table to Knowledge Graph Entity Annotation

In the Open Data era, a large number of table resources have been made a...
research
03/21/2023

Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer

We present a new table structure recognition (TSR) approach, called TSRF...
