TableSense: Spreadsheet Table Detection with Convolutional Neural Networks

by Haoyu Dong, et al.

Spreadsheet table detection is the task of detecting all tables on a given sheet and locating their respective ranges. Automatic table detection is a key enabling technique and an initial step in spreadsheet data intelligence. However, the task is made difficult by the diversity of table structures and table layouts found on spreadsheets. Considering the analogy between a spreadsheet as a cell matrix and an image as a pixel matrix, and encouraged by the successful application of Convolutional Neural Networks (CNNs) in computer vision, we have developed TableSense, a novel end-to-end framework for spreadsheet table detection. First, we devise an effective cell featurization scheme to better leverage the rich information in each cell; second, we develop an enhanced convolutional neural network model for table detection to meet the domain-specific requirement of precise table boundary detection; third, we propose an effective uncertainty metric to guide an active-learning-based smart sampling algorithm, which enables the efficient build-up of a training dataset with 22,176 tables on 10,220 sheets with broad coverage of diverse table structures and layouts. Our evaluation shows that TableSense is highly effective, with 91.3% recall and 86.5% precision under the EoB-2 metric, a significant improvement over both the detection algorithms currently used in commodity spreadsheet tools and state-of-the-art convolutional neural networks in computer vision.
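
The abstract reports precision and recall under EoB-2 without defining the metric. As a rough illustration only, the sketch below assumes that EoB measures the largest offset, in rows or columns, between a predicted table boundary and the true one, and that EoB-2 counts a detection as correct when that offset is at most 2. The range layout `(top_row, left_col, bottom_row, right_col)`, the function names, and the greedy one-to-one matching are all hypothetical choices made for this example, not the paper's evaluation code.

```python
from typing import List, Tuple

# A table range as (top_row, left_col, bottom_row, right_col), inclusive.
# This layout is an assumption made for illustration.
Range = Tuple[int, int, int, int]


def boundary_error(pred: Range, truth: Range) -> int:
    """Largest absolute offset among the four boundaries (assumed EoB definition)."""
    return max(abs(p - t) for p, t in zip(pred, truth))


def precision_recall(
    preds: List[Range], truths: List[Range], tol: int = 2
) -> Tuple[float, float]:
    """Precision and recall where a prediction counts as correct if its
    boundary error against a not-yet-matched true table is at most `tol`."""
    matched = set()  # indices of true tables already claimed by a prediction
    true_pos = 0
    for pred in preds:
        for i, truth in enumerate(truths):
            if i not in matched and boundary_error(pred, truth) <= tol:
                matched.add(i)
                true_pos += 1
                break
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    return precision, recall
```

With three predictions of which only one lies within two rows and columns of a true table, and two true tables in total, this sketch gives a precision of 1/3 and a recall of 1/2.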



Tablext: A Combined Neural Network And Heuristic Based Table Extractor

A significant portion of the data available today is found within tables...

DEXTER: An end-to-end system to extract table contents from electronic medical health documents

In this paper, we propose DEXTER, an end to end system to extract inform...

Table Tennis Stroke Recognition Using Two-Dimensional Human Pose Estimation

We introduce a novel method for collecting table tennis video data and p...

Current Status and Performance Analysis of Table Recognition in Document Images with Deep Neural Networks

The first phase of table recognition is to detect the tabular area in a ...

Structure-aware Pre-training for Table Understanding with Tree-based Transformers

Tables are widely used with various structures to organize and present d...

Revealing the Semantics of Data Wrangling Scripts With COMANTICS

Data workers usually seek to understand the semantics of data wrangling ...

A Saliency-based Convolutional Neural Network for Table and Chart Detection in Digitized Documents

Deep Convolutional Neural Networks (DCNNs) have recently been applied su...