TableSense: Spreadsheet Table Detection with Convolutional Neural Networks

06/25/2021
by   Haoyu Dong, et al.
0

Spreadsheet table detection is the task of detecting all tables on a given sheet and locating their respective ranges. Automatic table detection is a key enabling technique and an initial step in spreadsheet data intelligence. However, the detection task is challenged by the diversity of table structures and table layouts on the spreadsheet. Considering the analogy between a cell matrix as spreadsheet and a pixel matrix as image, and encouraged by the successful application of Convolutional Neural Networks (CNN) in computer vision, we have developed TableSense, a novel end-to-end framework for spreadsheet table detection. First, we devise an effective cell featurization scheme to better leverage the rich information in each cell; second, we develop an enhanced convolutional neural network model for table detection to meet the domain-specific requirement on precise table boundary detection; third, we propose an effective uncertainty metric to guide an active learning based smart sampling algorithm, which enables the efficient build-up of a training dataset with 22,176 tables on 10,220 sheets with broad coverage of diverse table structures and layouts. Our evaluation shows that TableSense is highly effective with 91.3% recall and 86.5% precision in EoB-2 metric, a significant improvement over both the current detection algorithm that are used in commodity spreadsheet tools and state-of-the-art convolutional neural networks in computer vision.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/22/2021

Tablext: A Combined Neural Network And Heuristic Based Table Extractor

A significant portion of the data available today is found within tables...
research
07/14/2022

DEXTER: An end-to-end system to extract table contents from electronic medical health documents

In this paper, we propose DEXTER, an end to end system to extract inform...
research
04/29/2021

Current Status and Performance Analysis of Table Recognition in Document Images with Deep Neural Networks

The first phase of table recognition is to detect the tabular area in a ...
research
04/20/2021

Table Tennis Stroke Recognition Using Two-Dimensional Human Pose Estimation

We introduce a novel method for collecting table tennis video data and p...
research
04/10/2023

Learning to Detect Touches on Cluttered Tables

We present a novel self-contained camera-projector tabletop system with ...
research
09/28/2022

Revealing the Semantics of Data Wrangling Scripts With COMANTICS

Data workers usually seek to understand the semantics of data wrangling ...
research
04/29/2021

TabAug: Data Driven Augmentation for Enhanced Table Structure Recognition

Table Structure Recognition is an essential part of end-to-end tabular d...

Please sign up or login with your details

Forgot password? Click here to reset