Scatteract: Automated extraction of data from scatter plots

04/21/2017
by Mathieu Cliche, et al.

Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical values of data points from images of scatter plots. We use deep learning techniques to identify the key components of the chart, and optical character recognition together with robust regression to map from pixels to the coordinate system of the chart. We focus on scatter plots with linear scales, which already pose several interesting challenges. Previous work has done fully automatic extraction for other types of charts, but to our knowledge this is the first approach that is fully automatic for scatter plots. Our method performs well, achieving successful data extraction on 89% of the plots in our test set.
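The pixel-to-coordinate step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which robust estimator is used, so the Theil-Sen fit below is a stand-in chosen for illustration, and the function names and tick data are hypothetical. It assumes the tick labels on one axis have already been detected and read by OCR, and that one reading may be wrong.

```python
from itertools import combinations
from statistics import median


def fit_axis(pixels, values):
    """Theil-Sen fit of value = slope * pixel + intercept for one linear axis.

    Illustrative stand-in for the "robust regression" named in the abstract;
    it is not claimed to be the authors' estimator.
    """
    # Slope of every pair of ticks; the median ignores a minority of bad pairs.
    slopes = [
        (v2 - v1) / (p2 - p1)
        for (p1, v1), (p2, v2) in combinations(zip(pixels, values), 2)
        if p2 != p1
    ]
    if not slopes:
        raise ValueError("need at least two ticks at distinct pixel positions")
    slope = median(slopes)
    # Median residual gives an intercept that is also robust to outliers.
    intercept = median(v - slope * p for p, v in zip(pixels, values))
    return slope, intercept


def pixel_to_value(pixel, slope, intercept):
    """Map a pixel position along the axis to chart coordinates."""
    return slope * pixel + intercept


# Hypothetical x-axis ticks: pixel column and OCR'd label.
# The label at pixel 400 is assumed to be misread (30 read as 80).
tick_pixels = [100, 200, 300, 400, 500]
tick_values = [0.0, 10.0, 20.0, 80.0, 40.0]

slope, intercept = fit_axis(tick_pixels, tick_values)
x_value = pixel_to_value(250, slope, intercept)
```

With these made-up ticks the fit recovers a slope of about 0.1 units per pixel and an intercept of about -10 despite the misread label, so a point detected at pixel column 250 maps to roughly 15. An ordinary least-squares fit on the same ticks would be pulled toward the bad reading, which is the motivation for a robust estimator here.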

research
05/07/2020

A Gaussian Process Upsampling Model for Improvements in Optical Character Recognition

Optical Character Recognition and extraction is a key tool in the automa...
research
07/29/2019

Computing the Value of Data: Towards Applied Data Minimalism

We present an approach to compute the monetary value of individual data ...
research
10/04/2016

Micro-Data Learning: The Other End of the Spectrum

Many fields are now snowed under with an avalanche of data, which raises...
research
09/26/2021

Automated Multi-Process CTC Detection using Deep Learning

Circulating Tumor Cells (CTCs) bear great promise as biomarkers in tumor...
research
10/07/2018

A Survey of Neighbourhood Construction Models for Categorizing Data Points

Finding neighbourhood structures is very useful in extracting valuable r...
research
06/09/2021

DREAMS: Drilling and Extraction Automated System

Drilling and Extraction Automated System (DREAMS) is a fully automated p...
research
07/17/2023

SEMI-DiffusionInst: A Diffusion Model Based Approach for Semiconductor Defect Classification and Segmentation

With continuous progression of Moore's Law, integrated circuit (IC) devi...
