Mining Spatio-temporal Data on Industrialization from Historical Registries

12/03/2016
by David Berenbaum, et al.

Despite the growing availability of big data in many fields, historical data on socioenvironmental phenomena are often unavailable because automated, scalable approaches for collecting, digitizing, and assembling them are lacking. We have developed a data-mining method for extracting tabulated, geocoded data from printed directories. Scanning and optical character recognition (OCR) can digitize printed text, but these methods alone do not capture the structure of the underlying data. Our pipeline integrates page layout analysis with OCR to extract tabular, geocoded data from structured text. We demonstrate the utility of this method by applying it to scanned manufacturing registries from Rhode Island that record 41 years of industrial land use. The resulting spatio-temporal data can be used for socioenvironmental analyses of industrialization at a resolution that was not previously possible. In particular, we find strong evidence for the dispersion of manufacturing from the urban core of Providence, the state's capital, along the Interstate 95 corridor to the north and south.
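
To illustrate why layout information matters in addition to OCR text, the following is a minimal sketch, not the authors' implementation. It assumes the OCR step has already produced word bounding boxes; the `Word` type, the `y_tol` tolerance, the `column_starts` boundaries, and the sample directory entries are all hypothetical and made up for illustration. Words are grouped into rows by vertical position and then assigned to columns by horizontal position, which yields one tabular record per directory entry. Geocoding the address field is a separate step that is not shown.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR token with a simplified bounding box (hypothetical format)."""
    text: str
    x: float  # left edge of the word, in pixels
    y: float  # vertical position of the word, in pixels


def group_rows(words, y_tol=5.0):
    """Group words into rows: a word joins the current row when its y is
    within y_tol of the first word placed in that row."""
    rows = []
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        if rows and abs(w.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    return rows


def to_record(row, column_starts):
    """Assign each word to the last column whose start x is <= the word's x,
    then join the words of each column in left-to-right order."""
    cells = [[] for _ in column_starts]
    for w in sorted(row, key=lambda w: w.x):
        idx = max(bisect_right(column_starts, w.x) - 1, 0)
        cells[idx].append(w.text)
    return [" ".join(cell) for cell in cells]


# Made-up entries standing in for two lines of a scanned registry page.
words = [
    Word("Acme", 10, 100), Word("Tool", 60, 101), Word("Co.", 100, 99),
    Word("12", 300, 100), Word("Main", 330, 100), Word("St.", 380, 102),
    Word("Providence", 600, 100),
    Word("Bay", 10, 130), Word("Textiles", 50, 131),
    Word("4", 300, 130), Word("Mill", 320, 129), Word("Rd.", 360, 130),
    Word("Warwick", 600, 131),
]

# Assumed column boundaries: name, street address, city.
column_starts = [0, 300, 600]
records = [to_record(row, column_starts) for row in group_rows(words)]
for record in records:
    print(record)
```

In this sketch the column boundaries are fixed by hand; in practice they would have to be estimated from the page layout itself, which is the part that plain OCR output does not provide.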

research
11/13/2017

Spatio-Temporal Data Mining: A Survey of Problems and Methods

Large volumes of spatio-temporal data are increasingly collected and stu...
research
03/17/2021

A Survey on Spatio-temporal Data Analytics Systems

Due to the surge of spatio-temporal data volume, the popularity of locat...
research
06/11/2019

Deep Learning for Spatio-Temporal Data Mining: A Survey

With the fast development of various positioning techniques such as Glob...
research
09/18/2023

PromptST: Prompt-Enhanced Spatio-Temporal Multi-Attribute Prediction

In the era of information explosion, spatio-temporal data mining serves ...
research
04/23/2020

Real-time Detection of Clustered Events in Video-imaging data with Applications to Additive Manufacturing

The use of video-imaging data for in-line process monitoring application...
research
02/02/2019

Big Data and Geospatial Analysis

Perhaps one of the mostly hotly debated topics in recent years has been ...
research
12/27/2016

Classifying Patents Based on their Semantic Content

In this paper, we extend some usual techniques of classification resulti...