Recognition of Implicit Geographic Movement in Text

01/30/2022
by Scott Pezanowski, et al.

Analyzing the geographic movement of humans, animals, and other phenomena is a growing field of research. This research has benefited urban planning, logistics, the understanding of animal migration, and much more. Typically, movement is captured as precise geographic coordinates and time stamps with Global Positioning Systems (GPS). Although some research uses computational techniques to take advantage of implicit movement in descriptions of route directions, hiking paths, and historical exploration routes, innovation would accelerate with a large and diverse corpus. We created a corpus of sentences labeled by whether they describe geographic movement and by the type of entity moving. Creating this corpus proved difficult because no comparable corpora existed to start from, human labeling is costly, and movement can at times be interpreted differently. To overcome these challenges, we developed an iterative process that employs hand labeling, crowd voting for confirmation, and machine learning to predict more labels. By merging advances in word embeddings with traditional machine learning models and model ensembling, prediction accuracy reaches a level acceptable for producing a large silver-standard corpus despite the small gold-standard training set. Our corpus will likely benefit computational processing of geography in text and spatial cognition, in addition to detection of movement.
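
The ensembling and silver-labeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword-based stand-in models, the names `keyword_model`, `soft_vote`, and `silver_label`, and the 0.8 confidence threshold are all assumptions made for illustration. In practice each model would be a classifier trained on word-embedding features from the gold-standard sentences.

```python
from statistics import mean
from typing import Callable, Optional, Sequence, Set

# A "model" is any callable mapping a sentence to P(sentence describes movement).
Model = Callable[[str], float]


def keyword_model(keywords: Set[str], hit: float = 0.9, miss: float = 0.1) -> Model:
    """Hypothetical stand-in for a trained classifier: returns a high
    probability when the sentence contains any of the given keywords."""
    def predict(sentence: str) -> float:
        tokens = set(sentence.lower().split())
        return hit if tokens & keywords else miss
    return predict


def soft_vote(models: Sequence[Model], sentence: str) -> float:
    """Soft voting: average the movement probabilities of all models."""
    return mean(model(sentence) for model in models)


def silver_label(models: Sequence[Model], sentence: str,
                 threshold: float = 0.8) -> Optional[bool]:
    """Return a silver label only when the ensemble is confident.

    True  -> predicted to describe movement
    False -> predicted not to describe movement
    None  -> uncertain; left for hand labeling or crowd confirmation
    The threshold value is an assumption, not taken from the paper.
    """
    probability = soft_vote(models, sentence)
    if probability >= threshold:
        return True
    if probability <= 1 - threshold:
        return False
    return None


# Three toy models standing in for an ensemble of trained classifiers.
models = [
    keyword_model({"migrated", "traveled"}),
    keyword_model({"north", "across"}),
    keyword_model({"migrated", "flew"}),
]

for text in [
    "The herd migrated north across the valley",
    "The report was published yesterday",
    "The candidate moved north in the polls",
]:
    print(text, "->", silver_label(models, text))
```

Sentences on which the models disagree receive no silver label, which matches the iterative spirit of the process: uncertain cases go back to human labelers rather than into the corpus.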


