Enhancing spatial and textual analysis with EUPEG: an extensible and unified platform for evaluating geoparsers

07/09/2020
by Jimin Wang, et al.

A wealth of geographic information exists in unstructured texts, such as Web pages, social media posts, housing advertisements, and historical archives. Geoparsers are useful tools that extract structured geographic information from unstructured texts, thereby enabling spatial analysis on textual data. While a number of geoparsers have been developed, they were tested on different datasets using different metrics. Consequently, it is difficult to compare existing geoparsers or to compare a new geoparser with existing ones. In recent years, researchers have created open, annotated corpora for testing geoparsers. While these corpora are extremely valuable, a researcher must still invest considerable effort to prepare these datasets and deploy geoparsers for comparative experiments. This paper presents EUPEG: an Extensible and Unified Platform for Evaluating Geoparsers. EUPEG is an open-source, Web-based benchmarking platform that hosts a majority of the open corpora, geoparsers, and performance metrics reported in the literature. It enables direct comparison of the hosted geoparsers, and a new geoparser can be connected to EUPEG and compared with the others. The main objective of EUPEG is to reduce the time and effort that researchers spend preparing datasets and baselines, thereby increasing the efficiency and effectiveness of comparative experiments.
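To make the idea of scoring geoparsers on a shared corpus with a shared metric concrete, here is a minimal sketch in Python. It is not EUPEG's code or API. The `Toponym` type, the `precision_recall_f1` function, and the exact-match criterion (same character span and same name) are all assumptions made for illustration. The abstract does not say which metrics or matching rules EUPEG uses.

```python
from typing import Iterable, NamedTuple, Tuple


class Toponym(NamedTuple):
    """A place-name mention in a text (hypothetical structure for this sketch)."""
    start: int  # character offset where the mention begins
    end: int    # character offset where the mention ends
    name: str   # surface form of the place name


def precision_recall_f1(
    gold: Iterable[Toponym], predicted: Iterable[Toponym]
) -> Tuple[float, float, float]:
    """Score predicted toponyms against gold annotations using exact matching.

    Exact matching (identical span and name) is an assumption of this sketch;
    other matching criteria are possible.
    """
    gold_set = set(gold)
    pred_set = set(predicted)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Toy data: one annotated document and the output of one hypothetical geoparser.
gold = [Toponym(0, 5, "Paris"), Toponym(20, 26, "London")]
predicted = [Toponym(0, 5, "Paris"), Toponym(30, 36, "Berlin")]

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Running the same function over the output of several geoparsers on the same annotated corpus gives directly comparable scores, which is the kind of comparison the abstract describes.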


