Web scraping: a promising tool for geographic data acquisition

05/31/2023
by Alexander Brenning, et al.

With much of our lives taking place online, researchers are increasingly turning to information from the World Wide Web to gain insights into geographic patterns and processes. Web scraping as an online data acquisition technique allows us to gather intelligence, especially on social and economic actions for which the Web serves as a platform. Specific opportunities relate to near-real-time access to object-level geolocated data, which can be captured in a cost-effective way. The geographic phenomena studied include, but are not limited to, the rental market and associated processes such as gentrification, entrepreneurial ecosystems, and spatial planning processes. Since the information retrieved from the Web was not made available for research purposes, Web scraping faces a number of distinct challenges, several of which relate to location. Ethical and legal issues mainly relate to intellectual property rights, informed consent and (geo-)privacy, and website integrity and contract. These issues also affect the practice of open science. In addition, there are technical and statistical challenges that relate to dependability and incompleteness, data inconsistencies and bias, and limited historical coverage. Furthermore, geospatial analyses usually require the automated extraction and subsequent resolution of toponyms or addresses (geoparsing, geocoding). A study of apartment rents in Leipzig, Germany, illustrates the use of Web scraping and its challenges. We conclude that geographic researchers should embrace Web scraping as a powerful and affordable digital fieldwork tool while paying special attention to its legal, ethical, and methodological challenges.


research
06/13/2022

Tackling Algorithmic Disability Discrimination in the Hiring Process: An Ethical, Legal and Technical Analysis

Tackling algorithmic discrimination against persons with disabilities (P...
research
05/18/2020

Ethical Issues Regarding the Use of AI Profiling Services for Recruiting: The Japanese Rikunabi Data Scandal

The ethical, legal, and social challenges involved in the use of profili...
research
11/02/2016

Bots as Virtual Confederates: Design and Ethics

The use of bots as virtual confederates in online field experiments hold...
research
03/23/2023

Web 3.0: The Future of Internet

With the rapid growth of the Internet, human daily life has become deepl...
research
06/16/2021

Benefits, Challenges and Contributors to Success for National eHealth Systems Implementation: A Scoping Review

Our scoping review aims to assess what legal, ethical, and socio-technic...
research
09/23/2020

The Agent Web Model – Modelling web hacking for reinforcement learning

Website hacking is a frequent attack type used by malicious actors to ob...
research
03/27/2017

Discovering Scholarly Orphans Using ORCID

Archival efforts such as (C)LOCKSS and Portico are in place to ensure th...
