A Comparative Analysis of Content-based Geolocation in Blogs and Tweets

11/19/2018
by Konstantinos Pappas, et al.

The geolocation of online information is an essential component of any geospatial application. While most previous work on geolocation has focused on Twitter, in this paper we quantify and compare the performance of text-based geolocation methods on social media data drawn from both Blogger and Twitter. We introduce a novel set of location-specific features that are both highly informative and easily interpretable, and show that they achieve error rate reductions of up to 12.5% relative to previously proposed geolocation features. We also show that, despite posting longer text, Blogger users are significantly harder to geolocate than Twitter users. Additionally, we investigate the effect of training and testing on different media (cross-media predictions) and of combining multiple social media sources (multi-media predictions). Finally, we explore the geolocability of social media in relation to three user dimensions: state, gender, and industry.
