Robust clustering of languages across Wikipedia growth

09/18/2017
by Kristina Ban et al.

Wikipedia is the largest existing knowledge repository, and it grows through genuine crowdsourcing. While the English Wikipedia, with over five million articles, is the most extensive and the most researched, comparatively little is known about the behavior and growth of the remaining 283 smaller Wikipedias, the smallest of which, Afar, has only one article. Here we use a subset of these data, consisting of 14,962 different articles, each of which exists in 26 different languages, from Arabic to Ukrainian. We study the growth of Wikipedias in these languages over a time span of 15 years. We show that, while an average article follows a random path from one language to another, there exist six well-defined clusters of Wikipedias that share common growth patterns. The make-up of these clusters is remarkably robust against the method used to determine them, as we verify with four different clustering methods. Interestingly, the identified Wikipedia clusters show little correlation with language families and groups. Rather, the growth of Wikipedia across different languages is governed by a variety of factors, ranging from similarities in culture to information literacy.
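
One common way to quantify how robust a clustering is against the choice of method is to compare the partitions that two methods produce, pair by pair. The sketch below computes the plain Rand index for that purpose. It is an illustration only: the abstract does not say which agreement measure or which four clustering methods were used, and the function name `rand_index` and the toy label lists are hypothetical, not data from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two
    items in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Toy, made-up cluster labels for six items (NOT results from the paper).
method_1 = [0, 0, 1, 1, 2, 2]
method_2 = [1, 1, 0, 0, 2, 2]  # same partition, cluster names permuted
method_3 = [0, 0, 0, 1, 2, 2]  # one item assigned to a different cluster

print(rand_index(method_1, method_2))
print(rand_index(method_1, method_3))
```

A value of 1 means the two methods produce the same grouping regardless of how the clusters are named; values below 1 indicate that some pairs of items are grouped differently. The plain Rand index is not corrected for chance agreement, so a chance-adjusted variant may be preferable in practice.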


