Language-agnostic Topic Classification for Wikipedia

02/26/2021
by Isaac Johnson, et al.

A major challenge for many analyses of Wikipedia dynamics – e.g., imbalances in content quality, geographic differences in what content is popular, what types of articles attract more editor discussion – is grouping the very diverse range of Wikipedia articles into coherent, consistent topics. This problem has been addressed using various approaches based on Wikipedia's category network, WikiProjects, and external taxonomies. However, these approaches have always been limited in their coverage: typically, only a small subset of articles can be classified, or the method cannot be applied across the more than 300 languages on Wikipedia. In this paper, we propose a language-agnostic approach that classifies articles into a taxonomy of topics based on the links in each article and that can be easily applied to (almost) any language and article on Wikipedia. We show that it matches the performance of a language-dependent approach while being simpler and having much greater coverage.

research
04/18/2019

Societal Controversies in Wikipedia Articles

Collaborative content creation inevitably reaches situations where diffe...
research
07/26/2023

Measuring Americanization: A Global Quantitative Study of Interest in American Topics on Wikipedia

We conducted a global comparative analysis of the coverage of American t...
research
09/23/2020

Crosslingual Topic Modeling with WikiPDA

We present Wikipedia-based Polyglot Dirichlet Allocation (WikiPDA), a cr...
research
02/17/2020

What is Trending on Wikipedia? Capturing Trends and Language Biases Across Wikipedia Editions

In this work, we propose an automatic evaluation and comparison of the b...
research
09/18/2018

Mind Your POV: Convergence of Articles and Editors Towards Wikipedia's Neutrality Norm

Wikipedia has a strong norm of writing in a 'neutral point of view' (NPO...
research
11/02/2021

Quality change: norm or exception? Measurement, Analysis and Detection of Quality Change in Wikipedia

Wikipedia has been turned into an immensely popular crowd-sourced encycl...
research
01/04/2017

World Literature According to Wikipedia: Introduction to a DBpedia-Based Framework

Among the manifold takes on world literature, it is our goal to contribu...
