Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability

02/28/2019
by Miriam Redi, et al.

Wikipedia is playing an increasingly central role on the web, and the policies its contributors follow when sourcing and fact-checking content affect millions of readers. Among these core guiding principles, verifiability policies have a particularly important role. Verifiability requires that information included in a Wikipedia article be corroborated against reliable secondary sources. Because of the manual labor needed to curate and fact-check Wikipedia at scale, however, its contents do not always evenly comply with these policies. Citations (i.e., references to external sources) may not conform to verifiability requirements or may be missing altogether, potentially weakening the reliability of specific topic areas of the free encyclopedia. In this paper, we aim to provide an empirical characterization of the reasons why and how Wikipedia cites external sources to comply with its own verifiability guidelines. First, we construct a taxonomy of reasons why inline citations are required by collecting labeled data from editors of multiple Wikipedia language editions. We then collect a large-scale crowdsourced dataset of Wikipedia sentences annotated with categories derived from this taxonomy. Finally, we design and evaluate algorithmic models to determine if a statement requires a citation, and to predict the citation reason based on our taxonomy. We evaluate the robustness of such models across different classes of Wikipedia articles of varying quality, as well as on an additional dataset of claims annotated for fact-checking purposes.
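
To make the "does this statement require a citation" task concrete, the following is a minimal sketch of it framed as binary sentence classification. It is not the models described in the paper: it swaps in a plain multinomial naive Bayes baseline with add-one smoothing, and the class name `CitationNeedClassifier`, the helper `tokenize`, and the six training sentences are all hypothetical, invented here purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


class CitationNeedClassifier:
    """Hypothetical baseline: multinomial naive Bayes with add-one smoothing.

    Label True means "needs a citation"; False means "no citation needed".
    """

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def _log_score(self, tokens, label):
        # Log prior of the class plus smoothed log likelihood of each known token.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        denominator = sum(self.word_counts[label].values()) + len(self.vocab)
        for token in tokens:
            if token in self.vocab:  # tokens never seen in training are skipped
                score += math.log((self.word_counts[label][token] + 1) / denominator)
        return score

    def needs_citation(self, sentence):
        tokens = tokenize(sentence)
        return self._log_score(tokens, True) > self._log_score(tokens, False)


# Toy training data, made up for this sketch (not taken from the paper's dataset).
TRAIN = [
    ("According to the study, 40 percent of patients recovered.", True),
    ("The report claimed that the company lost 2 billion dollars.", True),
    ("Critics said the film was the worst of the decade.", True),
    ("The sky is blue.", False),
    ("Paris is the capital of France.", False),
    ("A week has seven days.", False),
]

clf = CitationNeedClassifier().fit(
    [sentence for sentence, _ in TRAIN],
    [label for _, label in TRAIN],
)
```

With this toy data, `clf.needs_citation("The study claimed that 60 percent of critics disagreed.")` evaluates to `True`, while `clf.needs_citation("Paris is the capital.")` evaluates to `False`. Predicting the citation reason would be the multi-class analogue, with one label per taxonomy category instead of a boolean.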

research
07/14/2020

Wikipedia Citations: A comprehensive dataset of citations with identifiers extracted from English Wikipedia

Wikipedia's contents are based on reliable and published sources. To thi...
research
06/30/2021

A preliminary approach to knowledge integrity risk assessment in Wikipedia projects

Wikipedia is one of the main repositories of free knowledge available to...
research
07/23/2017

Fine Grained Citation Span for References in Wikipedia

Verifiability is one of the core editing principles in Wikipedia, editor...
research
07/08/2022

Improving Wikipedia Verifiability with AI

Verifiability is a core content policy of Wikipedia: claims that are lik...
research
10/14/2019

Online Disinformation and the Role of Wikipedia

The aim of this study is to find key areas of research that can be usefu...
research
10/13/2021

Refcat: The Internet Archive Scholar Citation Graph

As part of its scholarly data efforts, the Internet Archive (IA) release...
research
04/20/2018

Approaches for Enriching and Improving Textual Knowledge Bases

Verifiability is one of the core editing principles in Wikipedia, where ...
