Longitudinal Assessment of Reference Quality on Wikipedia

by Aitolkyn Baigutanova, et al.

Wikipedia plays a crucial role in the integrity of the Web. This work analyzes the reliability of this global encyclopedia through the lens of its references. We operationalize the notion of reference quality by defining reference need (RN), i.e., the percentage of sentences missing a citation, and reference risk (RR), i.e., the proportion of non-authoritative references. We release Citation Detective, a tool for automatically calculating the RN score, and find that the RN score has dropped by 20 percentage points in the last decade, with more than half of verifiable statements now accompanied by references. The RR score has remained below 1% thanks to the efforts of the community to eliminate unreliable references. We propose pairing novice and experienced editors on the same Wikipedia article as a strategy to enhance reference quality. Our quasi-experiment indicates that such a co-editing experience can result in a lasting advantage in identifying unreliable sources in future edits. As Wikipedia is frequently used as the ground truth for numerous Web applications, our findings and suggestions on its reliability can have a far-reaching impact. We discuss the possibility of other Web services adopting Wiki-style user collaboration to eliminate unreliable content.

