Wikipedia Vandal Early Detection: from User Behavior to User Embedding

06/03/2017
by Shuhan Yuan, et al.

Wikipedia is the largest online encyclopedia that allows anyone to edit articles. In this paper, we propose the use of deep learning to detect vandals based on their edit history. In particular, we develop a multi-source long short-term memory network (M-LSTM) that models user behavior from several aspects of each edit, including whether the edit was reverted, the title of the edited page, and the page's categories. With M-LSTM, we can encode each user as a low-dimensional real-valued vector, called a user embedding. Because M-LSTM is a sequential model, it updates the user embedding each time the user commits a new edit. Thus, we can dynamically predict whether a user is benign or a vandal based on the up-to-date user embedding. Furthermore, these user embeddings are crucial for discovering collaborative vandals.
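To make the idea of an embedding that is refreshed after every edit more concrete, here is a minimal, untrained sketch in plain Python. It is not the authors' M-LSTM: it assumes the sources (a reversion flag, a page-title vector, a category vector) are simply concatenated and fed to a standard LSTM cell, and it reads the hidden state out as the user embedding. The class name `TinyMultiSourceLSTM`, all dimensions, the random weights, and the feature values are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class TinyMultiSourceLSTM:
    """Toy LSTM cell; its input at each step concatenates several edit features.

    Assumption: concatenation stands in for whatever source-specific
    handling the paper's M-LSTM actually uses.
    """

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        cols = input_dim + hidden_dim
        # One weight matrix and bias per gate, applied to [features ; previous h].
        self.weights = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                   for _ in range(hidden_dim)]
            for gate in self.GATES
        }
        self.biases = {gate: [0.0] * hidden_dim for gate in self.GATES}
        # Untrained logistic read-out from the embedding to a vandal score.
        self.readout = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def init_state(self):
        return [0.0] * self.hidden_dim, [0.0] * self.hidden_dim

    def step(self, sources, state):
        """Consume one edit (a list of feature vectors) and return the new state."""
        hidden, cell = state
        x = [value for source in sources for value in source] + hidden
        pre = {
            gate: [p + bias for p, bias in
                   zip(matvec(self.weights[gate], x), self.biases[gate])]
            for gate in self.GATES
        }
        i_gate = [sigmoid(v) for v in pre["input"]]
        f_gate = [sigmoid(v) for v in pre["forget"]]
        o_gate = [sigmoid(v) for v in pre["output"]]
        cand = [math.tanh(v) for v in pre["candidate"]]
        new_cell = [f * c + i * g
                    for f, c, i, g in zip(f_gate, cell, i_gate, cand)]
        new_hidden = [o * math.tanh(c) for o, c in zip(o_gate, new_cell)]
        return new_hidden, new_cell

    def vandal_probability(self, embedding):
        return sigmoid(sum(w * v for w, v in zip(self.readout, embedding)))


# Hypothetical edit history: (reverted flag, title vector, category vector).
edits = [
    ([1.0], [0.2, -0.1, 0.4], [0.0, 1.0]),
    ([0.0], [0.5, 0.3, -0.2], [1.0, 0.0]),
]

model = TinyMultiSourceLSTM(input_dim=1 + 3 + 2, hidden_dim=4)
state = model.init_state()
scores = []
for reverted, title_vec, category_vec in edits:
    state = model.step([reverted, title_vec, category_vec], state)
    user_embedding = state[0]  # hidden state after the latest edit
    scores.append(model.vandal_probability(user_embedding))
```

Because the weights are random, the scores carry no meaning; the point is only that `user_embedding` and the score are recomputed after each edit, which is what would allow a trained model to flag a user early rather than after the full history is available.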

research
04/18/2019

Societal Controversies in Wikipedia Articles

Collaborative content creation inevitably reaches situations where diffe...
research
01/12/2018

Can Who-Edits-What Predict Edit Survival?

The Internet has enabled the emergence of massive online collaborative p...
research
01/25/2022

Attention-Based Vandalism Detection in OpenStreetMap

OpenStreetMap (OSM), a collaborative, crowdsourced Web map, is a unique ...
research
10/19/2017

Reti bayesiane per lo studio del fenomeno degli incidenti stradali tra i giovani in Toscana

This paper aims to analyse adolescents' road accidents in Tuscany. The a...
research
10/27/2022

Leveraging Wikidata's edit history in knowledge graph refinement tasks

Knowledge graphs have been adopted in many diverse fields for a variety ...
research
08/28/2018

Learning To Split and Rephrase From Wikipedia Edit History

Split and rephrase is the task of breaking down a sentence into shorter ...
research
01/17/2018

Interactive in-base street model edit: how common GIS software and a database can serve as a custom Graphical User Interface

Our modern world produces an increasing quantity of data, and especially...