ORES: Lowering Barriers with Participatory Machine Learning in Wikipedia

09/11/2019
by Aaron Halfaker, et al.

Algorithmic systems -- from rule-based bots to machine learning classifiers -- have a long history of supporting content moderation and other essential curation work in peer production projects. From counter-vandalism to task routing, basic machine prediction has allowed open knowledge projects like Wikipedia to scale to become the largest encyclopedia in the world while maintaining quality and consistency. However, conversations about how quality control should work and what role algorithms should play have generally been led by the expert engineers who have the skills and resources to develop and modify these complex algorithmic systems. In this paper, we describe ORES: an algorithmic scoring service that supports real-time scoring of wiki edits using multiple independent classifiers trained on different datasets. ORES decouples several activities that have typically all been performed by engineers: choosing or curating training data, building models to serve predictions, auditing predictions, and developing interfaces or automated agents that act on those predictions. This meta-algorithmic system was designed to open up socio-technical conversations about algorithmic systems in Wikipedia to a broader set of participants. We discuss the theoretical mechanisms of social change that ORES enables and detail case studies in participatory machine learning around ORES from the four years since its deployment.
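
The decoupling described in the abstract can be pictured with a small sketch. This is not the ORES implementation or its API: the `ScoringService` class, the `register` and `score` methods, the feature names, and the two rule-based stand-in models are all hypothetical, chosen only to show how model building, scoring, auditing, and acting on predictions might be kept as separate roles.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration; none of these names come from ORES itself.
Edit = Dict[str, Any]               # features describing one wiki edit
Model = Callable[[Edit], float]     # returns a score in [0, 1]


@dataclass
class ScoringService:
    """Toy registry that serves scores from several independent models."""
    models: Dict[str, Model] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)

    def register(self, name: str, model: Model) -> None:
        # Model builders plug in here; the service does not care how a model was trained.
        self.models[name] = model

    def score(self, edit: Edit, names: List[str]) -> Dict[str, float]:
        # Each requested model scores the edit independently of the others.
        scores = {name: self.models[name](edit) for name in names}
        # Keep a record so that predictions can be audited later.
        self.audit_log.append({"edit": edit, "scores": scores})
        return scores


# Two stand-in "classifiers": simple rules, not trained models.
def damaging(edit: Edit) -> float:
    return 0.9 if edit["chars_removed"] > 1000 else 0.1


def goodfaith(edit: Edit) -> float:
    return 0.3 if edit["is_anonymous"] else 0.8


service = ScoringService()
service.register("damaging", damaging)
service.register("goodfaith", goodfaith)

# A client tool (for example, a patrolling interface) decides how to act on the scores.
edit = {"chars_removed": 1500, "is_anonymous": True}
scores = service.score(edit, ["damaging", "goodfaith"])
needs_review = scores["damaging"] > 0.5
```

In this sketch the threshold of 0.5 lives in the client rather than in the service, which is one way to read the abstract's point that acting on predictions is a separate activity from serving them.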

