ionicsolutions / kokolores

ORES models for automatic review of edits on Wikipedia

kokolORES

Exploring the use of data from Flagged Revisions to create ORES models for the automatic review of edits on Wikipedia.

What are Flagged Revisions?

Flagged Revisions is a MediaWiki extension used by some language versions of Wikipedia and other Wikimedia projects to organize the review of edits.

Wikipedias use Flagged Revisions in several different ways. Some, most notably the German-language Wikipedia, require review of all edits made by unregistered editors or editors with a newly registered account before they are shown to the general public, whereas many others only flag certain pages for review (e.g. the English-language Wikipedia's pending changes configuration). An overview of configurations as of 2016 is available on meta.wikimedia.org; the latest configuration can be found in flaggedrevs.php.

What is ORES?

The Objective Revision Evaluation Service (ORES) is a web service run by the Wikimedia Foundation's Scoring Platform team. ORES provides several machine learning models that score edits made to Wikipedia. A comprehensive description of ORES, the motivation of its creators, and some use cases can be found in this 2018 paper.
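To make "scoring edits" concrete, here is a minimal sketch of how a client might build a request URL for ORES and read a probability out of the decoded JSON response. It assumes the public v3 scoring endpoint and its nested response layout; the helper names `build_scores_url` and `extract_probability`, the `dewiki` context, and the revision IDs and probabilities in the usage note are illustrative only and are not part of kokolores.

```python
from urllib.parse import urlencode

# Assumption: base URL of the public ORES v3 scoring endpoint.
ORES_BASE = "https://ores.wikimedia.org/v3/scores"


def build_scores_url(context, rev_ids, models=("damaging",)):
    """Build a URL requesting scores for the given revisions.

    context: wiki database name, e.g. "dewiki" (illustrative)
    rev_ids: iterable of revision IDs
    models:  names of the models to apply
    """
    query = urlencode({
        "models": "|".join(models),
        "revids": "|".join(str(r) for r in rev_ids),
    })
    return f"{ORES_BASE}/{context}/?{query}"


def extract_probability(response, context, rev_id, model="damaging"):
    """Return the probability of the "true" class for one revision.

    `response` is the decoded JSON body; the nesting used here is the
    assumed v3 layout: context -> scores -> revision -> model -> score.
    """
    score = response[context]["scores"][str(rev_id)][model]["score"]
    return score["probability"]["true"]
```

For example, `build_scores_url("dewiki", [123, 456])` yields a URL whose query string lists both revisions separated by an encoded `|`. The response body could then be decoded with `json.loads` and passed to `extract_probability`.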

Why this project?

Data recorded through the use of Flagged Revisions constitutes a large, human-labeled dataset of accepted and rejected edits. Especially for Wikipedia language versions where a large number of edits have been reviewed, this is a potentially valuable resource for training machine learning models to judge edit quality. In a first step, kokolores will focus on the compilation and documentation of data sets from Flagged Revisions.

The use of Flagged Revisions, especially in the 'German' configuration where only reviewed edits are shown to the general public, has long been controversial for a variety of reasons, which are summarized in the Wikimedia Foundation's request for comment regarding new deployments of Flagged Revisions. One of the problems faced by Wikipedias using Flagged Revisions is that the queue of not-yet-reviewed edits can grow very large, and reviewing that many edits burdens the volunteer communities, most of which are shrinking. Having to wait hours or even days for an edit to become visible can also be very frustrating for new editors.

For these Wikipedias, it has been proposed to use ORES to auto-review at least some of the edits to reduce the backlog (T165848). In a second step, kokolores will investigate whether models trained on Flagged Revisions data can help to substantially reduce the backlog of edits awaiting review while maintaining a high level of accuracy. It will be especially interesting to compare their performance to models trained on the general ORES edit quality data sets compiled through Wiki labels. (A first investigation of a model trained on a subset of Flagged Revisions data from the Finnish-language Wikipedia is documented in T166235.)
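One way to quantify the trade-off between backlog reduction and accuracy is to ask: at a required precision, what share of edits could a model accept without human review? The following is a minimal sketch of that calculation, not the evaluation used by kokolores. The function name `auto_review_coverage`, the default precision target, and the input format (one model score and one human review outcome per edit) are assumptions made for illustration.

```python
def auto_review_coverage(scores, accepted, min_precision=0.99):
    """Find the largest share of edits that could be auto-accepted.

    scores:   model probability that each edit is acceptable
    accepted: human review outcome per edit (True if accepted)
    Returns (threshold, coverage): auto-accepting every edit with a
    score >= threshold covers `coverage` of all edits while keeping
    precision at or above `min_precision`. Returns (None, 0.0) if no
    threshold satisfies the target.
    """
    ranked = sorted(zip(scores, accepted), key=lambda p: p[0], reverse=True)
    best = (None, 0.0)
    good = 0
    for i, (score, label) in enumerate(ranked, start=1):
        good += bool(label)
        # Evaluate only at the end of a run of tied scores, because a
        # threshold cannot separate edits with identical scores.
        if i < len(ranked) and ranked[i][0] == score:
            continue
        if good / i >= min_precision:
            best = (score, i / len(ranked))
    return best
```

Running this on held-out Flagged Revisions data for two models (one trained on Flagged Revisions data, one on Wiki labels data) would give directly comparable coverage figures at the same precision target.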

If successful, in a potential third step, kokolores will facilitate the development of an interface between ORES and Flagged Revisions. This could be a bot which is granted the right to review pages, or a more direct means.

This third step would open up the possibility of using ORES for Flagged Revisions on Wikipedias that follow the 'English' approach, where only edits to specially marked pages require review. For this group of Wikipedias, the use of ORES to mark edits for review was proposed in 2016 (T150593), but the corresponding patch was not deployed because the underlying Deferred Changes process (T118696) was never adopted by the English-language Wikipedia's community. A similar proposal, which asks to flag edits for review based on ORES scores, can be found in T132901. Note that in these cases, the full edit quality models already present in ORES are expected to be more suitable than models trained on Flagged Revisions data.

In the long run, systems like ORES might allow for all Wikipedias to adopt a refined variant of the English Wikipedia's 'pending changes' process, where an AI flags only those edits for review which indeed require human intervention, while outright rejecting edits that are clearly harmful (so-called vandalism or editing mistakes) and immediately approving edits which benefit Wikipedia, even if they are made to an otherwise controversial article or by a new editor.
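The three-way outcome described above (approve, reject, or send to a human) can be sketched as a simple threshold rule on a model's damage probability. This is a hypothetical illustration: the `Decision` enum, the `triage` function, and both default thresholds are assumptions, and real thresholds would have to be chosen per wiki from evaluation data.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"  # show the edit immediately
    REVIEW = "review"    # queue the edit for a human reviewer
    REJECT = "reject"    # treat the edit as clearly harmful


def triage(p_damaging, approve_below=0.05, reject_above=0.95):
    """Map a damage probability to one of three outcomes.

    Both thresholds are placeholder values, not recommendations.
    """
    if not 0.0 <= approve_below <= reject_above <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= approve <= reject <= 1")
    if p_damaging < approve_below:
        return Decision.APPROVE
    if p_damaging > reject_above:
        return Decision.REJECT
    # Everything the model is unsure about stays with human reviewers.
    return Decision.REVIEW
```

Widening the gap between the two thresholds sends more edits to human reviewers; narrowing it reduces the backlog at the cost of more automated mistakes.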

About

License: MIT License


Languages: Python 92.7%, HTML 7.3%