ORES

Caution: Information on this page is outdated. See phab:T419148.

The ORES infrastructure is being deprecated in favor of Machine Learning/LiftWing, as described in Machine_Learning/Modernization. The Machine Learning team has the following high-level timeline:

Deploy all ORES/Revscoring models on Lift Wing.
- See Machine_Learning/LiftWing/API for how to query the new models (from the internal WMF network, from the public Internet, and from the Wikimedia Cloud Infra).
- All ORES models are currently being served by Lift Wing!
Build a Kubernetes service called ores-legacy that offers the same interface as https://ores.wikimedia.org/ but calls Lift Wing in the background (more details in https://phabricator.wikimedia.org/T330414).
- DONE, the endpoint is live: https://ores-legacy.wikimedia.org/docs
Move the ORES MediaWiki Extension to Lift Wing (same interface and configuration, but a different backend).
- DONE, more info in https://phabricator.wikimedia.org/T319170
Deprecate the revision-score stream from https://stream.wikimedia.org/. We are going to provide one stream per model rather than a single combined stream. If you need a particular stream for a specific model, please ping the Machine Learning team.
- DONE
Move https://ores.wikimedia.org/ to https://ores-legacy.wikimedia.org (via DNS CNAME) so that users who have not yet migrated to Lift Wing are transparently migrated.
- DONE
Migrate all clients to the Lift Wing API. This is a long-term process, and it will likely take several months.
- IN PROGRESS
Decommission https://ores-legacy.wikimedia.org. After this deadline, all clients will need to use Lift Wing; no ORES-like APIs will remain available.
- Scheduled for 2024 (we don't have a clear timeline yet).
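The "same interface" point in the timeline above can be illustrated with a small sketch. It builds an ORES-style scores URL for both hosts and shows that only the host changes. The path layout (/v3/scores/<wiki>/) and the pipe-separated `models` and `revids` parameters are assumptions based on the historical ORES v3 API, and `scores_url` is a hypothetical helper, so check https://ores-legacy.wikimedia.org/docs for the authoritative interface.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hosts taken from the timeline above.
ORES_HOST = "https://ores.wikimedia.org"
ORES_LEGACY_HOST = "https://ores-legacy.wikimedia.org"


def scores_url(host, wiki, models, rev_ids):
    """Build an ORES-style scores URL (hypothetical helper).

    Assumption: the path /v3/scores/<wiki>/ and the pipe-separated
    `models` and `revids` query parameters follow the ORES v3 API.
    """
    query = urlencode({
        "models": "|".join(models),
        "revids": "|".join(str(rev_id) for rev_id in rev_ids),
    })
    return f"{host}/v3/scores/{wiki}/?{query}"


old_url = scores_url(ORES_HOST, "enwiki", ["damaging", "goodfaith"], [123456])
new_url = scores_url(ORES_LEGACY_HOST, "enwiki", ["damaging", "goodfaith"], [123456])

# Only the host differs; the path and query string are identical.
print(urlsplit(old_url).netloc, "->", urlsplit(new_url).netloc)
print(urlsplit(new_url).path, parse_qs(urlsplit(new_url).query))
```

Since https://ores.wikimedia.org/ is now a CNAME for ores-legacy, existing clients should not need even this host change until ores-legacy is decommissioned.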

Machine Learning contacts

For any questions or doubts, please reach out to us:

Email: ml@wikimedia.org
Phabricator: #Machine-Learning-Team project tag
IRC (Libera): #wikimedia-ml

Bot/tool owners

If you own a tool that uses ORES (bots, dashboards, etc.), or if you have a service that depends on ORES, and you have concerns, doubts or questions, we invite you to:

Read the ORES to Lift Wing migration guide to understand the main differences between ORES and Lift Wing, and how to migrate.
Read about the new language-agnostic Revert Risk model that the WMF Research team recently released as a replacement for both the goodfaith and damaging models.
Read the docs in Machine_Learning/LiftWing/API/External_usage to figure out how to use Lift Wing.
Create a Phabricator task with the Machine-Learning-Team project tag to start a conversation with us. We hope to help make your transition to Lift Wing easy and smooth!
Ping us in the #wikimedia-ml IRC channel on Libera Chat.
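As a rough starting point for tool owners, the following sketch shows what a request to Lift Wing from the outside Internet might look like. The base URL, the `revertrisk-language-agnostic` model name and the `rev_id`/`lang` payload fields are assumptions based on the public Lift Wing documentation at the time of writing, and the helper names and User-Agent string are hypothetical. Machine_Learning/LiftWing/API/External_usage remains the authoritative reference.

```python
import json
import urllib.request

# Assumption: public Lift Wing endpoint pattern behind the Wikimedia API Gateway.
LIFTWING_BASE = "https://api.wikimedia.org/service/lw/inference/v1/models"


def build_predict_request(model, payload, user_agent):
    """Return a POST request for `<model>:predict` without sending it."""
    return urllib.request.Request(
        url=f"{LIFTWING_BASE}/{model}:predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "User-Agent": user_agent},
        method="POST",
    )


def predict(request, timeout=10):
    """Send the request and decode the JSON response (needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_predict_request(
    "revertrisk-language-agnostic",      # assumed model name
    {"rev_id": 123456, "lang": "en"},    # assumed payload fields
    "my-tool/0.1 (hypothetical contact: me@example.org)",
)
print(request.get_method(), request.full_url)
# Call predict(request) to actually query the service.
```

Building the request separately from sending it makes it easy to inspect or log what a bot is about to send before it touches the network.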

Streams deprecation

The revision-score stream (available via https://stream.wikimedia.org/) contains, for each revision ID on most wikis, a list of scores from multiple models. For example, rev-id 123456 from enwiki will be associated with the scores from goodfaith, damaging, reverted, etc. This approach is hard to maintain, since it makes it very difficult to disentangle or deprecate individual models in the stream. It also makes it harder to track down users of specific models, since the consumers are very generic and carry no indication of which data they are interested in (for example, a client could consume the whole revision-score stream only to get goodfaith scores).

The Machine Learning team is planning to deprecate the revision-score stream and to create smaller streams (one for each model) as requested by the community. Please reach out to us if you use the revision-score stream so we can figure out how to proceed!
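To make the problem with the combined stream concrete, here is a minimal sketch of a consumer that only cares about goodfaith. The event shape (a `scores` mapping keyed by model name, each with `prediction` and `probability`) is an assumption for illustration, and the sample event and its values are invented; the real schema may differ.

```python
def extract_model_score(event, model):
    """Return the score entry for `model` from one event, or None.

    Assumption: each revision-score event carries a `scores` mapping
    keyed by model name.
    """
    return event.get("scores", {}).get(model)


# Hypothetical event, invented for illustration only.
sample_event = {
    "database": "enwiki",
    "rev_id": 123456,
    "scores": {
        "goodfaith": {"prediction": ["true"], "probability": {"true": 0.97, "false": 0.03}},
        "damaging": {"prediction": ["false"], "probability": {"true": 0.05, "false": 0.95}},
        "reverted": {"prediction": ["false"], "probability": {"true": 0.04, "false": 0.96}},
    },
}

# The client receives every model's scores but keeps only one of them.
goodfaith = extract_model_score(sample_event, "goodfaith")
print(sample_event["rev_id"], goodfaith["probability"]["true"])
```

The filtering happens entirely on the client side, which is why the stream producers cannot tell which models a given consumer depends on. A per-model stream would make that dependency explicit.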

Model deprecation

The following ORES revscoring-based models are also being deprecated:

editquality goodfaith
editquality damaging
editquality reverted

By "deprecated" we mean that the models will be available on Lift Wing for the time being, but the ML team is not going to improve them further (re-training, adding support for new wikis, new data labeling, etc.). We are not going to remove them from Lift Wing without a community consultation/approval.

As stated above, these models have been deployed to Lift Wing so they are available for the time being, but if you rely on them we suggest following up with the ML team in a Phabricator task (with the #Machine-Learning-Team project tag).

We'd like to move clients to more modern models (see below) as soon as possible. Tentative deadline: January 2025.

The Research team created a new family of models, called Revert Risk, to replace the functionality of the editquality ones (goodfaith, damaging and reverted). The idea is to have a single score instead of multiple ones, for several reasons. For example, ORES relies on (relatively) small, manually annotated datasets; this is good in terms of precision, but it makes it very difficult to retrain the models, add new languages, and capture data/behavior drift. The idea with Revert Risk models is to use revision reverts as "implicit annotations", allowing us to train on large amounts of data and for all languages. Also, we noticed a strong inverse correlation between the goodfaith and damaging models, so in practice people tend to think of goodfaith as 1 - damaging. The Revert Risk models capture different signals, but the final intent is the same, so you can assume similar usage.
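The "similar usage" point can be sketched as follows: a client that flags edits by thresholding damaging, or equivalently 1 - goodfaith, could apply the same logic to a single Revert Risk probability. The function names, the threshold and the probability values are all hypothetical, and a threshold tuned for the ORES models would probably need re-tuning for Revert Risk, since the models capture different signals.

```python
def needs_review(bad_edit_probability, threshold):
    """Flag a revision when its 'bad edit' probability reaches the threshold."""
    return bad_edit_probability >= threshold


def bad_edit_probability_from_goodfaith(goodfaith_probability):
    """Approximate a damaging-style score as 1 - goodfaith, as described above."""
    return 1.0 - goodfaith_probability


THRESHOLD = 0.5  # hypothetical value, not a recommended setting

# Old style: two ORES scores that carry roughly the same signal.
print(needs_review(0.8, THRESHOLD))                                       # damaging score
print(needs_review(bad_edit_probability_from_goodfaith(0.2), THRESHOLD))  # goodfaith score

# New style: one Revert Risk probability, used the same way.
print(needs_review(0.3, THRESHOLD))
```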

If we consider the damaging model as a predictor of reverts (which makes sense, because damaging revisions must be reverted), Revert Risk outperforms ORES in almost all scenarios. That said, if we check revision by revision, there may be cases where ORES captures certain vandalism that Revert Risk still misses. We have built a version that addresses this issue (Multilingual Revert Risk), which relies on Large Language Models (the same family of models used by ChatGPT), but its serving time is still slow (median of 1 s). We are therefore working on making it faster, and on making the simpler model (called Language Agnostic Revert Risk) better at catching the types of vandalism that it currently misses.

To summarize: the ML and Research teams suggest using the aforementioned Revert Risk models instead of the revscoring editquality ones for any current or future project in the Wikimedia community. If you want to migrate to the new models, please create a task in Phabricator with the #Machine-Learning-Team project tag and we'll help you!

Where can I find revscoring model binaries?

The ML team collects all models deployed on Lift Wing in https://analytics.wikimedia.org/published/wmf-ml-models/

Guide to migrate from ORES to Lift Wing

See Machine_Learning/LiftWing/API/ORES_migration_guide.

Old ORES documentation

Manuals

Deployment guide


Incidents

Incident documentation/20150908-ores (labs)
Incident documentation/20151216-ores (labs)
Incident documentation/20160319-Ores (labs)
Incident documentation/20160610-ORES (prod, early days)
Incident documentation/20160620-ores (prod)
Incident documentation/20160801-ORES (prod)
Incident_documentation/20160924-ORES (prod)
Incident_documentation/20160925-ores (prod)
Incident_documentation/20161227-ores (prod)
Incident documentation/20171120-Ext:ORES


See also