vivekjoshy / openskill.py

Multiplayer Rating System. No Friction.

Home Page: https://openskill.me

Determining Convergence Criteria for Bradley-Terry Model

sarim-zafar opened this issue

Is your feature request related to a problem? Please describe.
Currently, it's not explicitly clear how to determine convergence in the Bradley-Terry model implemented in openskill.py. Users might struggle to ascertain whether the model has converged, leading to uncertainty in the validity of the results.

Describe the solution you'd like
I propose adding documentation or guidance on determining convergence criteria for the Bradley-Terry model in openskill.py. This could include recommended thresholds or methods for assessing convergence, such as examining parameter estimates or likelihood changes over iterations.

Describe alternatives you've considered
One alternative is leaving the determination of convergence criteria to individual users, which could lead to inconsistency and confusion. Another option is relying solely on default convergence settings, but this might not be suitable for all use cases and datasets.

Additional context
Convergence is a crucial aspect of model fitting, particularly in iterative algorithms like those used in the Bradley-Terry model. Providing clear guidelines on determining convergence will enhance the usability and reliability of openskill.py for researchers and practitioners utilizing the Bradley-Terry model for skill estimation.

There is no model fitting on data here as there is in traditional statistical or machine learning problems, so there are no convergence criteria to meet either. The model implemented here uses an approximation of a generalized Bradley-Terry model with variance parameters. Each rating update is therefore analytic, finite, and closed form: it is computed once per match rather than iterated until it converges.

For more details, see this paper.
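To illustrate what "closed form" means here, below is a minimal sketch of a single two-player update. It assumes the Bradley-Terry full-pairing rule from Weng and Lin's "A Bayesian Approximation Method for Online Ranking" (2011), on which openskill is based, and commonly used defaults (`beta = 25/6`, `kappa = 1e-4`). The function name `bt_update_1v1` is hypothetical and this is not the library's own code. The point is that the update is one pass of arithmetic with no loop and no stopping rule.

```python
import math


def bt_update_1v1(mu_w, sigma_w, mu_l, sigma_l, beta=25 / 6, kappa=1e-4):
    """One closed-form Bradley-Terry (full pairing) update for a 1v1 match.

    Hypothetical helper: assumes the Weng-Lin update rule and default constants.
    Returns ((mu, sigma) of the winner, (mu, sigma) of the loser).
    """
    # Combined uncertainty of the pairing.
    c = math.sqrt(sigma_w**2 + sigma_l**2 + 2 * beta**2)
    # Predicted probability that the eventual winner wins, and its complement.
    p_w = 1 / (1 + math.exp((mu_l - mu_w) / c))
    p_l = 1 - p_w

    def update(mu, sigma, p_self, p_other, score):
        # Mean moves in proportion to how surprising the result was.
        delta = (sigma**2 / c) * (score - p_self)
        # Variance shrinks by a bounded factor; kappa keeps it positive.
        gamma = sigma / c
        eta = gamma * (sigma / c) ** 2 * p_self * p_other
        new_sigma = sigma * math.sqrt(max(1 - eta, kappa))
        return mu + delta, new_sigma

    winner = update(mu_w, sigma_w, p_w, p_l, 1.0)
    loser = update(mu_l, sigma_l, p_l, p_w, 0.0)
    return winner, loser


# Two new players with the assumed default rating (mu=25, sigma=25/3).
winner, loser = bt_update_1v1(25.0, 25 / 3, 25.0, 25 / 3)
```

After one match the winner's mean rises, the loser's falls by the same amount, and both sigmas shrink. Nothing is iterated, which is why a convergence criterion in the model-fitting sense does not apply.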

Does this answer your question?

My knowledge in this area is rudimentary, hence the question. DOTA uses the Glicko model but also has a confidence score, so I was wondering whether there is a way to determine something similar with these models. Otherwise, I was thinking of assessing it by checking whether the variance parameter has shrunk to a certain value, or by looking at the rate of change in the ordinal value.
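The heuristic described above could be sketched roughly as follows. This is only an illustration: the function names and every threshold (`sigma_max`, `window`, `ordinal_tol`) are hypothetical, and it assumes the ordinal is computed as `mu - 3 * sigma`. It works on a plain list of `(mu, sigma)` pairs recorded after each match, so it does not depend on any particular library call.

```python
def ordinal(mu, sigma, z=3.0):
    """Conservative skill estimate; assumes the mu - z * sigma convention."""
    return mu - z * sigma


def has_stabilized(history, sigma_max=3.0, window=5, ordinal_tol=0.5):
    """Heuristic check that a player's rating has settled.

    history: list of (mu, sigma) tuples, oldest first, one per match played.
    All thresholds are illustrative assumptions, not recommended values.
    """
    # Need at least window + 1 snapshots to measure `window` changes.
    if len(history) < window + 1:
        return False
    # Condition 1: the latest sigma has shrunk below the chosen ceiling.
    _, sigma = history[-1]
    if sigma > sigma_max:
        return False
    # Condition 2: the ordinal moved by at most ordinal_tol in each recent step.
    recent = [ordinal(m, s) for m, s in history[-(window + 1):]]
    steps = [abs(b - a) for a, b in zip(recent, recent[1:])]
    return max(steps) <= ordinal_tol


# A made-up rating history: sigma shrinks and the ordinal barely moves.
settled = [(30.0, 2.5), (30.1, 2.48), (30.0, 2.46), (30.1, 2.44), (30.0, 2.42), (30.1, 2.4)]
print(has_stabilized(settled))
```

Because the ratings are relative to the pool of players, any such thresholds would need to be tuned for the specific population and match volume rather than treated as universal constants.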

Not sure what you mean exactly by a confidence score? Can you provide an example of what it would look like?

"I was thinking of assessing it by looking at the variance parameter squeezing down to a certain value or rate of change in the ordinal value."

What is your use case? These ratings are relative, so whether a variance threshold is what you're looking for depends on the use case.