bouchard / thumbs_up

Dead-Simple Vote and Karma Management

Feature Request: Wilson Score method

dleatham opened this issue · comments

I've noticed that sorting votable records with 'plusminus' often gives strange results, depending on how many votes one record has relative to another. Posts with a lower percentage of positive votes can be ranked higher simply because they have been around longer and have a larger pool of votes. I saw an article that clearly describes this problem and offers a solution to it.

http://evanmiller.org/how-not-to-sort-by-average-rating.html

He bases the answer on a mathematical calculation called a "Wilson Score". The Ruby code he provides with explanation is as follows:

require 'statistics2'

def ci_lower_bound(pos, n, confidence)
    if n == 0
        return 0
    end
    z = Statistics2.pnormaldist(1-(1-confidence)/2)
    phat = 1.0*pos/n
    (phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
end

pos is the number of positive ratings, n is the total number of ratings, and confidence refers to the statistical confidence level: pick 0.95 to have a 95% chance that your lower bound is correct, 0.975 to have a 97.5% chance, etc. The z-score in this function never changes, so if you don't have a statistics package handy or if performance is an issue you can always hard-code a value here for z. (Use 1.96 for a confidence level of 0.95.)
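To make the difference concrete, here is a minimal sketch of the same formula with z hard-coded to 1.96, so it does not need the statistics2 gem. The method name wilson_lower_bound and the vote counts are made up for illustration; this is not the code that ships in thumbs_up.

```ruby
# Hypothetical helper: lower bound of the Wilson score interval with
# z fixed at 1.96 (95% confidence), so no statistics package is needed.
def wilson_lower_bound(pos, n, z = 1.96)
  return 0.0 if n == 0

  phat = pos.to_f / n
  (phat + z * z / (2 * n) -
    z * Math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)) / (1 + z * z / n)
end

# Two made-up posts: an older one with many votes, a newer one with few.
older = { up: 600, down: 400 }  # 60% positive
newer = { up: 90,  down: 10 }   # 90% positive

# Plain "plus minus minus" scoring favours the older post.
older_plusminus = older[:up] - older[:down]  # 200
newer_plusminus = newer[:up] - newer[:down]  # 80

# The Wilson lower bound favours the newer, better-rated post.
older_wilson = wilson_lower_bound(older[:up], older[:up] + older[:down])  # about 0.57
newer_wilson = wilson_lower_bound(newer[:up], newer[:up] + newer[:down])  # about 0.83

puts "plusminus: older=#{older_plusminus} newer=#{newer_plusminus}"
puts "wilson:    older=#{older_wilson.round(3)} newer=#{newer_wilson.round(3)}"
```

With these numbers the older post wins on raw plus-minus (200 vs. 80) even though only 60% of its votes are positive, while the Wilson lower bound ranks the 90%-positive post first.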

This would be a nice addition to plusminus as a way to accurately measure popularity. I would volunteer, but I'm a part-time amateur at best, and no one (including me) would want my code going into production. (I'm a business person who does very basic MVPs.)

Already in the codebase, just hasn't been pushed yet ;)

Pushed and released as 0.5.4... please feel free to edit the README and send a pull request if you have a chance!