tofti / robustrankorder

A Java implementation of the Robust Rank Order statistical significance test

robustrankorder

The Robust Rank Order (RRO) test is a non-parametric statistical significance test used as an alternative to the more widely known Wilcoxon-Mann-Whitney test. The Wilcoxon-Mann-Whitney test makes assumptions about the underlying distributions of the samples, namely that they are drawn from distributions with the same second, third (skewness), fourth (kurtosis), and higher-order moments.

In 2003 Nick Feltovich published Nonparametric Tests of Differences in Medians: Comparison of the Wilcoxon–Mann–Whitney and Robust Rank-Order Tests, highlighting the weaknesses of the Wilcoxon-Mann-Whitney test and presenting the RRO test as an alternative. Nick also published a table of significant values for the RRO test over a large range of sample sizes, thus making the test viable for many practitioners. Dave Cliff used the test in his ZIP60 paper and has advocated its use in agent-based computational economics experiments (Dave was my PhD examiner). I implemented this test and used it during my PhD, so I make the code and an explanation (my interpretation of the test) available here. I corresponded with Nick on the correctness of my implementation using some of his sample data, and it appears that my implementation is valid (the test is very straightforward once understood).

The basic idea behind the RRO test is that, for two samples, each observation has its rank computed against the other sample by counting the number of observations in the other sample that have lower values than the observation. The arithmetic mean and variance of these ranks can be computed for each sample (symmetrically for the two samples). The difference in the mean ranks across samples is then scaled by the sample sizes and the variance of the ranks to give a score. More formally, for two samples x of size m and y of size n, each observation xi (for i = 1 to m) has its rank in y denoted uxy. Similarly, uyx denotes the rank in x of each observation yj (for j = 1 to n). Thus the mean ranks of the observations of x in y, and of the observations of y in x, are

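$$\bar{u}_{xy} = \frac{1}{m}\sum_{i=1}^{m} u_{xy}(x_i), \qquad \bar{u}_{yx} = \frac{1}{n}\sum_{j=1}^{n} u_{yx}(y_j)$$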

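It follows that the variances of the ranks of x in y, and of y in x (taken as sums of squared deviations from the mean ranks, as in Feltovich's definition), are:

$$V_x = \sum_{i=1}^{m} \left(u_{xy}(x_i) - \bar{u}_{xy}\right)^2, \qquad V_y = \sum_{j=1}^{n} \left(u_{yx}(y_j) - \bar{u}_{yx}\right)^2$$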

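Finally, the RRO statistic can be computed:

$$\mathrm{rro} = \frac{m\,\bar{u}_{xy} - n\,\bar{u}_{yx}}{2\sqrt{V_x + V_y + \bar{u}_{xy}\,\bar{u}_{yx}}}.$$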
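The steps above can be sketched in Java as follows. This is a minimal illustration, not the code in this repository: the class name `RobustRankOrderSketch` and all method names are hypothetical. It assumes the statistic is the difference of the size-weighted mean ranks divided by twice the square root of the two sums of squared rank deviations plus the product of the mean ranks, and it counts only strictly lower values, so ties receive no special treatment.

```java
import java.util.Arrays;

/** Illustrative sketch only; names here are assumptions, not the repository's API. */
final class RobustRankOrderSketch {

    /** For each value in a, counts the observations in b that are strictly lower. */
    static int[] placements(double[] a, double[] b) {
        int[] counts = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            for (double v : b) {
                if (v < a[i]) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    /** Arithmetic mean of the rank counts (NaN for an empty sample). */
    static double mean(int[] counts) {
        return Arrays.stream(counts).average().orElse(Double.NaN);
    }

    /** Sum of squared deviations from the mean; not divided by the sample size. */
    static double sumSquaredDeviations(int[] counts, double mean) {
        double sum = 0.0;
        for (int c : counts) {
            double d = c - mean;
            sum += d * d;
        }
        return sum;
    }

    /** RRO statistic for samples x (size m) and y (size n). */
    static double statistic(double[] x, double[] y) {
        int[] uxy = placements(x, y); // ranks of x in y
        int[] uyx = placements(y, x); // ranks of y in x
        double meanXY = mean(uxy);
        double meanYX = mean(uyx);
        double vx = sumSquaredDeviations(uxy, meanXY);
        double vy = sumSquaredDeviations(uyx, meanYX);
        double numerator = x.length * meanXY - y.length * meanYX;
        // The denominator is zero when the samples do not overlap at all,
        // in which case the result is infinite or NaN.
        double denominator = 2.0 * Math.sqrt(vx + vy + meanXY * meanYX);
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Sample data quoted in this README for the unit test.
        double[] x = {5.025, 6.7, 6.725, 6.75, 7.05, 7.25, 8.375};
        double[] y = {4.875, 5.125, 5.225, 5.425, 5.55, 5.75, 5.925, 6.125};
        System.out.println("rro = " + statistic(x, y));
    }
}
```

Keeping the rank counting in its own method means the same routine serves both directions (x in y, and y in x), which mirrors the symmetry of the definition.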

Tables of significant values for rro can be found here. This program computes the values of uyx and uxy, their variances, and rro.

The unit test uses the sample data from the paper,

x = {5.025,6.7,6.725,6.75,7.05,7.25,8.375}
y = {4.875,5.125,5.225,5.425,5.55,5.75,5.925,6.125}

giving rro = 3, which can be checked against the worked example in Section 2.2 of Nick's paper. Feel free to contact me with any questions regarding this implementation.
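For readers who want to follow the arithmetic by hand, the sample data can be worked through as below. This is my own tally, not a reproduction of the paper's table, and it assumes the variance terms are sums of squared deviations and that only strictly lower values are counted. The ranks of x in y are 1, 8, 8, 8, 8, 8, 8 and the ranks of y in x are 0, 1, 1, 1, 1, 1, 1, 1, with m = 7 and n = 8.

$$
\begin{aligned}
\bar{u}_{xy} &= \tfrac{1}{7}(1 + 6 \cdot 8) = 7, &
\bar{u}_{yx} &= \tfrac{1}{8}(0 + 7 \cdot 1) = 0.875,\\
V_x &= (1-7)^2 + 6\,(8-7)^2 = 42, &
V_y &= (0-0.875)^2 + 7\,(1-0.875)^2 = 0.875,\\
\mathrm{rro} &= \frac{7 \cdot 7 - 8 \cdot 0.875}{2\sqrt{42 + 0.875 + 7 \cdot 0.875}}
 = \frac{42}{2\sqrt{49}} = 3.
\end{aligned}
$$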

About

License:GNU Lesser General Public License v3.0

