Note: in the following, X
is going to be 0 or 1.
- Fill the
Software dev Expertise
column incohort.202X.csv
with integer values between 0 and 2 - run
to make 4 groups. Change 4->N if you want N groups. Note: with less than 4 groups, we get institution/theme repetitions.
./make_groups.py 4 cohort.202X.csv
- A plot with diagnostics of the simulated annealing is going to be shown, and the teams.
- All the parameters
can be tweaked inside
make_groups.py
. - the cost function can be tweaked
in the
cost.py
file. There is a contribution due to pair interactions (which can be split between categorical and numerical dimensions) and contributions due to "global" imbalances.
All data in cohort.202X.csv
comes from http://cdt-aimlac.org/cdt-cohorts.html
and is thus publicly available,
except from the Software dev Expertise
column
which is random data.