This is an academic project to detect author gender from Persian corpora.
To generate features and train a model, run:
> python train.py --verbose --output train
To test the model, run:
> python test.py --verbose --output test
To run the K-fold test, run:
> python test_k_fold.py --verbose --output test
Generating/Loading Features ...
Normalizing Features ...
337 features selected and saved in ../data/selected_features.npy.
K fold results:
| index | Fold1 | Fold2 | Fold3 | Fold4 | Fold5 | Fold6 | Fold7 | Fold8 | Fold9 | Fold10 | mean |
|:----------|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|---------:|
| precision | 0.52 | 0.75 | 0.647059 | 0.631579 | 0.590909 | 0.590909 | 0.666667 | 0.833333 | 0.526316 | 0.705882 | 0.646265 |
| recall | 0.684211 | 0.705882 | 0.647059 | 0.666667 | 0.619048 | 0.619048 | 0.608696 | 0.681818 | 0.5 | 0.6 | 0.633243 |
| f1 | 0.590909 | 0.727273 | 0.647059 | 0.648649 | 0.604651 | 0.604651 | 0.636364 | 0.75 | 0.512821 | 0.648649 | 0.637102 |
| accuracy | 0.526316 | 0.763158 | 0.684211 | 0.657895 | 0.552632 | 0.552632 | 0.578947 | 0.736842 | 0.5 | 0.657895 | 0.621053 |
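
The `mean` column and the `f1` row can be cross-checked from the other numbers in the table. The sketch below is not part of the project's scripts; it only copies the precision and recall rows from the table above, and assumes that `f1` is the usual harmonic mean of precision and recall and that `mean` is a plain average over the ten folds. The variable and function names are illustrative.

```python
from statistics import mean

# Per-fold values copied from the K-fold table above (Fold1..Fold10).
precision = [0.52, 0.75, 0.647059, 0.631579, 0.590909,
             0.590909, 0.666667, 0.833333, 0.526316, 0.705882]
recall = [0.684211, 0.705882, 0.647059, 0.666667, 0.619048,
          0.619048, 0.608696, 0.681818, 0.5, 0.6]


def f1_score(p, r):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)


# Recompute the f1 row fold by fold.
f1 = [f1_score(p, r) for p, r in zip(precision, recall)]

# Recompute the "mean" column as a plain average over the folds.
mean_precision = mean(precision)
mean_recall = mean(recall)
mean_f1 = mean(f1)

print("f1 per fold:", [round(x, 6) for x in f1])
print("mean precision:", round(mean_precision, 6))
print("mean recall:", round(mean_recall, 6))
print("mean f1:", round(mean_f1, 6))
```

Small differences in the last digit are expected, because the inputs are already rounded to six decimal places.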