Rostlab / nala

Text mining of natural language mutations mentions

Home Page:https://www.tagtog.net/-corpora/IDP4+


IDP4+_training vs nala_training (in CV)

abojchevski opened this issue

nala_training_51

fold 0

tp:135  fp:60   fn:55   fpo:34  fno:36  P:0.6923        R:0.7105        F:0.7013        0       exact
tp:30   fp:60   fn:65   fpo:45  fno:43  P:0.3333        R:0.3158        F:0.3243        1       exact
tp:13   fp:9    fn:18   fpo:7   fno:9   P:0.5909        R:0.4194        F:0.4906        2       exact
tp:178  fp:129  fn:138  fpo:86  fno:88  P:0.5798        R:0.5633        F:0.5714        TOTAL   exact

tp:135  fp:60   fn:55   fpo:34  fno:36  P:0.8874        R:0.9152        F:0.9011        0       overlapping
tp:30   fp:60   fn:65   fpo:45  fno:43  P:0.8872        R:0.8429        F:0.8645        1       overlapping
tp:13   fp:9    fn:18   fpo:7   fno:9   P:0.9355        R:0.7632        F:0.8406        2       overlapping
tp:178  fp:129  fn:138  fpo:86  fno:88  P:0.8911        R:0.8756        F:0.8833        TOTAL   overlapping

fold 1

tp:96   fp:30   fn:30   fpo:13  fno:13  P:0.7619        R:0.7619        F:0.7619        0       exact
tp:30   fp:84   fn:82   fpo:72  fno:60  P:0.2632        R:0.2679        F:0.2655        1       exact
tp:16   fp:19   fn:20   fpo:12  fno:12  P:0.4571        R:0.4444        F:0.4507        2       exact
tp:142  fp:133  fn:132  fpo:97  fno:85  P:0.5164        R:0.5182        F:0.5173        TOTAL   exact

tp:96   fp:30   fn:30   fpo:13  fno:13  P:0.8777        R:0.8777        F:0.8777        0       overlapping
tp:30   fp:84   fn:82   fpo:72  fno:60  P:0.9310        R:0.8804        F:0.9050        1       overlapping
tp:16   fp:19   fn:20   fpo:12  fno:12  P:0.8511        R:0.8333        F:0.8421        2       overlapping
tp:142  fp:133  fn:132  fpo:97  fno:85  P:0.9000        R:0.8733        F:0.8865        TOTAL   overlapping

fold 2

tp:125  fp:32   fn:38   fpo:15  fno:12  P:0.7962        R:0.7669        F:0.7812        0       exact
tp:25   fp:50   fn:77   fpo:42  fno:40  P:0.3333        R:0.2451        F:0.2825        1       exact
tp:7    fp:11   fn:17   fpo:9   fno:9   P:0.3889        R:0.2917        F:0.3333        2       exact
tp:157  fp:93   fn:132  fpo:66  fno:61  P:0.6280        R:0.5433        F:0.5826        TOTAL   exact

tp:125  fp:32   fn:38   fpo:15  fno:12  P:0.8994        R:0.8539        F:0.8761        0       overlapping
tp:25   fp:50   fn:77   fpo:42  fno:40  P:0.9304        R:0.7431        F:0.8263        1       overlapping
tp:7    fp:11   fn:17   fpo:9   fno:9   P:0.9259        R:0.7576        F:0.8333        2       overlapping
tp:157  fp:93   fn:132  fpo:66  fno:61  P:0.9132        R:0.8000        F:0.8529        TOTAL   overlapping

fold 3

tp:234  fp:46   fn:55   fpo:26  fno:26  P:0.8357        R:0.8097        F:0.8225        0       exact
tp:46   fp:76   fn:91   fpo:67  fno:59  P:0.3770        R:0.3358        F:0.3552        1       exact
tp:7    fp:12   fn:15   fpo:6   fno:7   P:0.3684        R:0.3182        F:0.3415        2       exact
tp:287  fp:134  fn:161  fpo:99  fno:92  P:0.6817        R:0.6406        F:0.6605        TOTAL   exact

tp:234  fp:46   fn:55   fpo:26  fno:26  P:0.9346        R:0.9079        F:0.9211        0       overlapping
tp:46   fp:76   fn:91   fpo:67  fno:59  P:0.9503        R:0.8431        F:0.8935        1       overlapping
tp:7    fp:12   fn:15   fpo:6   fno:7   P:0.7692        R:0.7143        F:0.7407        2       overlapping
tp:287  fp:134  fn:161  fpo:99  fno:92  P:0.9318        R:0.8739        F:0.9019        TOTAL   overlapping

fold 4

tp:183  fp:41   fn:60   fpo:25  fno:27  P:0.8170        R:0.7531        F:0.7837        0       exact
tp:53   fp:85   fn:95   fpo:75  fno:65  P:0.3841        R:0.3581        F:0.3706        1       exact
tp:19   fp:22   fn:31   fpo:16  fno:20  P:0.4634        R:0.3800        F:0.4176        2       exact
tp:255  fp:148  fn:186  fpo:116 fno:112 P:0.6328        R:0.5782        F:0.6043        TOTAL   exact

tp:183  fp:41   fn:60   fpo:25  fno:27  P:0.9363        R:0.8769        F:0.9056        0       overlapping
tp:53   fp:85   fn:95   fpo:75  fno:65  P:0.9507        R:0.8655        F:0.9061        1       overlapping
tp:19   fp:22   fn:31   fpo:16  fno:20  P:0.9016        R:0.8333        F:0.8661        2       overlapping
tp:255  fp:148  fn:186  fpo:116 fno:112 P:0.9379        R:0.8671        F:0.9011        TOTAL   overlapping

IDP4+ training

fold 0

tp:592  fp:148  fn:110  fpo:91  fno:61  P:0.8000        R:0.8433        F:0.8211        0       exact
tp:27   fp:89   fn:121  fpo:60  fno:56  P:0.2328        R:0.1824        F:0.2045        1       exact
tp:13   fp:21   fn:36   fpo:10  fno:11  P:0.3824        R:0.2653        F:0.3133        2       exact
tp:632  fp:258  fn:267  fpo:161 fno:128 P:0.7101        R:0.7030        F:0.7065        TOTAL   exact

tp:592  fp:148  fn:110  fpo:91  fno:61  P:0.9288        R:0.9382        F:0.9335        0       overlapping
tp:27   fp:89   fn:121  fpo:60  fno:56  P:0.8314        R:0.6875        F:0.7526        1       overlapping
tp:13   fp:21   fn:36   fpo:10  fno:11  P:0.7556        R:0.5763        F:0.6538        2       overlapping
tp:632  fp:258  fn:267  fpo:161 fno:128 P:0.9047        R:0.8689        F:0.8864        TOTAL   overlapping

fold 1

tp:615  fp:112  fn:166  fpo:73  fno:70  P:0.8459        R:0.7875        F:0.8156        0       exact
tp:35   fp:82   fn:105  fpo:59  fno:55  P:0.2991        R:0.2500        F:0.2724        1       exact
tp:12   fp:18   fn:26   fpo:9   fno:11  P:0.4000        R:0.3158        F:0.3529        2       exact
tp:662  fp:212  fn:297  fpo:141 fno:136 P:0.7574        R:0.6903        F:0.7223        TOTAL   exact

tp:615  fp:112  fn:166  fpo:73  fno:70  P:0.9511        R:0.8876        F:0.9182        0       overlapping
tp:35   fp:82   fn:105  fpo:59  fno:55  P:0.8663        R:0.7487        F:0.8032        1       overlapping
tp:12   fp:18   fn:26   fpo:9   fno:11  P:0.7805        R:0.6809        F:0.7273        2       overlapping
tp:662  fp:212  fn:297  fpo:141 fno:136 P:0.9297        R:0.8536        F:0.8900        TOTAL   overlapping

fold 2

tp:716  fp:111  fn:88   fpo:51  fno:56  P:0.8658        R:0.8905        F:0.8780        0       exact
tp:48   fp:93   fn:105  fpo:63  fno:57  P:0.3404        R:0.3137        F:0.3265        1       exact
tp:18   fp:33   fn:27   fpo:16  fno:17  P:0.3529        R:0.4000        F:0.3750        2       exact
tp:782  fp:237  fn:220  fpo:130 fno:130 P:0.7674        R:0.7804        F:0.7739        TOTAL   exact

tp:716  fp:111  fn:88   fpo:51  fno:56  P:0.9320        R:0.9626        F:0.9471        0       overlapping
tp:48   fp:93   fn:105  fpo:63  fno:57  P:0.8485        R:0.7778        F:0.8116        1       overlapping
tp:18   fp:33   fn:27   fpo:16  fno:17  P:0.7500        R:0.8361        F:0.7907        2       overlapping
tp:782  fp:237  fn:220  fpo:130 fno:130 P:0.9069        R:0.9205        F:0.9136        TOTAL   overlapping

fold 3

tp:1073 fp:307  fn:162  fpo:93  fno:95  P:0.7775        R:0.8688        F:0.8207        0       exact
tp:51   fp:113  fn:118  fpo:70  fno:66  P:0.3110        R:0.3018        F:0.3063        1       exact
tp:14   fp:37   fn:25   fpo:16  fno:15  P:0.2745        R:0.3590        F:0.3111        2       exact
tp:1138 fp:457  fn:305  fpo:179 fno:176 P:0.7135        R:0.7886        F:0.7492        TOTAL   exact

tp:1073 fp:307  fn:162  fpo:93  fno:95  P:0.8549        R:0.9495        F:0.8998        0       overlapping
tp:51   fp:113  fn:118  fpo:70  fno:66  P:0.8130        R:0.7824        F:0.7974        1       overlapping
tp:14   fp:37   fn:25   fpo:16  fno:15  P:0.6818        R:0.8182        F:0.7438        2       overlapping
tp:1138 fp:457  fn:305  fpo:179 fno:176 P:0.8430        R:0.9205        F:0.8800        TOTAL   overlapping

fold 4

tp:465  fp:65   fn:163  fpo:40  fno:43  P:0.8774        R:0.7404        F:0.8031        0       exact
tp:48   fp:74   fn:85   fpo:59  fno:52  P:0.3934        R:0.3609        F:0.3765        1       exact
tp:13   fp:21   fn:33   fpo:13  fno:15  P:0.3824        R:0.2826        F:0.3250        2       exact
tp:526  fp:160  fn:281  fpo:112 fno:110 P:0.7668        R:0.6518        F:0.7046        TOTAL   exact

tp:465  fp:65   fn:163  fpo:40  fno:43  P:0.9564        R:0.8204        F:0.8832        0       overlapping
tp:48   fp:74   fn:85   fpo:59  fno:52  P:0.9138        R:0.8281        F:0.8689        1       overlapping
tp:13   fp:21   fn:33   fpo:13  fno:15  P:0.8367        R:0.6949        F:0.7593        2       overlapping
tp:526  fp:160  fn:281  fpo:112 fno:110 P:0.9397        R:0.8139        F:0.8723        TOTAL   overlapping
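
The exact and overlapping rows above share the same counts and differ only in P/R/F. The sketch below shows formulas that reproduce the reported numbers from those counts. They were inferred from the rows in this issue, not taken from the nala source, and the reading of `fpo`/`fno` as overlapping false positives/negatives is an assumption. The `Counts` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int
    fp: int
    fn: int
    fpo: int  # assumed: false positives that overlap a gold annotation
    fno: int  # assumed: false negatives that overlap a prediction


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def exact(c: Counts) -> tuple[float, float, float]:
    """Exact matching: only true positives count as hits."""
    p = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    r = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    return p, r, f_measure(p, r)


def overlapping(c: Counts) -> tuple[float, float, float]:
    """Overlapping matching: partial overlaps are credited as hits.

    Formulas chosen because they reproduce the rows in this issue.
    """
    hits = c.tp + c.fpo + c.fno
    p_den = c.tp + c.fp + c.fno
    r_den = c.tp + c.fn + c.fpo
    p = hits / p_den if p_den else 0.0
    r = hits / r_den if r_den else 0.0
    return p, r, f_measure(p, r)


if __name__ == "__main__":
    # nala_training, fold 0, subclass 0
    row = Counts(tp=135, fp=60, fn=55, fpo=34, fno=36)
    for name, fn in (("exact", exact), ("overlapping", overlapping)):
        p, r, f = fn(row)
        print(f"P:{p:.4f} R:{r:.4f} F:{f:.4f} {name}")
```

Under these formulas, a prediction that only partially overlaps a gold mention counts against exact scores but in favour of overlapping scores, which is consistent with subclass 1 scoring low in exact mode and high in overlapping mode.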

I repeated the experiment with subclasses 1 and 2 deleted. IDP4+_training won: F:0.9097 vs. F:0.8934 for nala_training.

# wins P:0.9372 R:0.8535    F:0.8934 -- 10 421693.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus nala_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner parts --ps_random "0.0"
# 9 421692.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus nala_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner sentences --ps_random "0.0"
# 8 421691.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus nala_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner sentences --ps_ST --ps_random "0.0"
# 7 421690.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus nala_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner sentences --ps_NL --ps_random "0.0"
# 6 421689.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus nala_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner sentences --ps_ST --ps_NL --ps_random "0.0"

# wins P:0.9060 R:0.9135    F:0.9097 -- 5 421686.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus IDP4+_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner parts --ps_random "0.0"
# 4 421685.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus IDP4+_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner sentences --ps_random "0.0"
# 3 421684.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus IDP4+_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner sentences --ps_ST --ps_random "0.0"
# 2 421683.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus IDP4+_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner sentences --ps_NL --ps_random "0.0"
# 1 421682.1-5:1 time /mnt/home/cejuela/anaconda3/latest/bin/python /mnt/home/cejuela/nala/nala/scripts/train.py --training_corpus IDP4+_training --delete_subclasses "1,2" --cv_n 5 --cv_fold $cv_fold --pruner sentences --ps_ST --ps_NL --ps_random "0.0"

Note: these were run without window features or WE (word embeddings), for greater speed.
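
To compare the two corpora on the full-class runs as well, the per-fold TOTAL overlapping F scores from the tables above can be averaged per corpus. This is a minimal sketch assuming an unweighted mean over the five folds. The issue does not say how F:0.8934 and F:0.9097 were aggregated, so this is not necessarily the same procedure.

```python
from statistics import mean

# TOTAL overlapping F per fold (folds 0-4), copied from the tables above.
total_overlapping_f = {
    "nala_training": [0.8833, 0.8865, 0.8529, 0.9019, 0.9011],
    "IDP4+_training": [0.8864, 0.8900, 0.9136, 0.8800, 0.8723],
}

# Unweighted (macro) mean over folds; folds differ in size, so a
# count-weighted aggregate could give a slightly different picture.
means = {corpus: mean(scores) for corpus, scores in total_overlapping_f.items()}

for corpus, value in means.items():
    print(f"{corpus}: mean TOTAL overlapping F = {value:.4f}")
```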