Current Performance
abojchevski opened this issue
Aleksandar Bojchevski commented
- Used base + all iterations up to and including iteration 51
- 638 documents in total
- Subclass distribution: Counter({0: 4148, 1: 740, 2: 217})
- Stratified split 2/3 + 1/3 into: train: 397, test: 241
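To put the figures above in proportion, the class shares and the actual test fraction can be computed directly from the reported numbers. This is only a sketch: the variable names are made up for illustration, and it assumes the Counter holds per-subclass annotation counts, which the thread does not state explicitly.

```python
from collections import Counter

# Figures copied from the bullets above; what exactly is counted
# (presumably annotations per subclass) is an assumption.
subclass_counts = Counter({0: 4148, 1: 740, 2: 217})
train_docs, test_docs = 397, 241

total_count = sum(subclass_counts.values())
shares = {label: count / total_count for label, count in subclass_counts.items()}

for label, share in sorted(shares.items()):
    print(f"subclass {label}: {share:.1%}")

# Fraction of documents that ended up in the test set.
test_fraction = test_docs / (train_docs + test_docs)
print(f"test fraction: {test_fraction:.3f}")
```

Subclass 0 makes up roughly 81% of the counts, subclass 1 about 14%, and subclass 2 about 4%, so the TOTAL scores are dominated by subclass 0. The test fraction comes out at about 0.378, slightly more than one third.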
Performance:
SUBCLASS 0 p:0.8453 r:0.8079 f:0.8262 strictness:exact
SUBCLASS 1 p:0.3072 r:0.3333 f:0.3197 strictness:exact
SUBCLASS 2 p:0.3684 r:0.3925 f:0.3801 strictness:exact
TOTAL p:0.7463 r:0.7291 f:0.7376 strictness:exact
SUBCLASS 0 p:0.9368 r:0.8911 f:0.9134 strictness:overlapping
SUBCLASS 1 p:0.7543 r:0.7792 f:0.7665 strictness:overlapping
SUBCLASS 2 p:0.7153 r:0.7464 f:0.7305 strictness:overlapping
TOTAL p:0.8940 r:0.8661 f:0.8798 strictness:overlapping
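The gap between the exact and overlapping numbers is easier to read with a concrete definition of the two strictness modes. The sketch below is an assumption about how such an evaluator typically works, not the project's actual code: spans are hypothetical `(start, end, subclass)` tuples with an exclusive end, and a match is assumed to require the same subclass.

```python
def spans_match(pred, gold, strictness):
    """Return True if a predicted span counts as a hit for a gold span.

    Spans are (start, end, subclass) tuples with an exclusive end offset.
    """
    if pred[2] != gold[2]:
        return False
    if strictness == "exact":
        # Both boundaries must agree.
        return pred[0] == gold[0] and pred[1] == gold[1]
    if strictness == "overlapping":
        # Any shared character is enough.
        return pred[0] < gold[1] and gold[0] < pred[1]
    raise ValueError(f"unknown strictness: {strictness}")


def precision_recall_f(predicted, gold, strictness):
    """Compute precision, recall and F1 over two lists of spans."""
    hit_pred = sum(any(spans_match(p, g, strictness) for g in gold) for p in predicted)
    hit_gold = sum(any(spans_match(p, g, strictness) for p in predicted) for g in gold)
    precision = hit_pred / len(predicted) if predicted else 0.0
    recall = hit_gold / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, invented for illustration only.
gold = [(0, 5, 0), (10, 20, 1)]
predicted = [(0, 5, 0), (12, 20, 1), (30, 35, 2)]

exact = precision_recall_f(predicted, gold, "exact")
overlapping = precision_recall_f(predicted, gold, "overlapping")
```

In the toy data the second prediction misses the left boundary by two characters, so it counts as a hit only under overlapping strictness. The large exact-versus-overlapping gap for subclasses 1 and 2 above would be consistent with that kind of boundary error, though the thread itself does not analyse the cause.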
Aleksandar Bojchevski commented
Here is a detailed output of the predictions:
https://gist.github.com/abojchevski/8b7d27729e53db82640b
Aleksandar Bojchevski commented
Here are the top model features and transitions:
https://gist.github.com/abojchevski/5a251f6c08a3049aac2c
Dr. Juan Miguel Cejuela commented
👍
Aleksandar Bojchevski commented
Results including RegexNLFeatureGenerator (should be named deletion):
SUBCLASS 0 p:0.8433 r:0.8042 f:0.8233 strictness:exact
SUBCLASS 1 p:0.3010 r:0.3298 f:0.3147 strictness:exact
SUBCLASS 2 p:0.3860 r:0.4112 f:0.3982 strictness:exact
TOTAL p:0.7439 r:0.7265 f:0.7351 strictness:exact
SUBCLASS 0 p:0.9362 r:0.8887 f:0.9118 strictness:overlapping
SUBCLASS 1 p:0.7603 r:0.7870 f:0.7734 strictness:overlapping
SUBCLASS 2 p:0.7292 r:0.7609 f:0.7447 strictness:overlapping
TOTAL p:0.8950 r:0.8660 f:0.8803 strictness:overlapping
Aleksandar Bojchevski commented
Training with Elastic Net (L1 + L2) regularization.
With the new post-processing rule:
SUBCLASS 0 p:0.8318 r:0.8047 f:0.8180 strictness:exact
SUBCLASS 1 p:0.2809 r:0.3227 f:0.3003 strictness:exact
SUBCLASS 2 p:0.3670 r:0.3738 f:0.3704 strictness:exact
TOTAL p:0.7297 r:0.7243 f:0.7270 strictness:exact
SUBCLASS 0 p:0.9301 r:0.8932 f:0.9113 strictness:overlapping
SUBCLASS 1 p:0.7626 r:0.8127 f:0.7868 strictness:overlapping
SUBCLASS 2 p:0.7569 r:0.7730 f:0.7649 strictness:overlapping
TOTAL p:0.8915 r:0.8739 f:0.8826 strictness:overlapping
With the old post-processing rule:
SUBCLASS 0 p:0.8318 r:0.8047 f:0.8180 strictness:exact
SUBCLASS 1 p:0.2817 r:0.3227 f:0.3008 strictness:exact
SUBCLASS 2 p:0.3619 r:0.3551 f:0.3585 strictness:exact
TOTAL p:0.7305 r:0.7234 f:0.7269 strictness:exact
SUBCLASS 0 p:0.9301 r:0.8932 f:0.9113 strictness:overlapping
SUBCLASS 1 p:0.7615 r:0.8098 f:0.7849 strictness:overlapping
SUBCLASS 2 p:0.7643 r:0.7589 f:0.7616 strictness:overlapping
TOTAL p:0.8920 r:0.8727 f:0.8823 strictness:overlapping
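For reference, Elastic Net regularization adds both an L1 and an L2 penalty to the training loss. A generic form for a probabilistic labeller with weight vector w is sketched below. The symbols c_1 and c_2 and the exact scaling of each term are assumptions, since the thread names neither the toolkit nor the coefficient values.

```latex
\min_{w} \; -\sum_{i=1}^{N} \log p\left(y_i \mid x_i; w\right)
  \;+\; c_1 \lVert w \rVert_1
  \;+\; c_2 \lVert w \rVert_2^2
```

The L1 term pushes many feature weights to exactly zero, while the L2 term shrinks the remaining weights. Setting c_1 = 0 gives pure L2 regularization, and setting c_2 = 0 gives pure L1.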
Dr. Juan Miguel Cejuela commented
Nothing to do here. We will indeed use Elastic Net.