eric-moreno / IN

Interaction Graph Network for Hbb

Home Page: https://arxiv.org/abs/1909.12285

The accuracy of random forest is over 0.99

mk123qwe opened this issue:

I fit a simple random forest model, like this:
from sklearn.ensemble import RandomForestClassifier
RandomForestClassifier(n_estimators=10, random_state=2019)

Input features: the high-level features from TABLE IV of your paper.

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 16 out of 16 | elapsed: 3.2min finished
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 16 out of 16 | elapsed: 1.2s finished
Validation Accuracy: 0.996

How much of the training data did you use?

I tried using just 1 file (~20k events, so admittedly it might not be enough) and only got up to ~80% test accuracy. On the other hand, if I evaluate on the training data, the accuracy is >99%.
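To illustrate the gap I mean, here is a minimal sketch that scores the same kind of random forest on the events it was trained on and on a held-out split. It is only an assumption-laden toy: the data come from scikit-learn's `make_classification` with deliberately noisy labels, not from the Hbb files, and the split fraction and variable names are made up for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the high-level features (assumption: NOT the Hbb data).
# flip_y randomizes a fraction of the labels so the classes overlap.
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    flip_y=0.2,
    random_state=0,
)

# Hold out 25% of the events; the forest never sees them during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same hyperparameters as in the issue.
clf = RandomForestClassifier(n_estimators=10, random_state=2019)
clf.fit(X_train, y_train)

# Accuracy on the events used for fitting vs. accuracy on unseen events.
train_acc = accuracy_score(y_train, clf.predict(X_train))
test_acc = accuracy_score(y_test, clf.predict(X_test))
print(f"accuracy on training events: {train_acc:.3f}")
print(f"accuracy on held-out events: {test_acc:.3f}")
```

Fully grown trees can largely memorize their training events, so the first number tends to sit close to 1 regardless of how well the model generalizes. Only the second number is comparable to a validation or test accuracy.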

How is the validation accuracy defined here?

My code here: https://github.com/jmduarte/HiggsToBBMachineLearning/blob/randomforest/train.ipynb
Binder link: https://mybinder.org/v2/gh/jmduarte/HiggsToBBMachineLearning/randomforest?filepath=train.ipynb

Thanks,
Javier