Optuna hyperparameter optimization for NER task with knowledge distillation
Venkatesh3132003 opened this issue · comments
Information
The problem arises in chapter:
- Making Transformers Efficient in Production
Describe the bug
When training, I get a reasonable F1 score of 0.755940.
When searching for the best alpha and temperature values for the NER task, the F1 score is 0.096029, which is below 0.1.
To Reproduce
Steps to reproduce the behavior:
1. The metric computation is the same as in chapter 4 (NER).
2. The hyperparameters searched are alpha and temperature:
def hp_space(trial):
    return {"alpha": trial.suggest_float("alpha", 0, 1),
            "temperature": trial.suggest_int("temperature", 2, 20)}

best_run = distil_roberta_trainer.hyperparameter_search(
    n_trials=12, direction="maximize", backend="optuna", hp_space=hp_space)
Expected behavior
After the hyperparameter search, the F1 score should be higher than the baseline.
When alpha is 1 the F1 score is good, but for any value of alpha between 0 and 1 the F1 score is below 0.1.
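
For reference, here is a minimal, framework-free sketch of how alpha and temperature usually enter a distillation loss, assuming the common form alpha * CE + (1 - alpha) * T^2 * KL(teacher || student) averaged over tokens. It is not the trainer code from this issue: the names `distillation_loss`, `log_softmax` and `IGNORE_INDEX` are illustrative, and the use of -100 to mark skipped tokens is an assumption based on the usual token-classification convention. Under this form, alpha = 1 switches the distillation term off entirely, so a good score at alpha = 1 and a bad score everywhere else would point at the distillation term (for example, how it is reduced over tokens or whether padded positions are excluded) rather than at the search itself.

```python
import math

# Assumption: -100 marks tokens (padding, non-first subwords) that carry no label.
IGNORE_INDEX = -100


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of one token's logits, scaled by 1/temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def distillation_loss(student_logits, teacher_logits, labels, alpha, temperature):
    """Token-averaged alpha * CE + (1 - alpha) * T^2 * KL(teacher || student).

    student_logits / teacher_logits: one list of logits per token.
    labels: one int per token; tokens labelled IGNORE_INDEX are skipped
    in BOTH the cross-entropy and the distillation term.
    """
    ce_total, kd_total, n_tokens = 0.0, 0.0, 0
    for s, t, y in zip(student_logits, teacher_logits, labels):
        if y == IGNORE_INDEX:
            continue
        # Hard-label cross-entropy uses the unscaled student logits.
        ce_total += -log_softmax(s)[y]
        # Soft-label term compares temperature-softened distributions.
        log_p_s = log_softmax(s, temperature)
        log_p_t = log_softmax(t, temperature)
        kd_total += sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_p_t, log_p_s))
        n_tokens += 1
    if n_tokens == 0:
        return 0.0
    ce = ce_total / n_tokens
    kd = temperature ** 2 * kd_total / n_tokens
    return alpha * ce + (1.0 - alpha) * kd


# Toy example: 3 tokens, 3 classes, last token is ignored.
student = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3], [5.0, -5.0, 0.0]]
teacher = [[3.0, 0.0, -2.0], [0.0, 1.5, 0.0], [0.0, 0.0, 9.0]]
labels = [0, 1, IGNORE_INDEX]

ce_only = distillation_loss(student, teacher, labels, alpha=1.0, temperature=4)
mixed = distillation_loss(student, teacher, labels, alpha=0.5, temperature=4)
print(ce_only, mixed)
```

With alpha = 1.0 the result does not depend on the temperature or on the teacher at all, which matches the observation that only alpha = 1 trains well. A quick check in a real trainer would be to log the two terms separately for one batch and compare their magnitudes.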