nlp-with-transformers / notebooks

Jupyter notebooks for the Natural Language Processing with Transformers book

Home Page: https://transformersbook.com/

Optuna hyperparameter optimization for the NER task with knowledge distillation

Venkatesh3132003 opened this issue

Information

The problem arises in chapter:

  • Making Transformers Efficient in Production

Describe the bug

While training, I get the expected F1 score of 0.755940.

While searching for the best values of alpha and temperature for the NER task, the F1 score drops to 0.096029, i.e. below 0.1.

To Reproduce

Steps to reproduce the behavior:

1. The compute_metrics function is the same as the one used for NER in Chapter 4 (a sketch is included after the code below).
2. The hyperparameter space covers alpha and temperature:

def hp_space(trial):
    return {"alpha": trial.suggest_float("alpha", 0, 1),
            "temperature": trial.suggest_int("temperature", 2, 20)}

best_run = distil_roberta_trainer.hyperparameter_search(
    n_trials=12, direction="maximize", backend="optuna", hp_space=hp_space)
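
For reference, here is a minimal sketch of the Chapter 4-style NER metric I am using. It assumes seqeval is installed and that index2tag maps label ids to tag strings, as in the book's notebook:

import numpy as np
from seqeval.metrics import f1_score

def align_predictions(predictions, label_ids):
    # Token-classification logits have shape (batch, seq_len, num_labels)
    preds = np.argmax(predictions, axis=2)
    batch_size, seq_len = preds.shape
    labels_list, preds_list = [], []
    for b in range(batch_size):
        example_labels, example_preds = [], []
        for s in range(seq_len):
            # Skip positions masked with -100 (padding and subword pieces)
            if label_ids[b, s] != -100:
                example_labels.append(index2tag[label_ids[b][s]])
                example_preds.append(index2tag[preds[b][s]])
        labels_list.append(example_labels)
        preds_list.append(example_preds)
    return preds_list, labels_list

def compute_metrics(eval_pred):
    y_pred, y_true = align_predictions(eval_pred.predictions, eval_pred.label_ids)
    return {"f1": f1_score(y_true, y_pred)}

Since compute_metrics returns only f1, the default objective that hyperparameter_search maximizes (the sum of the evaluation metrics) should be equivalent to eval_f1 here.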

Expected behavior

After the hyperparameter search, the F1 score should be higher than the baseline.

When alpha is 1 the F1 score is good, but for any value of alpha strictly between 0 and 1 the F1 score drops below 0.1.
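
For context, here is a minimal sketch of the distillation loss in the style of the book's DistillationTrainer from the efficiency chapter; the class and attribute names follow the book's notebook, and applying it to token classification is my adaptation. It makes the reported pattern easier to see: with alpha = 1 the KL term is multiplied by (1 - alpha) = 0, so training reduces to the plain cross-entropy baseline.

import torch
import torch.nn.functional as F
from torch import nn
from transformers import Trainer, TrainingArguments

class DistillationTrainingArguments(TrainingArguments):
    def __init__(self, *args, alpha=0.5, temperature=2.0, **kwargs):
        super().__init__(*args, **kwargs)
        self.alpha = alpha
        self.temperature = temperature

class DistillationTrainer(Trainer):
    def __init__(self, *args, teacher_model=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.teacher_model = teacher_model

    def compute_loss(self, model, inputs, return_outputs=False):
        outputs_stu = model(**inputs)
        loss_ce = outputs_stu.loss  # token-level cross-entropy from the student
        with torch.no_grad():
            logits_tea = self.teacher_model(**inputs).logits
        # Soften both distributions with temperature T; the T**2 factor keeps
        # gradient magnitudes comparable across temperatures
        T = self.args.temperature
        loss_kd = T ** 2 * nn.KLDivLoss(reduction="batchmean")(
            F.log_softmax(outputs_stu.logits / T, dim=-1),
            F.softmax(logits_tea / T, dim=-1))
        # alpha = 1 -> pure cross-entropy (baseline); alpha = 0 -> pure distillation
        loss = self.args.alpha * loss_ce + (1.0 - self.args.alpha) * loss_kd
        return (loss, outputs_stu) if return_outputs else loss

During the search, hyperparameter_search writes each sampled value onto trainer.args, which is why the hp_space keys must match the alpha and temperature attribute names. One thing worth checking: for token classification the logits cover every position in the sequence, including the padded and subword positions that compute_metrics skips via the -100 labels, so if the KL term averages over those positions it can swamp the cross-entropy term whenever alpha < 1.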