huggingface trainer hook calls task.close() prematurely
nkgrush opened this issue
Describe the bug
The Hugging Face Trainer class is integrated with ClearML. When trainer.train() finishes successfully, the trainer's ClearML hook calls task.close(), which makes the original ClearML task unavailable. I am referring to this line specifically (permalink).
To reproduce
from clearml import Task
from trl import SFTTrainer  # SFTTrainer is a transformers.Trainer subclass from trl

task = Task.init(
    project_name='project',
    task_name='task',
)
...
model = ...
dataset = ...
...
trainer_args = ...
trainer = SFTTrainer(
    model,
    train_dataset=dataset,
    args=trainer_args,
)
print(task.status)  # Running
trainer.train()
print(task.status)  # Completed
# now the task object is dead for most purposes
Expected behaviour
The main task should not be closed (making it unavailable) after the training is finished. This is especially important if there are multiple trainer runs or any custom actions are taken after training.
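To illustrate the expected behaviour, here is a minimal sketch of an "only close what you created" hook. It is not the ClearML or transformers API: FakeTask, TrackingCallback, and run_training are hypothetical names made up for this example, and the real integration may track ownership differently.

```python
class FakeTask:
    """Stand-in for an experiment-tracking task (hypothetical, not the ClearML API)."""

    current = None  # the globally active task, if any

    def __init__(self, name):
        self.name = name
        self.status = "running"

    @classmethod
    def init(cls, name):
        # Create a task and register it as the active one.
        cls.current = cls(name)
        return cls.current

    def close(self):
        self.status = "completed"
        if FakeTask.current is self:
            FakeTask.current = None


class TrackingCallback:
    """Hypothetical trainer hook that closes only the tasks it created itself."""

    def __init__(self):
        self.task = None
        self.owns_task = False

    def on_train_begin(self):
        if FakeTask.current is not None:
            # Reuse the caller's task and remember that we do not own it.
            self.task = FakeTask.current
            self.owns_task = False
        else:
            # No task exists yet, so the hook creates (and owns) one.
            self.task = FakeTask.init("auto-created")
            self.owns_task = True

    def on_train_end(self):
        # Leave externally created tasks open for the caller to manage.
        if self.owns_task:
            self.task.close()


def run_training(callback):
    callback.on_train_begin()
    # ... the training loop would run here ...
    callback.on_train_end()
```

With this pattern, a task created by the user before training stays open across several trainer runs, while a task the hook had to create on its own is still closed when training ends.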
Environment
Independent
Hi @nkgrush! We have submitted a PR to Hugging Face related to this issue: huggingface/transformers#26614