finetuning on v4-32 dies suddenly
shankerabhigyan opened this issue
Abhigyan Shanker commented
Fine-tuning Gemma dies suddenly after about 12 hours. There are no warnings or error messages in the output logs; the process is simply killed.
The script from https://ai.google.dev/gemma/docs/distributed_tuning was run under nohup.
What debugging steps would you suggest, or is this a server-side problem?
I saw the same behaviour on v3-8 devices.
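One possible debugging step, as a minimal sketch and not something taken from the tuning guide: log the host process's peak memory at regular intervals and record any SIGTERM before exiting, so the log shows what happened just before the process disappears. The helper names `peak_rss_mib`, `log_memory` and `install_sigterm_logging` are hypothetical, only the Python standard library is used, and the unit conversion assumes a Linux host, where `ru_maxrss` is reported in KiB. SIGHUP is deliberately left alone, because nohup already ignores it.

```python
import logging
import resource
import signal
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of this process in MiB (assumes Linux: ru_maxrss is in KiB)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def log_memory(logger: logging.Logger, step: int) -> str:
    """Log and return one line with the current step and peak host memory."""
    msg = "step=%d peak_rss_mib=%.1f" % (step, peak_rss_mib())
    logger.info(msg)
    return msg


def install_sigterm_logging(logger: logging.Logger):
    """Log SIGTERM before exiting. Must be called from the main thread."""

    def handler(signum, frame):
        logger.error("received %s, peak RSS %.1f MiB",
                     signal.Signals(signum).name, peak_rss_mib())
        sys.exit(128 + signum)

    signal.signal(signal.SIGTERM, handler)
    return handler
```

Calling `install_sigterm_logging(logger)` once at startup and `log_memory(logger, step)` every few hundred training steps should be enough. If the process is ended with SIGKILL (for example by the Linux out-of-memory killer), no handler can run, so the last logged memory line is the only trace in the output log. A steadily growing value there would point to host memory rather than a server-side problem.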