Run lumi eval on imported checkpoint
samkit-jain opened this issue
How do I specify the checkpoint to use in lumi eval?
Background: I trained on one machine, exported the checkpoint there, and imported it on a different machine. Now I want to run lumi eval for the imported checkpoint on the machine where it was imported. I can extract the contents of the exported tar file into the jobs folder and then run the eval. Is there a better way, where I just specify the checkpoint in the command itself?
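For reference, the manual workaround described above (unpacking the exported tar file into the jobs folder) could be sketched roughly as follows. This is only an illustration using Python's standard library: the function name `extract_checkpoint` and the paths in the usage comment are hypothetical placeholders, and the internal layout of the exported archive is not assumed.

```python
import os
import tarfile


def extract_checkpoint(tar_path, jobs_dir):
    """Extract an exported checkpoint archive into jobs_dir.

    Returns the names of the archive members that were extracted.
    Both arguments are placeholders; use the paths from your own setup.
    """
    os.makedirs(jobs_dir, exist_ok=True)
    root = os.path.realpath(jobs_dir)
    with tarfile.open(tar_path) as archive:
        members = archive.getmembers()
        for member in members:
            # Refuse entries that would be written outside jobs_dir.
            target = os.path.realpath(os.path.join(root, member.name))
            if os.path.commonpath([root, target]) != root:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(root, members=members)
    return [member.name for member in members]


# Hypothetical usage; both paths are examples only:
# extract_checkpoint("exported_checkpoint.tar", "jobs")
```

After extraction, lumi eval would be run as usual against a config whose job directory points at the extracted files. That step is left out here because the exact settings depend on the training config.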
There doesn't seem to be such an option; see what lumi eval --help tells you. I would also like to see better direct interaction with the TensorFlow checkpoints here. Even better would be an option to run the evaluation periodically during training (hot, without stopping training first).