Add a tensorboard log for final training loss value.

Question

rnyak opened this issue 5 years ago · comments

Describe the bug
When I check the loss plot on TensorBoard, I see that the validation curve runs to a much higher step count than the training curve. See the screenshot below. Why does training end 1062 steps before validation? What's the logic behind this?

rnyak · Answer 1 · Thu Dec 19 2019 06:49:03 GMT+0800 (China Standard Time)

@benleetownsend any explanation on that? Thanks in advance.

benleetownsend · Answer 2 · Tue Dec 24 2019 19:22:51 GMT+0800 (China Standard Time)

We explicitly run validation on the final model regardless of val_interval. This is for the purpose of keep_best_model: otherwise the final set of training steps could be accidentally wasted. The training loss simply follows the default logging schedule, so its last logged point can fall well before the final step. If you want to track loss values near the end of training, you can change your val_interval so that a final value is logged near the end of training.
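To illustrate why the two curves can end at different steps, here is a minimal sketch. It is not the library's actual code: `logged_steps`, `force_final`, and the step counts are hypothetical, and it assumes both losses are logged on a fixed interval, with only validation forced to run at the final step.

```python
def logged_steps(total_steps, interval, force_final=False):
    """Return the steps at which a value is logged (hypothetical helper)."""
    # Regular schedule: log every `interval` steps.
    steps = list(range(interval, total_steps + 1, interval))
    # Optionally force one extra log at the very last step.
    if force_final and (not steps or steps[-1] != total_steps):
        steps.append(total_steps)
    return steps


total_steps = 10  # illustrative numbers only
train_steps = logged_steps(total_steps, interval=4)
val_steps = logged_steps(total_steps, interval=4, force_final=True)

print(train_steps)  # [4, 8]
print(val_steps)    # [4, 8, 10]
# Steps between the last training log and the last validation log.
print(val_steps[-1] - train_steps[-1])  # 2
```

In this sketch, an interval that divides the total number of steps evenly makes the regular schedule end exactly on the final step, so the gap disappears.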

I'm going to rename this issue to track the feature of adding a final-step loss log for training.

Hopefully this helps/answers your question.