Data Loading Logs
fabianbosshard opened this issue
Hello Yuan
Thanks for the great work and for open-sourcing everything!
I have a question about `traintest_mask.py`: shouldn't `end_time` be updated after the evaluation? I think the current implementation leads to a very large value of `per_sample_data_time` directly after the evaluation:
![image](https://private-user-images.githubusercontent.com/124302682/322417700-fcfe2957-6aab-493b-a731-1c1d5c531a68.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA1MjY3NjMsIm5iZiI6MTcyMDUyNjQ2MywicGF0aCI6Ii8xMjQzMDI2ODIvMzIyNDE3NzAwLWZjZmUyOTU3LTZhYWItNDkzYi1hNzMxLTFjMWQ1YzUzMWE2OC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzA5JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcwOVQxMjAxMDNaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1iMzc5MGMzZjhjYmU2MDhmY2VkMTA0NWIwNGUzZmRlYTg3NzA3NmI3ZDZjNDVkZTUzNTdkYTZjNmE5ZjdlY2I2JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.jMbZsbAKKRHJQ_5oXhRtyh5K7dhK2f__8DrfU4Wkf64)
At first I thought it was because of caching or something similar, but I think it is because the evaluation time is also included in `per_sample_data_time` in the first iteration after the evaluation.
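To illustrate the effect, here is a minimal sketch. It is not the actual code in `traintest_mask.py`: the function `measure_data_times`, the `FakeClock` helper, and the cost values are all hypothetical, and a fake clock replaces `time.time()` so the numbers are reproducible. It assumes `end_time` is set at the end of each training step and that evaluation runs afterwards, so the next data-time measurement also covers the evaluation unless `end_time` is reset.

```python
class FakeClock:
    """Deterministic stand-in for time.time() (hypothetical helper)."""

    def __init__(self):
        self.now = 0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def measure_data_times(num_steps, eval_every, reset_after_eval, clock,
                       load_cost=1, eval_cost=100):
    """Return the data-loading time recorded at each training step."""
    data_times = []
    end_time = clock()
    for step in range(1, num_steps + 1):
        clock.advance(load_cost)               # simulate waiting for the next batch
        data_times.append(clock() - end_time)  # time since the previous step ended
        # ... forward/backward pass would go here ...
        end_time = clock()
        if step % eval_every == 0:
            clock.advance(eval_cost)           # simulate running the evaluation
            if reset_after_eval:
                end_time = clock()             # keep evaluation out of the next measurement
    return data_times


# Without the reset, the step right after each evaluation absorbs the evaluation time.
print(measure_data_times(4, 2, reset_after_eval=False, clock=FakeClock()))  # [1, 1, 101, 1]
# With the reset, every step reports only the data-loading time.
print(measure_data_times(4, 2, reset_after_eval=True, clock=FakeClock()))   # [1, 1, 1, 1]
```

In this sketch the only difference between the two runs is the single `end_time = clock()` after the evaluation, which matches the question of whether `end_time` should be updated there.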
Best Regards,
Fabian
hi Fabian,
Thanks for pointing this out. It might be a bug, but it does not impact the performance, right? I only use this to check for I/O and network bottlenecks.
-Yuan
Thank you for the quick response. No, don't worry, it has no impact on performance; the variable is only used to check for I/O bottlenecks, as you said!