pytorch / kineto

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

Visualizing PyTorch-Emitted NVTX Markers with TensorBoard Profiler

jmoork opened this issue · comments

Important PyTorch-specific markers are generated during the training loop, either through NVTX or by other means from PyTorch Lightning.

It would be useful to show a timeline view with the execution time of the different PyTorch training sections of the code, as shown in the attached image. The current trace view is quite detailed and goes one step further, down to individual CUDA kernels and similar events. A PyTorch-execution-specific trace view would be easier to understand and would make it simpler to find synchronization or communication bottlenecks in the training loop. It would also make it possible to compare the execution time of different regions of the training/validation code (such as forward, loss, and data loading).
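Until such a view exists, a rough per-section comparison can be approximated outside the UI. The sketch below is an illustration, not part of Kineto or the TensorBoard plugin: it assumes a Chrome-trace JSON (for example, one written by `torch.profiler`'s `export_chrome_trace`) in which regions annotated with `torch.profiler.record_function` appear as complete events (`"ph": "X"`) with the category `user_annotation`. The category name, the helper `summarize_sections`, and the sample events are assumptions made for the example; check them against a real trace before relying on the numbers.

```python
from collections import defaultdict


def summarize_sections(trace, categories=("user_annotation",)):
    """Return {section_name: (call_count, total_ms)} for annotated regions.

    `trace` is a dict in Chrome trace event format. The default category
    is an assumption about how annotated regions are labelled in the trace.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for event in trace.get("traceEvents", []):
        # Only "complete" events carry both a start time and a duration.
        if event.get("ph") != "X":
            continue
        # Skip low-level events such as kernels and memcpys.
        if event.get("cat") not in categories:
            continue
        entry = totals[event["name"]]
        entry[0] += 1
        entry[1] += event.get("dur", 0) / 1000.0  # microseconds -> milliseconds
    return {name: (count, ms) for name, (count, ms) in totals.items()}


# Hypothetical sample data standing in for a real exported trace.
sample_trace = {
    "traceEvents": [
        {"ph": "X", "cat": "user_annotation", "name": "data_loading", "ts": 0, "dur": 4000},
        {"ph": "X", "cat": "user_annotation", "name": "forward", "ts": 4000, "dur": 2500},
        {"ph": "X", "cat": "kernel", "name": "example_kernel", "ts": 4100, "dur": 900},
        {"ph": "X", "cat": "user_annotation", "name": "loss", "ts": 6500, "dur": 500},
        {"ph": "X", "cat": "user_annotation", "name": "forward", "ts": 7000, "dur": 1500},
    ]
}

summary = summarize_sections(sample_trace)
# Print sections from most to least expensive.
for name, (count, total_ms) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
    print(f"{name:<14}{count:>3} calls {total_ms:8.2f} ms")
```

For a real run, the same function could be applied to `json.load(open(path))` for an exported trace file. Note that this only totals durations per section name; it does not reproduce the timeline layout requested above.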

Screenshot 2023-07-19 at 5 05 22 PM