tracel-ai / burn

Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.

Home Page: https://burn.dev

CubeCL: add a guide on how to profile CubeCL kernels

mepatrick73 opened this issue

There should be a CubeCL guide on how to profile kernels. As an initial step, we should write a guide on profiling CubeCL by compiling it into CUDA kernels and using Nvidia's Nsight tools on Linux.
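
As a rough starting point for such a guide, the workflow could be sketched as below. This is only an illustration under stated assumptions: the binary path `./target/release/my_cubecl_app` is hypothetical, the application is assumed to already run its kernels through CubeCL's CUDA runtime, and the helper functions `ncu_cmd`, `nsys_cmd`, and `run_or_print` are made up for this example. The only real tools involved are the Nsight Systems CLI (`nsys`) and the Nsight Compute CLI (`ncu`). The script prints the commands instead of running them when a tool or the binary is missing.

```shell
#!/bin/sh
# Hypothetical path to a release build of an application whose CubeCL
# kernels run on the CUDA runtime; replace it with your own binary.
APP="${APP:-./target/release/my_cubecl_app}"

# Build the Nsight Compute command line (detailed per-kernel metrics).
# --set full collects the full metric set; -o names the report file.
ncu_cmd() {
    printf 'ncu --set full -o %s %s' "$1" "$2"
}

# Build the Nsight Systems command line (timeline of kernel launches
# and CUDA API calls). -o names the report file.
nsys_cmd() {
    printf 'nsys profile -o %s %s' "$1" "$2"
}

# Run the command only when the profiler is installed and the binary
# exists; otherwise just print what would have been run.
run_or_print() {
    tool=${1%% *}   # first word of the command line, e.g. "ncu"
    if command -v "$tool" >/dev/null 2>&1 && [ -x "$APP" ]; then
        sh -c "$1"
    else
        echo "would run: $1"
    fi
}

# Start with the timeline to find the expensive kernels, then collect
# detailed metrics for them.
run_or_print "$(nsys_cmd timeline "$APP")"
run_or_print "$(ncu_cmd kernels "$APP")"
```

Running Nsight Systems first and Nsight Compute second is a design choice rather than a requirement: the timeline view is cheaper to collect and helps decide which kernels deserve the slower, per-kernel analysis.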