TPU profiler not working for pytorch XLA
fostiropoulos opened this issue
As reported: pytorch/xla#2493
To get the TPU profiler to work with torch-xla, you need to add the flag --workers_list='' to the capture_tpu_profile command.
In detail:
# Terminal 1: run your training job
# Terminal 2: capture a 10000 ms (10 s) profile while the training job is running
capture_tpu_profile --tpu=${TPU_NAME} --logdir=${MODEL_DIR} --num_tracing_attempts=10 --duration_ms=10000 --workers_list=''
# Terminal 3: view the captured profile in TensorBoard
tensorboard --logdir=${MODEL_DIR}
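
If you run the capture step often, it is easy to forget the empty --workers_list flag. Below is a minimal sketch of a wrapper that prints the capture command with that flag always included. The function name build_capture_cmd and its optional duration argument are my own and hypothetical, not part of torch-xla or the profiler tooling; it only uses the flags already shown above and assumes TPU_NAME and MODEL_DIR are set in your shell.

```shell
# Hypothetical helper (not part of torch-xla or the TPU profiler tooling).
# Prints the capture command from the steps above, always including the
# empty --workers_list flag that torch-xla needs.
# Assumes TPU_NAME and MODEL_DIR are set, as in the original commands.
build_capture_cmd() {
    # Refuse to build a command with an empty TPU name or log directory.
    if [ -z "${TPU_NAME:-}" ] || [ -z "${MODEL_DIR:-}" ]; then
        echo "TPU_NAME and MODEL_DIR must be set" >&2
        return 1
    fi
    # Optional first argument: capture duration in ms (defaults to 10000).
    duration_ms="${1:-10000}"
    printf "capture_tpu_profile --tpu=%s --logdir=%s --num_tracing_attempts=10 --duration_ms=%s --workers_list=''\n" \
        "$TPU_NAME" "$MODEL_DIR" "$duration_ms"
}
```

Calling build_capture_cmd (or build_capture_cmd 5000 for a shorter capture) prints the full command line, which you can review and then paste into Terminal 2 while training is running in Terminal 1.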