tensorflow / profiler

A profiling and performance analysis tool for TensorFlow

100% device utilization

qo4on opened this issue · comments

commented

You showed a screenshot with near 100% GPU utilization in your article:

[image: screenshot of a trace with near 100% GPU utilization]

Can you share the code you used to get this? I've read all your guides and tutorials but did not find any end-to-end example. The best result I managed to get was around 30% TPU load using cache() and prefetch().
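As a rough aside on why cache() and prefetch() alone can plateau well below 100%, here is a minimal back-of-the-envelope sketch. It is a simplified two-stage model, not a measurement and not the code behind the screenshot: the function name `utilization` and all timings are hypothetical, and it assumes each step consists only of one input-loading stage and one device-compute stage.

```python
def utilization(load_ms: float, compute_ms: float, overlap: bool) -> float:
    """Fraction of each training step the accelerator spends computing.

    Simplified model (an assumption for illustration): a step is one
    host-side load stage plus one device-side compute stage.
    """
    if overlap:
        # Overlapped stages (the effect prefetch() aims for):
        # the slower stage sets the step time.
        step_ms = max(load_ms, compute_ms)
    else:
        # Serial stages: the device idles while the next batch is prepared.
        step_ms = load_ms + compute_ms
    return compute_ms / step_ms


# Hypothetical timings, chosen only to show the trend.
compute_ms = 30.0
for load_ms in (100.0, 30.0, 10.0):
    serial = utilization(load_ms, compute_ms, overlap=False)
    overlapped = utilization(load_ms, compute_ms, overlap=True)
    print(f"load={load_ms:5.1f} ms  serial={serial:.0%}  overlapped={overlapped:.0%}")
```

Under this model, overlapping only reaches full utilization once loading a batch is no slower than computing on it. A load of about 30% with prefetching enabled would be consistent with an input pipeline roughly three times slower than the device step, though a real trace is needed to confirm where the time actually goes.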

Hi,

Unfortunately, I don't have the code that produced this trace. Even if I did, it wouldn't be very helpful, because tuning needs to be done for your own model. Have you looked at:
https://www.tensorflow.org/guide/gpu_performance_analysis

commented

Hi, thanks for your answer.

Unfortunately, this guide is written in an abstract way without any concrete end-to-end example.

> tuning needs to be done for your own model

There are many performance-tuning methods, especially for distributed training on TPU and GPU, and it's not clear which of them I should use for my model. A few notebooks of well-optimized models would be much more helpful than all these "talk about" style articles. I'm not asking about this particular trace; I'm looking for the code behind any trace with near 100% TPU utilization.
