Tracing Operations Only
NoraAl opened this issue · comments
Hi there,
I am using tf.profiler.experimental.Trace to compare the performance of a new operation against an existing one. However, kernel launch takes a very long time and skews the results.
I tried to make every run deterministic: the same data with no shuffling, the same model with the same weights and no dropout, the same everything. However, the same operation, say mul, takes a different amount of time in each identical run, even though it occurs the same number of times.
I also tried two different AWS instances, and kernel launch became less significant there. Still, the same operation with the same number of occurrences takes a different amount of time from run to run.
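To illustrate the kind of comparison described above, here is a minimal sketch of a repeated-timing harness. It assumes plain wall-clock timing with the Python standard library rather than the TensorFlow profiler, and `time_op`, `summarize`, `existing_op`, and `new_op` are hypothetical names with stand-in workloads, not the real operations. The idea is to discard a few warm-up calls and then compare the distribution of timings (minimum, median, spread) instead of a single measurement.

```python
import operator
import statistics
import time


def time_op(fn, warmup=3, repeats=20):
    """Return per-call wall-clock times in seconds, measured after warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up calls absorb one-time costs such as lazy initialization
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Summarize a list of timings; the minimum and median are less noisy than the mean."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),
    }


# Hypothetical stand-ins for the two operations being compared.
def existing_op():
    return [a * b for a, b in zip(range(10_000), range(10_000))]


def new_op():
    return list(map(operator.mul, range(10_000), range(10_000)))


if __name__ == "__main__":
    for name, fn in [("existing_op", existing_op), ("new_op", new_op)]:
        stats = summarize(time_op(fn))
        print(name, {key: f"{value * 1e6:.1f} us" for key, value in stats.items()})
```

Even with identical inputs, the individual samples will usually differ between runs because of scheduling, caching, and clock frequency changes, so a nonzero spread is expected. On a GPU, kernels may also be launched asynchronously, so a harness like this would likely need to force the result to be materialized inside `fn` (for example by reading it back to the host) for the timing to cover the kernel itself.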
What am I missing?
Thanks