tensorflow / profiler

A profiling and performance analysis tool for TensorFlow

Total driver(callback) events reaches max

ydennisy opened this issue

Hi,

I am seeing the following message:

Detected GPU events dropped on dennis-notebook: Profiler has collected 2097153 driver events and 2097153 device events. 6317194 events dropped because total device(activity) events reaches max; 6314455 events dropped because total driver(callback) events reaches max.

Yet I am training for a single epoch and profiling just for a few batches:

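import datetime

import tensorflow as tf

# model, LEARNING_RATE, BATCH, EMBED_DIMS, cached_train and cached_test are defined earlier (not shown).
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE))

log_dir = f'logs/{datetime.datetime.now().strftime("%Y%m%d-%H%M%S")}_b{BATCH}_e{EMBED_DIMS}_labels'
# Ask the TensorBoard callback to profile batches 10 through 20 only.
tboard_cb = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=0, profile_batch=(10, 20))

history = model.fit(
    cached_train,
    epochs=1,
    verbose=1,
    validation_data=cached_test,
    callbacks=[tboard_cb],
)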

Hi @ydennisy, did you find any solution to this? I'm also facing the same issue.

any update?

commented

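You are trying to profile batches 10-20. Does your code even train for 20 batches? If training ends before batch 20, the profiler is never told to stop at the batch you asked for.

Try adding steps_per_epoch=21 to model.fit, or setting profile_batch=1 in the TensorBoard callback.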
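
To make that check concrete, here is a minimal sketch in plain Python that works out how many training batches a run would have and whether a given profile_batch range fits inside it. The helper names (total_train_steps, profile_window_fits) and the example numbers (1000 examples, batch size 64) are hypothetical and not from the original post. It also assumes batches are counted from 1 across the whole run, which is my reading of the callback's behaviour rather than something confirmed in this thread.

```python
import math


def total_train_steps(num_examples, batch_size, epochs=1, steps_per_epoch=None):
    """Estimate how many training batches a fit() call would run.

    Assumes a finite dataset with no dropped remainder. If steps_per_epoch
    is given, it is used as-is, which only holds if the dataset can actually
    yield that many batches (for example because it repeats).
    """
    if steps_per_epoch is not None:
        per_epoch = steps_per_epoch
    else:
        per_epoch = math.ceil(num_examples / batch_size)
    return per_epoch * epochs


def profile_window_fits(profile_batch, train_steps):
    """Return True if the requested profile range ends within training.

    profile_batch is either a single batch number or a (start, stop) pair,
    assumed here to be 1-based and inclusive.
    """
    if isinstance(profile_batch, int):
        start = stop = profile_batch
    else:
        start, stop = profile_batch
    return 1 <= start <= stop <= train_steps


# Hypothetical numbers: 1000 examples at batch size 64 gives 16 batches.
steps = total_train_steps(num_examples=1000, batch_size=64)
print(steps, profile_window_fits((10, 20), steps))
```

With these made-up numbers the run has only 16 batches, so a (10, 20) window would start but never reach its stop batch, while profile_batch=1 or steps_per_epoch=21 would pass the check. If the dataset reports its size, cached_train.cardinality() may give the batch count directly, though it can come back as unknown for some pipelines.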