Total driver(callback) events reaches max
ydennisy opened this issue
Dennis commented
Hi,
I am seeing the following message:
Detected GPU events dropped on dennis-notebook: Profiler has collected 2097153 driver events and 2097153 device events. 6317194 events dropped because total device(activity) events reaches max; 6314455 events dropped because total driver(callback) events reaches max.
Yet I am training for a single epoch and profiling just for a few batches:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE))
log_dir = f'logs/{datetime.datetime.now().strftime("%Y%m%d-%H%M%S")}_b{BATCH}_e{EMBED_DIMS}_labels'
tboard_cb = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=0, profile_batch=(10,20))
history = model.fit(
    cached_train,
    epochs=1,
    verbose=1,
    validation_data=cached_test,
    callbacks=[tboard_cb]
)
SohamBhattacharya commented
Hi @ydennisy, did you find any solution to this? I'm also facing the same issue.
Alex Li commented
Any update?
Luka M commented
You are trying to profile batches 10-20. Does your code even train for 20 batches?
Try adding steps_per_epoch=21 to model.fit, or setting profile_batch=1 on the TensorBoard callback.
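A quick way to sanity-check that suggestion before training is to compare the profiling window against the number of steps in the epoch. The sketch below is only an illustration: profile_window_fits is a hypothetical helper (not part of TensorFlow), the step counts are made-up numbers, and it assumes a single epoch as in the original post.

```python
def profile_window_fits(profile_batch, steps_per_epoch):
    """Return True if the requested profiling window ends within the epoch.

    Hypothetical helper, not a TensorFlow API. profile_batch is either a
    single batch index or a (start, stop) pair, mirroring the two forms
    the TensorBoard callback accepts. A value of 0 (profiling disabled)
    is reported as False here because there is no window to check.
    """
    if isinstance(profile_batch, int):
        start, stop = profile_batch, profile_batch
    else:
        start, stop = profile_batch
    return 0 < start <= stop <= steps_per_epoch


# Made-up step counts, checked against the window from the original post.
print(profile_window_fits((10, 20), steps_per_epoch=21))
print(profile_window_fits((10, 20), steps_per_epoch=15))
print(profile_window_fits(1, steps_per_epoch=15))

# If the check passes, the same values would be handed to Keras, e.g.:
#   tboard_cb = tf.keras.callbacks.TensorBoard(log_dir=log_dir, profile_batch=(10, 20))
#   model.fit(cached_train, epochs=1, steps_per_epoch=21, callbacks=[tboard_cb])
```

This only checks whether the window lies inside the epoch. It says nothing about how many events the profiler collects within that window.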