google-deepmind / reverb

Reverb is an efficient and easy-to-use data storage and transport system designed for machine learning research

print samples with high priority

unacao opened this issue · comments

commented

Hi, is there a way to output samples with high priority in a table for evaluation?

commented

Hi, I'm not sure exactly what you want to output. You can configure a table with priority sampling by using a `PrioritizedSelector` as the sampler.
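For context, a prioritized selector draws item *i* with probability proportional to `priority_i ** priority_exponent`. A minimal pure-Python sketch of that weighting (an illustration of the math only, not Reverb's actual C++ implementation):

```python
import random

def prioritized_probabilities(priorities, priority_exponent=0.8):
    """Selection probability of each item under prioritized sampling:
    P(i) is proportional to priority_i ** priority_exponent."""
    weights = [p ** priority_exponent for p in priorities]
    total = sum(weights)
    return [w / total for w in weights]

def draw(priorities, priority_exponent=0.8):
    """Draw one index according to the prioritized distribution."""
    probs = prioritized_probabilities(priorities, priority_exponent)
    return random.choices(range(len(priorities)), weights=probs, k=1)[0]

# With exponent 1.0 the probabilities are just the normalized priorities.
print(prioritized_probabilities([1.0, 1.0, 8.0], priority_exponent=1.0))
# [0.1, 0.1, 0.8]
```

Note that `priority_exponent=0` degenerates to uniform sampling, which is a quick way to reason about how strongly the exponent skews the draw.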

commented

Right now I am using the `PrioritizedSelector` for sampling. But once I update the priorities of the samples, it is up to the selector which samples to draw. I would like to see a snapshot of the samples in the table with their updated priorities, to make sure they align with my expectations.
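If the goal is just to confirm that updated priorities shift the sampling distribution as expected, one option that needs no access to server internals is a statistical check: draw many times and compare empirical frequencies against the expected `p ** exponent` weighting. A pure-Python simulation of that check (standing in for a real Reverb table; `priority_exponent=0.8` is an assumed table setting):

```python
import random
from collections import Counter

def expected_probs(priorities, exponent=0.8):
    """Expected draw probability per key under prioritized sampling."""
    weights = {k: p ** exponent for k, p in priorities.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def empirical_probs(priorities, exponent=0.8, draws=100_000, seed=0):
    """Simulate many prioritized draws and return observed frequencies."""
    rng = random.Random(seed)
    keys = list(priorities)
    weights = [priorities[k] ** exponent for k in keys]
    counts = Counter(rng.choices(keys, weights=weights, k=draws))
    return {k: counts[k] / draws for k in keys}

# Priorities after an update; check that observed frequencies match.
priorities = {'a': 1.0, 'b': 2.0, 'c': 10.0}
exp = expected_probs(priorities)
emp = empirical_probs(priorities)
for k in priorities:
    assert abs(exp[k] - emp[k]) < 0.02, (k, exp[k], emp[k])
```

The same idea carries over to a live table: sample repeatedly after calling `mutate_priorities` and check that the high-priority keys dominate the draws.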

commented

Unfortunately, I don't think we support any way to log this data that doesn't require fiddling with the Reverb code (and recompiling). A less involved option might be to save the data to a Reverb checkpoint and inspect it manually...
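For reference, a checkpoint can be requested from a running server through the client and later restored into a fresh server. A configuration sketch assuming a server on `localhost:8866` and the table layout from this issue (the table definitions, including any signature, must match those the checkpoint was taken with; the paths here are placeholders):

```python
import reverb

# Ask the running server to write a checkpoint to disk; the call
# returns the path of the checkpoint that was written.
client = reverb.Client('localhost:8866')
checkpoint_path = client.checkpoint()
print('checkpoint written to:', checkpoint_path)

# Later, restore it by pointing a new server at the checkpoint root.
checkpointer = reverb.checkpointers.DefaultCheckpointer(
    path='/root/workspace/reverb-test')  # assumed checkpoint root
server = reverb.Server(
    tables=[
        reverb.Table(
            name='training_table',
            sampler=reverb.selectors.Prioritized(priority_exponent=0.8),
            remover=reverb.selectors.Fifo(),
            max_size=460643,
            rate_limiter=reverb.rate_limiters.MinSize(1),
            # signature=... must also match the original table, if set.
        ),
    ],
    port=8866,
    checkpointer=checkpointer,
)
```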

commented

After saving the data to a checkpoint, is there a way to access it manually other than loading the checkpoint to restore a new Reverb server?

commented

I don't think we have a common way of doing that other than restoring the checkpoint and inspecting it...

commented

I am trying to load the checkpoint, but it always gets stuck at the following point without proceeding:

```
[reverb/cc/platform/tfrecord_checkpointer.cc:162] Initializing TFRecordCheckpointer in /root/workspace/reverb-test/2023-04-05T23:59:55.652457336+08:00.
[reverb/cc/platform/tfrecord_checkpointer.cc:567] Loading latest checkpoint from /root/workspace/reverb-test/2023-04-05T23:59:55.652457336+08:00
[reverb/cc/platform/default/server.cc:71] Started replay server on port 8866

server info: {'training_table': TableInfo(name='training_table', sampler_options=prioritized {
  priority_exponent: 0.8
}
, remover_options=fifo: true
is_deterministic: true
, max_size=460643, max_times_sampled=460643, rate_limiter_info=samples_per_insert: 1.0
min_diff: -1.7976931348623157e+308
max_diff: 1.7976931348623157e+308
min_size_to_sample: 1
insert_stats {
}
sample_stats {
}
, signature={'timestamp': TensorSpec(shape=(), dtype=tf.int64, name=None), 'num_points_in_gt': TensorSpec(shape=, dtype=tf.int32, name=None), 'gt_boxes': TensorSpec(shape=, dtype=tf.float64, name=None), 'lidar_path': TensorSpec(shape=(), dtype=tf.string, name=None), 'gt_names': TensorSpec(shape=, dtype=tf.string, name=None)}, current_size=0, num_episodes=0, num_deleted_episodes=0, num_unique_samples=0, table_worker_time=sleeping_ms: 17
)}

2023-04-22 03:45:51.717907: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

2023-04-22 03:45:52.625274: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1613] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 38402 MB memory: -> device: 0, name: NVIDIA A100-SXM4-40GB, pci bus id: 0000:6f:00.0, compute capability: 8.0

[reverb/cc/client.cc:165] Sampler and server are owned by the same process (1685) so Table training_table is accessed directly without gRPC.
```

What could be wrong?