Benchmark_model provided seems ineffective on gpu

Question

ireneMsm2020 opened this issue 2 years ago

Hi,
When I profile GPU latency on a Snapdragon 888+ with the TF 2.7 benchmark_model you provided, the reported latency is always zero. Do you have any idea what could cause this?

Thank you in advance!

Jiahang Xu · Answer 1 · Fri Aug 19 2022 20:46:05 GMT+0800 (China Standard Time)

Hi, it seems something is going wrong during profiling. You could debug the benchmark model by running commands like these:

# push the model to device
adb [-s <device-serial>] push <path-of-your-model> <remote-model-path-to-push>

# run the benchmark model
adb [-s <device-serial>] shell <path-of-your-benchmark-model> --num_threads=1 --num_runs=50 --warmup_runs=10 --graph=<remote-model-path> --enable_op_profiling=true --use_gpu=false
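
If you would rather drive the same run from a script, here is a minimal Python sketch that assembles the command above as an argument list. The function names and the example paths are hypothetical, chosen only for illustration; the flags are exactly the ones shown in the command above. Actually running it assumes adb is on your PATH and a device is connected.

```python
import subprocess
from typing import List, Optional


def build_benchmark_cmd(benchmark_path: str, remote_model_path: str,
                        serial: Optional[str] = None,
                        use_gpu: bool = False) -> List[str]:
    """Build the adb benchmark command as an argument list (hypothetical helper)."""
    cmd = ["adb"]
    if serial:
        # Optional device selection, same as "adb -s <device-serial>".
        cmd += ["-s", serial]
    cmd += [
        "shell", benchmark_path,
        "--num_threads=1", "--num_runs=50", "--warmup_runs=10",
        f"--graph={remote_model_path}",
        "--enable_op_profiling=true",
        f"--use_gpu={'true' if use_gpu else 'false'}",
    ]
    return cmd


def run_benchmark(cmd: List[str]) -> str:
    """Run the command and return stdout and stderr combined as one string."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


# Example with made-up paths; printing the command lets you check it before running.
print(" ".join(build_benchmark_cmd("/data/local/tmp/benchmark_model",
                                   "/data/local/tmp/model.tflite")))
```

Keeping the full output around makes it easier to see whether the summary is missing or merely formatted differently than expected.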

If the benchmark model works well, it prints the latency of each node, followed by a summary message like this:

Timings (microseconds): count=222 first=3897 curr=3924 min=3858 max=4031 avg=3925.67 std=29
Memory (bytes): count=0
133 nodes observed
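
To check by hand whether such a summary can be parsed, a small Python sketch like the one below may help. It is not the parser used by the project; `parse_summary` and both patterns are assumptions written for this example. It accepts values printed either as integers or as decimals, since the tool may print either form.

```python
import re
from typing import Dict, Optional

# "Timings (microseconds):" followed by key=value pairs on the same line.
_TIMINGS_RE = re.compile(
    r"Timings \(microseconds\):[ \t]*"
    r"(?P<fields>(?:\w+=\d+(?:\.\d+)?[ \t]*)+)"
)
_NODES_RE = re.compile(r"(?P<n>\d+) nodes observed")


def parse_summary(output: str) -> Optional[Dict[str, float]]:
    """Return the timing fields as floats, or None if no summary line is found."""
    match = _TIMINGS_RE.search(output)
    if match is None:
        return None
    stats = {}
    for pair in match.group("fields").split():
        key, value = pair.split("=")
        stats[key] = float(value)
    nodes = _NODES_RE.search(output)
    if nodes is not None:
        stats["nodes"] = float(nodes.group("n"))
    return stats


sample = (
    "Timings (microseconds): count=222 first=3897 curr=3924 "
    "min=3858 max=4031 avg=3925.67 std=29\n"
    "Memory (bytes): count=0\n"
    "133 nodes observed"
)
print(parse_summary(sample))
```

If this returns None on your device output, the summary line is probably absent or formatted differently, which is worth checking before suspecting the latency values themselves.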

lorena527 · Answer 2 · Mon Jun 05 2023 15:53:24 GMT+0800 (China Standard Time)

I ran into the same problem. When I tested it, I got the following output:
Timings (microseconds): count=100 first=913862 curr=914318 min=877044 max=926036 avg=911992 std=8223
Memory (bytes): count=0
1 nodes observed
However, the regular-expression match on this output fails. @JiahangXu

lorena527 · Answer 3 · Tue Jun 06 2023 11:59:34 GMT+0800 (China Standard Time)

I found the solution in dev/profile-in-local, thank you.