microsoft / nn-Meter

A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.


Benchmark_model provided seems ineffective on gpu

ireneMsm2020 opened this issue

Hi,
When I profile GPU latency on a Snapdragon 888+ with the TF 2.7 benchmark_model you provided, the reported latency always seems to be zero. Do you have any idea why?

Thank you in advance!

Hi, it seems something is going wrong during profiling. You could debug the benchmark model by running commands like these:

# push the model to device
adb [-s <device-serial>] push <path-of-your-model> <remote-model-path-to-push>

# run the benchmark model
adb [-s <device-serial>] shell <path-of-your-benchmark-model> --num_threads=1 --num_runs=50 --warmup_runs=10 --graph=<remote-model-path> --enable_op_profiling=true --use_gpu=false
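If you want to repeat that run from a script, the command can be assembled roughly as below. This is only a sketch: `build_benchmark_cmd` is a hypothetical helper, not part of nn-Meter, and the device paths in the usage example are placeholders. The flags are exactly the ones from the command above.

```python
import shlex


def build_benchmark_cmd(benchmark_bin, remote_model, serial=None, use_gpu=False,
                        num_threads=1, num_runs=50, warmup_runs=10):
    """Return the adb argv list for one benchmark_model run (hypothetical helper)."""
    cmd = ["adb"]
    if serial:
        # Only needed when more than one device is attached.
        cmd += ["-s", serial]
    cmd += [
        "shell",
        benchmark_bin,
        f"--num_threads={num_threads}",
        f"--num_runs={num_runs}",
        f"--warmup_runs={warmup_runs}",
        f"--graph={remote_model}",
        "--enable_op_profiling=true",
        f"--use_gpu={'true' if use_gpu else 'false'}",
    ]
    return cmd


if __name__ == "__main__":
    # Placeholder paths; substitute the locations you pushed to on your device.
    argv = build_benchmark_cmd("/data/local/tmp/benchmark_model",
                               "/data/local/tmp/model.tflite")
    print(shlex.join(argv))
```

The returned list can be passed to `subprocess.run(argv, capture_output=True, text=True)` to collect the output for inspection.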

If the benchmark model works correctly, the output will contain the latency of each node, followed by a summary like this:

Timings (microseconds): count=222 first=3897 curr=3924 min=3858 max=4031 avg=3925.67 std=29
Memory (bytes): count=0
133 nodes observed

I ran into the same problem. I also ran the test and got the following output:

Timings (microseconds): count=100 first=913862 curr=914318 min=877044 max=926036 avg=911992 std=8223
Memory (bytes): count=0
1 nodes observed

However, the regular-expression matching on this output fails. @JiahangXu
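For anyone checking the summary by hand, here is a minimal sketch of a parser that accepts both summaries quoted in this thread. It is not nn-Meter's actual parsing code, and the thread does not say why the original match fails. `parse_timings` and `TIMINGS_RE` are hypothetical names. The pattern accepts both integer and decimal values, since `avg` appears as `3925.67` in one summary and `911992` in the other.

```python
import re

# Each field looks like name=number, where the number may or may not have a decimal part.
TIMINGS_RE = re.compile(
    r"Timings \(microseconds\):\s*(?P<fields>(?:\w+=\d+(?:\.\d+)?\s*)+)"
)


def parse_timings(output):
    """Return the Timings fields as a dict of floats, or None if no summary line is found."""
    match = TIMINGS_RE.search(output)
    if match is None:
        return None
    pairs = (item.split("=", 1) for item in match.group("fields").split())
    return {key: float(value) for key, value in pairs}


if __name__ == "__main__":
    sample = (
        "Timings (microseconds): count=100 first=913862 curr=914318 "
        "min=877044 max=926036 avg=911992 std=8223\n"
        "Memory (bytes): count=0\n"
        "1 nodes observed"
    )
    print(parse_timings(sample))
```

Returning `None` when nothing matches makes a failed parse easy to tell apart from a genuine zero latency.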

I found the solution in dev/profile-in-local. Thanks!