About the benchmark device
austingg opened this issue · comments
Hi @yonghenglh6,
which GPU did you use for the depthwise conv benchmark timings?
GeForce GTX 1080
@yonghenglh6 what's your cuDNN version? I use a GTX 1080 with cuDNN v5.1, and the example net takes about 7 ms for the forward pass and 10 ms for the backward pass (taking BN into account).
Besides, the example network prototxt has *** 128 *** in its name, but its input is 224, and in the 224 case the last average pooling layer's kernel size should be 7 instead of 4.
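A quick way to check the pooling kernel size is to follow the spatial size through the stride-2 layers. The sketch below assumes the standard MobileNet layout (five stride-2 convolutions with 3x3 kernels and padding 1) and Caffe's convolution output-size formula. The helper names `conv_out` and `final_feature_size` are made up for illustration and are not taken from the prototxt.

```python
def conv_out(size, kernel=3, stride=2, pad=1):
    # Caffe convolution output size: floor((in + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1


def final_feature_size(input_size, num_stride2_layers=5):
    # Assumption: only the stride-2 layers change the spatial size;
    # the stride-1 3x3 (pad 1) and 1x1 convolutions keep it unchanged.
    size = input_size
    for _ in range(num_stride2_layers):
        size = conv_out(size)
    return size


for s in (128, 224):
    print(s, final_feature_size(s))
# prints:
# 128 4
# 224 7
```

Under these assumptions, a global average pool needs kernel size 4 for a 128 input and 7 for a 224 input.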
You are absolutely right.
I mixed up the performance numbers with those of my half MobileNet. I will fix it. Thanks.
@austingg
It is fixed now. The speed-up is less attractive than before.
@yonghenglh6 It doesn't matter. We can make it faster step by step. And it is now indeed faster than depthwise conv implemented with group.