yonghenglh6 / DepthwiseConvolution

A personal depthwise convolution layer implementation on Caffe by liuhao (GPU only).

About the benchmark device

austingg opened this issue

Hi @yonghenglh6,
which GPU did you use for the depthwise conv benchmark times?

GeForce GTX 1080

@yonghenglh6 What's your cuDNN version? I use a GTX 1080 with cuDNN v5.1, and the example net costs about 7 ms for the forward pass and 10 ms for the backward pass (taking BN into account).
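(For anyone reproducing these numbers: Caffe's built-in timer reports average forward and backward times per layer and for the whole net. A minimal invocation is sketched below; the model path is a placeholder, not the repo's actual file name.)

```sh
# Model path is a placeholder; point it at the example net's prototxt.
./build/tools/caffe time --model=examples/mobilenet_example.prototxt --gpu=0 --iterations=50
```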

Besides, the example network prototxt has *** 128 *** in its name, but its input is 224; in the 224 case, the last average pooling layer's kernel size should be 7 instead of 4.
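(For reference, a minimal sketch of the corrected pooling layer for a 224x224 input, where the overall stride of 32 leaves a 7x7 feature map; the layer and blob names here are placeholders, not taken from the repo's prototxt.)

```
# Hypothetical global average pooling layer for a 224x224 input (224 / 32 = 7).
# For a 128x128 input the final feature map is 4x4, hence kernel_size: 4.
layer {
  name: "pool_avg"
  type: "Pooling"
  bottom: "conv_final"
  top: "pool_avg"
  pooling_param {
    pool: AVE
    kernel_size: 7
    stride: 1
  }
}
```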

You are completely right.
I mixed up the performance numbers with those of my half MobileNet. I will fix it. Thanks.

@austingg
It is fixed now. The corrected speed-up is less attractive.

@yonghenglh6 It doesn't matter. We can make it faster step by step. And it is now indeed faster than depthwise implemented with grouped convolution.
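(To make the comparison concrete, here is a rough sketch of the same 3x3 depthwise step written once as stock Caffe grouped convolution and once with a dedicated depthwise layer. The type string "DepthwiseConvolution" and all layer/blob names are assumptions for illustration; check the layer registration in this repo's source for the exact type.)

```
# (a) Depthwise via grouped convolution in stock Caffe:
#     group == num_output, so each input channel gets its own 3x3 filter.
layer {
  name: "conv_dw_group"
  type: "Convolution"
  bottom: "data"
  top: "conv_dw_group"
  convolution_param {
    num_output: 32
    group: 32
    kernel_size: 3
    pad: 1
    stride: 1
  }
}

# (b) The same operation through a dedicated depthwise layer; the type string
#     below is assumed here; confirm it against this repo's layer registration.
layer {
  name: "conv_dw_fast"
  type: "DepthwiseConvolution"
  bottom: "data"
  top: "conv_dw_fast"
  convolution_param {
    num_output: 32
    group: 32
    kernel_size: 3
    pad: 1
    stride: 1
  }
}
```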