Problem about the running time.

Question

tjulyz opened this issue 7 years ago

Hi!
Thanks for sharing your code! I ran into a problem when running it for CIFAR-10 classification. When I change the kernel size of the convolutional layers in each block from 3x3 to 1x1, the running time per epoch goes up from 3.05s to about 4.11s on a Titan X. However, a 3x3 convolution should always consume more computational resources than a 1x1 convolution, so I am confused. Can you help analyze whether the problem is in your code or in the TensorFlow optimization?
Thanks again!

Illarion · Answer 1 · Thu Jan 04 2018 17:03:38 GMT+0800 (China Standard Time)

Hi!
That's really strange behaviour. I've examined the code and haven't found any mistakes. Of course, it may depend heavily on the CUDA convolution and parallelization implementation itself. You could print all the tensor shapes in the network with 3x3 kernels and with 1x1 kernels, then create dummy variables of those shapes in TensorFlow and measure the execution time of each component with Python's timeit module. Maybe this will point you in the right direction.
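
A minimal sketch of that kind of measurement could look like the block below. It is not taken from the repository: the helper names `time_component` and `make_conv_fn` are hypothetical, the tensor shapes are placeholders rather than the real layer shapes, and the convolution part assumes the TensorFlow 2.x eager API, which may differ from the version the code in question uses.

```python
import timeit


def time_component(fn, repeats=5, number=10):
    """Return the best observed per-call wall-clock time of fn() in seconds."""
    # Warm-up call so one-time setup cost (e.g. kernel selection) is not measured.
    fn()
    runs = timeit.repeat(fn, repeat=repeats, number=number)
    # Each entry in runs is the total time for `number` calls; take the fastest.
    return min(runs) / number


def make_conv_fn(kernel_size, batch=64, height=32, width=32, in_ch=16, out_ch=16):
    """Build a callable that runs one convolution on dummy data.

    Assumption: TensorFlow 2.x is installed. The default shapes are
    placeholders, not the shapes of the actual network.
    """
    import tensorflow as tf  # imported lazily so time_component works without it

    x = tf.random.normal([batch, height, width, in_ch])
    w = tf.random.normal([kernel_size, kernel_size, in_ch, out_ch])
    # .numpy() copies the result to the host, which forces the GPU work to finish
    # before the timer stops.
    return lambda: tf.nn.conv2d(x, w, strides=1, padding="SAME").numpy()
```

Calling `time_component(make_conv_fn(1))` and `time_component(make_conv_fn(3))` with the shapes printed from the real network would then give a per-layer comparison of the two kernel sizes. Taking the minimum over several repeats reduces noise from other processes on the machine.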

lyz · Answer 2 · Mon Jan 08 2018 15:47:50 GMT+0800 (China Standard Time)

Thanks for your advice. I will try it again.