Cannot use FireModule with CUDA
jonathanasdf opened this issue
After converting a model containing nn.FireModule to CUDA, gradParameters becomes nan.
Of course, if I just use the constituent layers and build my own FireModule function like:
function FireModule(nInputPlane, s1x1, e1x1, e3x3)
   local module = nn.Sequential()
   module:add(nn.SpatialConvolution(nInputPlane, s1x1, 1, 1)):add(nn.ReLU(true))
   local expand = nn.Concat(2)
   expand:add(nn.SpatialConvolution(s1x1, e1x1, 1, 1))
   expand:add(nn.SpatialConvolution(s1x1, e3x3, 3, 3, 1, 1, 1, 1))
   module:add(expand):add(nn.ReLU(true))
   return module
end
model:add(FireModule(a,b,c,d))
then everything works fine, but I don't get the neat __tostring__. Is there any way of fixing this on your side? Thanks!
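As a stopgap, the hand-built function can get a compact printout by defining __tostring__ on the instance it returns (a sketch, relying on Torch's convention that tostring(module) dispatches to the module's __tostring__ method; the "FireModule" label below is purely cosmetic, the container is still a plain nn.Sequential):

-- Same workaround as above, plus a custom string representation.
function FireModule(nInputPlane, s1x1, e1x1, e3x3)
   local module = nn.Sequential()
   module:add(nn.SpatialConvolution(nInputPlane, s1x1, 1, 1)):add(nn.ReLU(true))
   local expand = nn.Concat(2)
   expand:add(nn.SpatialConvolution(s1x1, e1x1, 1, 1))
   expand:add(nn.SpatialConvolution(s1x1, e3x3, 3, 3, 1, 1, 1, 1))
   module:add(expand):add(nn.ReLU(true))
   -- Replace the verbose nn.Sequential printout with a one-line summary.
   function module:__tostring__()
      return string.format('FireModule(%d -> %d, %d, %d)',
                           nInputPlane, s1x1, e1x1, e3x3)
   end
   return module
end

With this, print(model) shows FireModule(...) for each such submodule instead of the full nested Sequential/Concat listing.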
@sagarwaghmare69 Could you look into this when you have time?
@jonathanasdf I am not able to reproduce the error. Can you please provide your code snippet ?
Yes.
Apparently it randomly becomes 0, inf, or nan. Also, zeroGradParameters doesn't seem to do anything.
require 'dpnn'
require 'cutorch'
local m = nn.Sequential()
m:add(nn.FireModule(1,1,1,1))
local _, p = m:getParameters() -- p is the flattened gradParameters
print(p:sum())
m = m:cuda()
_, p = m:getParameters()
print(p:sum())
m:zeroGradParameters() -- should zero the gradients, but the sum is unchanged
print(p:sum())
Some outputs (randomly one of these triples):

-1.5261873854701e+236
nan
nan

1.117789947961e+138
inf
inf

4.8518308672707e-309
0
0
@jonathanasdf Thanks for reporting the bug and providing an example. I was able to reproduce it. We will fix it (by Monday evening at the latest). Thanks.
@jonathanasdf Fixed via #46. Thanks @sagarwaghmare69.
Thank you for the fast fix!