torch / DEPRECEATED-torch7-distro

Torch7: state-of-the-art machine learning algorithms

Home Page: www.torch.ch

Package NN: Batches and Cuda

markpeot opened this issue:

Clement Farabet writes:

To get awesome speedup you need to use the modules named ***CUDA
(SpatialConvolutionCUDA, ...), and batch sizes that are multiples of 32; 32 is usually fine.
Then the problem is that these modules assume that the batch is the innermost
dimension in the state tensors. So it's a bit involved, but that gives you 5 to 7x
speedup over the regular CUDA code.
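Concretely, the two layouts being described look something like this (a rough sketch, assuming cutorch/cunn are installed; the sizes are arbitrary placeholders):

   require 'cunn'   -- CUDA-backed nn modules (also loads cutorch)

   local batchSize, nfeats, height, width = 32, 3, 32, 32

   -- regular nn modules: batch is the outermost (first) dimension
   local batchFirst = torch.CudaTensor(batchSize, nfeats, height, width)

   -- *CUDA modules: batch is the innermost (last) dimension
   local batchLast = torch.CudaTensor(nfeats, height, width, batchSize)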

I am thinking about how I might be able to use the awesome ***CUDA acceleration in convolutional networks, and all of my solutions are crufty.

So, I can build a model similar to the one in the 2_supervised tutorial, up to layer 4:

   model = nn.Sequential()
   model:add(nn.SpatialConvolutionCUDA(nfeats, layers, filtsize, filtsize))
   model:add(nn.Tanh())
   model:add(nn.SpatialMaxPoolingCUDA(poolsize,poolsize,poolsize,poolsize))

...now I am in trouble: the output of layer 4 is LAYERS x V x H x BATCH, but
the natural next layer (subtractive normalization) wants something that is
LAYERS x BATCH x V x H.

One awkward way to continue this model is to do this:

   model:add(nn.Transpose({3, 4}))   -- LAYERS x V x H x BATCH -> LAYERS x V x BATCH x H
   model:add(nn.Transpose({2, 3}))   -- LAYERS x V x BATCH x H -> LAYERS x BATCH x V x H
   model:add(nn.Reshape(batchSize * layers, poolV, poolH))
   model:add(nn.SpatialSubtractiveNormalization(batchSize * layers, normkernel))

...but this is awful. I could rewrite SpatialSubtractiveNormalization, but this is just the tip of the iceberg of what I would have to rewrite (NLLcriterion, etc).

So...

  1. What is the recommended path for integrating ***CUDA functions into models?
  2. It seems to me that minibatches are pretty central to NN learning. Perhaps ALL routines in the NN package should accept an innermost mini-batch dimension?

Mark

You could use a single nn.Transpose module before the *CUDA module and a single one after it; nn.Transpose takes multiple transpose pairs as arguments:
https://github.com/torch/torch7-distro/blob/master/extra/nn/Transpose.lua

You wouldn't need a reshape. Almost all the modules in the nn package support minibatches.

In the nn package, all modules except the *CUDA ones accept an input of
(batchSize, input), where input can be a 3D tensor, for example (with the batch dimension, it becomes a 4D tensor whose first dimension is batchSize).

For *CUDA modules, the only difference is that the batchSize is the last dimension.

So this is basically how it looks:

   model:add(nn.Transpose({1,4},{1,3},{1,2}))
   model:add(nn.SpatialConvolutionCUDA(...))
   ...
   model:add(nn.Transpose({4,1},{4,2},{4,3}))

The first layer moves the batch dim from 1 to 4; the last one does the opposite. If you have 3 conv layers interspersed with max pooling, you only need to transpose at the beginning and after the last conv layer. After that you can reshape and hit your linear layers.
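Putting that recipe together, a minimal sketch might look like the following (layer sizes such as nfeats, nstates, filtsize, poolsize, outV, outH, and noutputs are placeholders in the spirit of the 2_supervised tutorial; the normalization layers are omitted for brevity):

   model = nn.Sequential()
   model:add(nn.Transpose({1,4},{1,3},{1,2}))               -- batch dim: 1 -> 4
   model:add(nn.SpatialConvolutionCUDA(nfeats, nstates[1], filtsize, filtsize))
   model:add(nn.Tanh())
   model:add(nn.SpatialMaxPoolingCUDA(poolsize, poolsize, poolsize, poolsize))
   model:add(nn.SpatialConvolutionCUDA(nstates[1], nstates[2], filtsize, filtsize))
   model:add(nn.Tanh())
   model:add(nn.SpatialMaxPoolingCUDA(poolsize, poolsize, poolsize, poolsize))
   model:add(nn.Transpose({4,1},{4,2},{4,3}))               -- batch dim: 4 -> 1
   model:add(nn.Reshape(nstates[2] * outV * outH))          -- flatten each example
   model:add(nn.Linear(nstates[2] * outV * outH, noutputs))
   model:add(nn.LogSoftMax())

The two Transpose modules are the only CUDA-specific bookkeeping; everything from the Reshape onward is the usual batch-first nn code.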

As Soumith said, pretty much all the standard modules support batches, but before GPUs kicked in, it was more natural to have the batch indexed as the first dim.

With formatting:

   function toCUDA()
      return nn.Transpose({1,4},{1,3},{1,2})
   end

Etc.
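Presumably the "etc." includes a matching helper for the way back; a sketch of that (the name fromCUDA is made up here and just mirrors the transposes above):

   function fromCUDA()
      return nn.Transpose({4,1},{4,2},{4,3})
   end

A model could then wrap its *CUDA stack between model:add(toCUDA()) and model:add(fromCUDA()) instead of spelling out the transpose pairs each time.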