Problem between conv2d and dense layer
galbiati opened this issue · comments
Hi,
I'm having an issue where the output from a convolutional layer is not read properly into a dense layer. The output shape for `conv4` below has the expected dimensions `(batch_size, 128, 56, 76)`, but the shape of the output received by `DenseLayer` is evidently `(batch_size, 14400)`, and the parameter size inferred by `DenseLayer` is 417792 instead of the expected 544768 (= 128 × 56 × 76).
In fact, when varying the number of filters and filter size at `conv4`, or the conv layer passed to the dense layer, the size of the output received by `DenseLayer` remains 14400 no matter what (the inferred size of `W` does change, however).
Oddly, other code I've written previously that uses the same objects in this way doesn't have this problem. I'm on OS X, Lasagne 0.2.dev1, Theano 0.10.0dev1.dev-2c6adcccaefb3aa466be76f403bcc4981d023a6a. I haven't included the full trace in case this is just me being dumb/blind, but happy to provide it if needed.
```python
def build_encoder(input_var=None):
    input_shape = (None, 3, 60, 80)
    input_layer = L.InputLayer(input_shape, input_var=input_var)
    input_layer_shape = input_layer.get_output_shape_for(input_var)
    print(input_layer.get_output_shape_for(X.shape))
    conv1 = L.Conv2DLayer(
        input_layer,
        num_filters=32, filter_size=3,
        nonlinearity=nl.selu
    )
    print(conv1.get_output_shape_for(X.shape))
    conv2 = L.Conv2DLayer(
        conv1,
        num_filters=32, filter_size=3,
        nonlinearity=nl.selu
    )
    print(conv2.get_output_shape_for(X.shape))
    conv3 = L.Conv2DLayer(
        conv2,
        num_filters=64, filter_size=5,
        nonlinearity=nl.selu
    )
    print(conv3.get_output_shape_for(X.shape))
    conv4 = L.Conv2DLayer(
        conv3,
        num_filters=128, filter_size=5,
        nonlinearity=nl.selu
    )
    conv4_shape = conv4.get_output_shape_for(X.shape)
    print(conv4_shape)
    dense = L.DenseLayer(
        conv4,
        num_units=512,
        nonlinearity=nl.selu, W=lasagne.init.HeUniform()
    )
    print([p.shape for p in L.get_all_param_values(dense)])
    return dense
```
```python
encoder_input = T.ftensor4('inputs')
encoder = build_encoder()
encoder_func = theano.function([encoder_input], encoder.get_output_for(encoder_input))
encoder_func(X)  # X.shape = (2900, 3, 60, 80); X.dtype = np.float32
```
```
ValueError: shapes (2900,14400) and (417792,512) not aligned: 14400 (dim 1) != 417792 (dim 0)
Apply node that caused the error: dot(Reshape{2}.0, W)
Toposort index: 4
Inputs types: [TensorType(float32, matrix), TensorType(float64, matrix)]
Inputs shapes: [(2900, 14400), (417792, 512)]
Inputs strides: [(57600, 4), (4096, 8)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Elemwise{Composite{(i0 * Switch(GT((i1 + i2), i3), (i1 + i2), (i4 * expm1((i1 + i2)))))}}[(0, 1)](TensorConstant{(1, 1) of ..5070098736}, dot.0, InplaceDimShuffle{x,0}.0, TensorConstant{(1, 1) of 0.0}, TensorConstant{(1, 1) of ..7326324235})]]
```
You create the network with an `InputLayer` of shape `(None, 3, 60, 80)`. This is passed on from layer to layer. The `DenseLayer` will then expect an input of `conv4.output_shape`, not of `conv4.get_output_shape_for(X.shape)`.
Note that `conv4.get_output_shape_for(X.shape)` computes the shape for passing `X` through `conv4` only. If you want to pass it through the full network, you need `lasagne.layers.get_output_shape(conv4, X.shape)`. But this doesn't help against the fact that `input_shape` must match what you feed to the network later, because otherwise the weight matrix in the first dense layer won't be set up correctly.
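The three numbers in the error can in fact be reproduced with plain shape arithmetic. This is a sketch, assuming unpadded ("valid") convolutions, which matches the shapes in the traceback: each convolution shrinks a spatial dimension by `filter_size - 1`.

```python
from math import prod

def conv_out(spatial, filter_size):
    # 'valid' convolution: each spatial dim shrinks by filter_size - 1
    return tuple(s - (filter_size - 1) for s in spatial)

spatial = (60, 80)           # spatial dims of input_shape (None, 3, 60, 80)
for fs in (3, 3, 5, 5):      # filter sizes of conv1..conv4
    spatial = conv_out(spatial, fs)

# Full stack: (60, 80) -> (48, 68), so DenseLayer infers 128*48*68 inputs
print(128 * prod(spatial))                 # 417792

# conv4.get_output_shape_for(X.shape) applies only conv4's own filter
# to X's raw spatial dims (60, 80), giving (56, 76):
print(128 * prod(conv_out((60, 80), 5)))   # 544768

# get_output_for on the DenseLayer alone just flattens the raw input:
print(3 * 60 * 80)                         # 14400
```

So 417792 comes from propagating `input_shape` through all four convolutions, 544768 from applying only `conv4` to `X`'s dimensions, and 14400 from flattening the raw input directly into the dense layer.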
Similarly, `encoder.get_output_for(encoder_input)` will pass `encoder_input` through the very last layer of the encoder only. You want `lasagne.layers.get_output(encoder, encoder_input)` instead. You can also use `build_encoder(encoder_input)` and then `lasagne.layers.get_output(encoder)`. See #104 for why we turned this into a global function rather than a `Layer` method.
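The distinction can be illustrated with toy classes (hypothetical stand-ins, not Lasagne's real API): `get_output_for` applies a single layer's transformation to whatever you hand it, while a `get_output`-style helper first walks back to the input layer and propagates through the whole chain.

```python
class InputLayer:
    input_layer = None
    def get_output_for(self, x):
        return x

class DoubleLayer:  # stands in for any layer that transforms its input
    def __init__(self, incoming):
        self.input_layer = incoming
    def get_output_for(self, x):
        # transforms ONLY this layer's input, whatever it is handed
        return x * 2

def get_output(layer, x):
    # walk back to the InputLayer, then apply each layer in order
    if layer.input_layer is None:
        return layer.get_output_for(x)
    return layer.get_output_for(get_output(layer.input_layer, x))

net = DoubleLayer(DoubleLayer(DoubleLayer(InputLayer())))
print(net.get_output_for(1))   # 2 -- only the last layer is applied
print(get_output(net, 1))      # 8 -- the full stack is applied
```

This is why calling `get_output_for` on the encoder fed the raw `(2900, 3, 60, 80)` input straight into the dense layer, which flattened it to 14400 features.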
In future, please post usage questions on our mailing list instead. This issue tracker is meant for bug reports and feature discussions only. Thank you!