Problem between conv2d and dense layer
galbiati opened this issue · comments
Hi,
I'm having an issue where the output from a convolutional layer is not read properly into a dense layer. The output shape for `conv4` below has the expected dimensions `(batch_size, 128, 56, 76)`, but the shape of the output received by `DenseLayer` is evidently `(batch_size, 14400)`, and the parameter size inferred by `DenseLayer` is 417792 instead of the expected 544768 (= 128 × 56 × 76).
In fact, when varying the number of filters and filter size at `conv4`, or the conv layer passed to the dense layer, the size of the output received by `DenseLayer` remains 14400 no matter what (the inferred size of `W` does change, however).
Oddly, other code I've written previously that uses the same objects in this way doesn't have this problem. I'm on OS X, Lasagne 0.2.dev1, Theano 0.10.0dev1.dev-2c6adcccaefb3aa466be76f403bcc4981d023a6a. I haven't included the full trace in case this is just me being dumb/blind, but happy to provide it if needed.
```python
def build_encoder(input_var=None):
    input_shape = (None, 3, 60, 80)
    input_layer = L.InputLayer(input_shape, input_var=input_var)
    input_layer_shape = input_layer.get_output_shape_for(input_var)
    print(input_layer.get_output_shape_for(X.shape))
    conv1 = L.Conv2DLayer(
        input_layer,
        num_filters=32, filter_size=3,
        nonlinearity=nl.selu
    )
    print(conv1.get_output_shape_for(X.shape))
    conv2 = L.Conv2DLayer(
        conv1,
        num_filters=32, filter_size=3,
        nonlinearity=nl.selu
    )
    print(conv2.get_output_shape_for(X.shape))
    conv3 = L.Conv2DLayer(
        conv2,
        num_filters=64, filter_size=5,
        nonlinearity=nl.selu
    )
    print(conv3.get_output_shape_for(X.shape))
    conv4 = L.Conv2DLayer(
        conv3,
        num_filters=128, filter_size=5,
        nonlinearity=nl.selu
    )
    conv4_shape = conv4.get_output_shape_for(X.shape)
    print(conv4_shape)
    dense = L.DenseLayer(
        conv4,
        num_units=512,
        nonlinearity=nl.selu, W=lasagne.init.HeUniform()
    )
    print([p.shape for p in L.get_all_param_values(dense)])
    return dense
```
```python
encoder_input = T.ftensor4('inputs')
encoder = build_encoder()
encoder_func = theano.function([encoder_input], encoder.get_output_for(encoder_input))
encoder_func(X)  # X.shape = (2900, 3, 60, 80); X.dtype = np.float32
```
```
ValueError: shapes (2900,14400) and (417792,512) not aligned: 14400 (dim 1) != 417792 (dim 0)
Apply node that caused the error: dot(Reshape{2}.0, W)
Toposort index: 4
Inputs types: [TensorType(float32, matrix), TensorType(float64, matrix)]
Inputs shapes: [(2900, 14400), (417792, 512)]
Inputs strides: [(57600, 4), (4096, 8)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Elemwise{Composite{(i0 * Switch(GT((i1 + i2), i3), (i1 + i2), (i4 * expm1((i1 + i2)))))}}[(0, 1)](TensorConstant{(1, 1) of ..5070098736}, dot.0, InplaceDimShuffle{x,0}.0, TensorConstant{(1, 1) of 0.0}, TensorConstant{(1, 1) of ..7326324235})]]
```
You create the network with an `InputLayer` of shape `(None, 3, 60, 80)`. This is passed on from layer to layer. The `DenseLayer` will then expect an input of `conv4.output_shape`, not of `conv4.get_output_shape_for(X.shape)`.
Note that `conv4.get_output_shape_for(X.shape)` computes the shape for passing `X` through `conv4` only. If you want to pass it through the full network, you need `lasagne.layers.get_output_shape(conv4, X.shape)`. But this doesn't help against the fact that `input_shape` must match what you feed to the network later, because otherwise the weight matrix in the first dense layer won't be set up correctly.
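The three numbers in the error can in fact be reproduced with plain shape arithmetic. This is a sketch, assuming unpadded ("valid") convolutions, which matches the shapes in the traceback: each convolution shrinks a spatial dimension by `filter_size - 1`.

```python
from math import prod

def conv_out(spatial, filter_size):
    # 'valid' convolution: each spatial dim shrinks by filter_size - 1
    return tuple(s - (filter_size - 1) for s in spatial)

spatial = (60, 80)           # spatial dims of input_shape (None, 3, 60, 80)
for fs in (3, 3, 5, 5):      # filter sizes of conv1..conv4
    spatial = conv_out(spatial, fs)

# Full stack: (60, 80) -> (48, 68), so DenseLayer infers 128*48*68 inputs
print(128 * prod(spatial))                 # 417792

# conv4.get_output_shape_for(X.shape) applies only conv4's own filter
# to X's raw spatial dims (60, 80), giving (56, 76):
print(128 * prod(conv_out((60, 80), 5)))   # 544768

# get_output_for on the DenseLayer alone just flattens the raw input:
print(3 * 60 * 80)                         # 14400
```

So 417792 comes from propagating `input_shape` through all four convolutions, 544768 from applying only `conv4` to `X`'s dimensions, and 14400 from flattening the raw input directly into the dense layer.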
Similarly, `encoder.get_output_for(encoder_input)` will pass `encoder_input` through the very last layer of the encoder only. You want `lasagne.layers.get_output(encoder, encoder_input)` instead. You can also use `build_encoder(encoder_input)` and then `lasagne.layers.get_output(encoder)`. See #104 for why we turned this into a global function rather than a `Layer` method.
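The distinction can be illustrated with toy classes (hypothetical stand-ins, not Lasagne's real API): `get_output_for` applies a single layer's transformation to whatever you hand it, while a `get_output`-style helper first walks back to the input layer and propagates through the whole chain.

```python
class InputLayer:
    input_layer = None
    def get_output_for(self, x):
        return x

class DoubleLayer:  # stands in for any layer that transforms its input
    def __init__(self, incoming):
        self.input_layer = incoming
    def get_output_for(self, x):
        # transforms ONLY this layer's input, whatever it is handed
        return x * 2

def get_output(layer, x):
    # walk back to the InputLayer, then apply each layer in order
    if layer.input_layer is None:
        return layer.get_output_for(x)
    return layer.get_output_for(get_output(layer.input_layer, x))

net = DoubleLayer(DoubleLayer(DoubleLayer(InputLayer())))
print(net.get_output_for(1))   # 2 -- only the last layer is applied
print(get_output(net, 1))      # 8 -- the full stack is applied
```

This is why calling `get_output_for` on the encoder fed the raw `(2900, 3, 60, 80)` input straight into the dense layer, which flattened it to 14400 features.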
In future, please post usage questions on our mailing list instead. This issue tracker is meant for bug reports and feature discussions only. Thank you!