yanpanlau / Keras-FlappyBird

Using Keras and Deep Q-Network to Play FlappyBird


CorrMM images and kernel must have the same stack size

marionleborgne opened this issue · comments

Hi there! Thanks for the great blog post. I ran your code with theano as the backend and I'm getting the stacktrace below. Any idea? It says it loaded the model but that there is an issue with the images and kernel needing to have the same stack size.

Also, are you running it with theano or tensorflow as the backend?

Thanks!

Marion

python qlearn.py -m "Run"
Recommended matplotlib backend is `Agg` for full skimage.viewer functionality.
libpng warning: iCCP: known incorrect sRGB profile   (repeated 15×)
Using Theano backend.
Now we build the model
We finish building the model
Now we load weight
Weight load successfully
Traceback (most recent call last):
  File "qlearn.py", line 198, in <module>
    main()
  File "qlearn.py", line 195, in main
    playGame(args)
  File "qlearn.py", line 189, in playGame
    trainNetwork(model,args)
  File "qlearn.py", line 107, in trainNetwork
    q = model.predict(s_t)       #input a stack of 4 images, get the prediction
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/models.py", line 671, in predict
    return self.model.predict(x, batch_size=batch_size, verbose=verbose)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/engine/training.py", line 1179, in predict
    batch_size=batch_size, verbose=verbose)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/engine/training.py", line 878, in _predict_loop
    batch_outs = f(ins_batch)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/backend/theano_backend.py", line 717, in __call__
    return self.function(*inputs)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/theano/compile/function_module.py", line 871, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/theano/gof/link.py", line 314, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/theano/compile/function_module.py", line 859, in __call__
    outputs = self.fn()
ValueError: CorrMM images and kernel must have the same stack size

Apply node that caused the error: CorrMM{half, (4, 4)}(InplaceDimShuffle{0,3,1,2}.0, Subtensor{::, ::, ::int64, ::int64}.0)
Toposort index: 22
Inputs types: [TensorType(float32, 4D), TensorType(float32, 4D)]
Inputs shapes: [(1, 80, 4, 80), (8, 8, 32, 4)]
Inputs strides: [(320, 4, 25600, 320), (4, 32, -1024, -256)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Subtensor{int64:int64:int8, int64:int64:int8, int64:int64:int8, int64:int64:int8}(CorrMM{half, (4, 4)}.0, ScalarFromTensor.0, ScalarFromTensor.0, Constant{1}, Constant{0}, Constant{32}, Constant{1}, ScalarFromTensor.0, ScalarFromTensor.0, Constant{1}, ScalarFromTensor.0, ScalarFromTensor.0, Constant{1})]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
  File "qlearn.py", line 44, in buildmodel
    model.add(Convolution2D(32, 8, 8, subsample=(4,4),init=lambda shape, name: normal(shape, scale=0.01, name=name), border_mode='same',input_shape=(img_channels,img_rows,img_cols)))
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/models.py", line 276, in add
    layer.create_input_layer(batch_input_shape, input_dtype)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/engine/topology.py", line 370, in create_input_layer
    self(x)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/engine/topology.py", line 514, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/engine/topology.py", line 572, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/engine/topology.py", line 149, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/layers/convolutional.py", line 466, in call
    filter_shape=self.W_shape)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/backend/theano_backend.py", line 1135, in conv2d
    filter_shape=filter_shape)

Follow-up: even with the backend set to tensorflow (CPU install), I get the same issue with the dimensions:

python qlearn.py -m "Run"
Recommended matplotlib backend is `Agg` for full skimage.viewer functionality.
libpng warning: iCCP: known incorrect sRGB profile   (repeated 15×)
Using TensorFlow backend.
Now we build the model
We finish building the model
Now we load weight
Traceback (most recent call last):
  File "qlearn.py", line 198, in <module>
    main()
  File "qlearn.py", line 195, in main
    playGame(args)
  File "qlearn.py", line 189, in playGame
    trainNetwork(model,args)
  File "qlearn.py", line 85, in trainNetwork
    model.load_weights("model.h5")
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/engine/topology.py", line 2500, in load_weights
    self.load_weights_from_hdf5_group(f)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/engine/topology.py", line 2585, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/keras/backend/tensorflow_backend.py", line 963, in batch_set_value
    assign_op = x.assign(assign_placeholder)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/tensorflow/python/ops/variables.py", line 505, in assign
    return state_ops.assign(self._variable, value, use_locking=use_locking)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/tensorflow/python/ops/gen_state_ops.py", line 45, in assign
    use_locking=use_locking, name=name)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/tensorflow/python/framework/op_def_library.py", line 749, in apply_op
    op_def=op_def)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/tensorflow/python/framework/ops.py", line 2382, in create_op
    set_shapes_for_outputs(ret)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/tensorflow/python/framework/ops.py", line 1783, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/Users/mleborgne/Library/Python/2.7/lib/python/site-packages/tensorflow/python/framework/common_shapes.py", line 596, in call_cpp_shape_fn
    raise ValueError(err.message)
ValueError: Dimension 0 in both shapes must be equal, but are 8 and 32

Ok so I found a way to fix this:

  1. Delete model.h5
  2. Run python qlearn.py -m "Train"
  3. Run python qlearn.py -m "Run". There's now a new model.h5 and it works.

However, I don't have access to a machine powerful enough to run step 2 for enough time steps, so my model.h5 is really poorly trained, even after letting it train for a couple of hours. Could you retrain the network with the code currently committed in your repo and then re-upload a fresh model.h5? I suspect that is the source of the issue.

Thanks!

Hi Marion, can I check which version of Keras you are using? The latest Keras has changed its default backend from Theano to TensorFlow, so the default dim_ordering is now TF instead of TH. When I wrote the blog post 3 months ago I was still using an older version of Keras, which defaulted to Theano.

You can try amending the CNN layers to default dim_ordering to TH and see if that works; a sketch follows below.
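For illustration only, a sketch of the ordering difference (the 80×80 frame size and 4 stacked channels are assumed from qlearn.py, not copied from it):

# 'th' (Theano) ordering:      (channels, rows, cols) -> (4, 80, 80)
# 'tf' (TensorFlow) ordering:  (rows, cols, channels) -> (80, 80, 4)
# Forcing 'th' makes the layer read input_shape the way the saved weights expect:
model.add(Convolution2D(32, 8, 8, subsample=(4, 4), border_mode='same',
                        dim_ordering='th',
                        input_shape=(img_channels, img_rows, img_cols)))  # (4, 80, 80)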

Cheers


Thanks @yanpanlau, adding dim_ordering="th" to the Convolution2D layers helped me.

@marionleborgne Nice to meet you here, I was following your projects with BCI and NuPIC. Wishing you good luck with Deep RL.

PS: guys, how do you find the Comma-env environment for NLP RL from Facebook?

@yanpanlau Hi 😄 I am using Keras v1.1.0. And you're right, adding dim_ordering='th' in the convolutional layers did the trick!

For others, if you are running Keras with a Theano backend, then edit qlearn.py and add dim_ordering='th' in the following lines:

model.add(Convolution2D(32, 8, 8, subsample=(4, 4), init=lambda shape, name: normal(shape, scale=0.01, name=name), dim_ordering='th', border_mode='same', input_shape=(img_channels, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(64, 4, 4, subsample=(2, 2), init=lambda shape, name: normal(shape, scale=0.01, name=name), dim_ordering='th', border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3, subsample=(1, 1), init=lambda shape, name: normal(shape, scale=0.01, name=name), dim_ordering='th', border_mode='same'))

@Timopheym nice to meet you ;)

Actually I take my comment in #3 (comment) back. Instead of editing @yanpanlau 's code, just edit the keras conf file.

If you happen to change the backend to theano, then just set image_dim_ordering to th.

  • ~/.keras/keras.json with theano backend:
{
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano",
    "image_dim_ordering": "th"
}
  • ~/.keras/keras.json with tensorflow backend:
{
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow",
    "image_dim_ordering": "tf"
}
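To double-check which settings are actually active, a quick sanity check (assuming the Keras 1.x backend API):

import keras.backend as K

print(K.backend())             # 'theano' or 'tensorflow'
print(K.image_dim_ordering())  # 'th' -> (channels, rows, cols); 'tf' -> (rows, cols, channels)

# The ordering can also be forced at runtime instead of editing ~/.keras/keras.json:
K.set_image_dim_ordering('th')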

@yanpanlau I have a PR (#4) with an edit to your README if you'd like to add this to the instructions.

from keras.utils.layer_utils import convert_all_kernels_in_model
import keras.backend as K

Check whether the image_dim_ordering and the backend are consistent; if they are not, you can run convert_all_kernels_in_model on the model to convert its convolution kernels.
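A minimal sketch of that idea (assuming Keras 1.x; model is the network from qlearn.py, and SAVED_WITH_THEANO is a hypothetical flag, since Keras cannot detect which backend produced model.h5):

import keras.backend as K
from keras.utils.layer_utils import convert_all_kernels_in_model

SAVED_WITH_THEANO = True  # assumption: the committed model.h5 came from a Theano setup

model.load_weights("model.h5")
if SAVED_WITH_THEANO and K.backend() == 'tensorflow':
    # flips every convolution kernel between TH and TF layouts, in place
    convert_all_kernels_in_model(model)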

Might be off topic, but I will post this in the thread. How long did you guys have to train before the bird started to show signs of knowing how to play? I trained the script overnight for over 100k frames (my laptop has a bad GPU), but every time the bird goes up to the top of the screen and dies hitting the first pipe, so it's not learning at all. It actually behaved better in the beginning, when it wasn't going up as much. I am just running @yanpanlau 's script directly (and added "image_dim_ordering": "tf" because I am using TensorFlow).

Also, how much speedup would you get with a GPU? On my laptop there is very little gain, if any (maybe because a minibatch size of 32 means there is little advantage to using a GPU)?

Hi, I have updated the code for Keras with the TensorFlow backend. I did the test; the agent should be able to learn after 400K frames. I also uploaded model.h5 for you in case you don't want to train again.

CODE:

from __future__ import print_function

import os
import sys
import timeit

import numpy

import theano
import theano.tensor as T
from theano.tensor.signal import pool
from theano.tensor.nnet import conv2d

from logistic_sgd import LogisticRegression, load_data
from mlp import HiddenLayer

class LeNetConvPoolLayer(object):
    """Pool Layer of a convolutional network """

    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        Allocate a LeNetConvPoolLayer with shared variable internal parameters.

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dtensor4
        :param input: symbolic image tensor, of shape image_shape

        :type filter_shape: tuple or list of length 4
        :param filter_shape: (number of filters, num input feature maps,
                              filter height, filter width)

        :type image_shape: tuple or list of length 4
        :param image_shape: (batch size, num input feature maps,
                             image height, image width)

        :type poolsize: tuple or list of length 2
        :param poolsize: the downsampling (pooling) factor (#rows, #cols)
        """

        assert image_shape[1] == filter_shape[1]
        self.input = input

        # there are "num input feature maps * filter height * filter width"
        # inputs to each hidden unit
        fan_in = numpy.prod(filter_shape[1:])
        # each unit in the lower layer receives a gradient from:
        # "num output feature maps * filter height * filter width" /
        #   pooling size
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) //
                   numpy.prod(poolsize))
        # initialize weights with random weights
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            numpy.asarray(
                rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                dtype=theano.config.floatX
            ),
            borrow=True
        )

        # the bias is a 1D tensor -- one bias per output feature map
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # convolve input feature maps with filters
        conv_out = conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            input_shape=image_shape
        )

        # pool each feature map individually, using maxpooling
        pooled_out = pool.pool_2d(
            input=conv_out,
            ds=poolsize,
            ignore_border=True
        )

        # add the bias term. Since the bias is a vector (1D array), we first
        # reshape it to a tensor of shape (1, n_filters, 1, 1). Each bias will
        # thus be broadcasted across mini-batches and feature map
        # width & height
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        # store parameters of this layer
        self.params = [self.W, self.b]

        # keep track of model input
        self.input = input

# Epoch size = 10
# batch_size = 10
def evaluate_lenet5(learning_rate=0.1, n_epochs=1,
                    dataset='mnist.pkl.gz',
                    nkerns=[20, 50], batch_size=1):
    """ Demonstrates lenet on MNIST dataset

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: path to the dataset used for training /testing (MNIST here)

    :type nkerns: list of ints
    :param nkerns: number of kernels on each layer
    """

    rng = numpy.random.RandomState(23455)

    datasets = load_data(dataset)

    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # compute number of minibatches for training, validation and testing
    n_train_batches = train_set_x.get_value(borrow=True).shape[0]
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
    n_test_batches = test_set_x.get_value(borrow=True).shape[0]
    n_train_batches //= batch_size
    n_valid_batches //= batch_size
    n_test_batches //= batch_size

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch

    # start-snippet-1
    x = T.matrix('x')   # the data is presented as rasterized images
    y = T.ivector('y')  # the labels are presented as 1D vector of
                        # [int] labels

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print('... building the model')

    # Reshape matrix of rasterized images of shape (batch_size, 28 * 28)
    # to a 4D tensor, compatible with our LeNetConvPoolLayer
    # (28, 28) is the size of MNIST images.
    layer0_input = x.reshape((batch_size, 1, 28, 28))

    # Construct the first convolutional pooling layer:
    # filtering reduces the image size to (28-5+1 , 28-5+1) = (24, 24)
    # maxpooling reduces this further to (24/2, 24/2) = (12, 12)
    # 4D output tensor is thus of shape (batch_size, nkerns[0], 12, 12)
    layer0 = LeNetConvPoolLayer(
        rng,
        input=layer0_input,
        image_shape=(batch_size, 1, 28, 28),
        filter_shape=(nkerns[0], 1, 5, 5),
        poolsize=(2, 2)
    )

    # Construct the second convolutional pooling layer
    # filtering reduces the image size to (12-5+1, 12-5+1) = (8, 8)
    # maxpooling reduces this further to (8/2, 8/2) = (4, 4)
    # 4D output tensor is thus of shape (batch_size, nkerns[1], 4, 4)
    layer1 = LeNetConvPoolLayer(
        rng,
        input=layer0.output,
        image_shape=(batch_size, nkerns[0], 12, 12),
        filter_shape=(nkerns[1], nkerns[0], 5, 5),
        poolsize=(2, 2)
    )

    # The third convolution layer
    layer2 = LeNetConvPoolLayer(
        rng,
        input=layer1.output,
        image_shape=(batch_size, nkerns[0], 4, 4),
        filter_shape=(nkerns[1], nkerns[0], 3, 3)
    )

    # the HiddenLayer being fully-connected, it operates on 2D matrices of
    # shape (batch_size, num_pixels) (i.e matrix of rasterized images).
    # This will generate a matrix of shape (batch_size, nkerns[1] * 4 * 4),
    # or (500, 50 * 4 * 4) = (500, 800) with the default values.
    layer3_input = layer2.output.flatten(2)

    # construct a fully-connected sigmoidal layer
    layer3 = HiddenLayer(
        rng,
        input=layer3_input,
        n_in=nkerns[1] * 4 * 4,
        n_out=500,
        activation=T.nnet.relu
        # activation=T.tanh
    )

    # classify the values of the fully-connected sigmoidal layer
    layer4 = LogisticRegression(input=layer3.output, n_in=500, n_out=10)

    # the cost we minimize during training is the NLL of the model
    cost = layer4.negative_log_likelihood(y)

    # create a function to compute the mistakes that are made by the model
    test_model = theano.function(
        [index],
        layer4.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    validate_model = theano.function(
        [index],
        layer4.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    # create a list of all model parameters to be fit by gradient descent
    params = layer4.params + layer3.params + layer2.params + layer1.params + layer0.params

    # create a list of gradients for all model parameters
    grads = T.grad(cost, params)

    # train_model is a function that updates the model parameters by
    # SGD Since this model has many parameters, it would be tedious to
    # manually create an update rule for each model parameter. We thus
    # create the updates list by automatically looping over all
    # (params[i], grads[i]) pairs.
    updates = [
        (param_i, param_i - learning_rate * grad_i)
        for param_i, grad_i in zip(params, grads)
    ]

    train_model = theano.function(
        [index],
        cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
    # end-snippet-1

    ###############
    # TRAIN MODEL #
    ###############
    print('... training')
    # early-stopping parameters
    patience = 10000  # look as this many examples regardless
    patience_increase = 2  # wait this much longer when a new best is
                           # found
    improvement_threshold = 0.995  # a relative improvement of this much is
                                   # considered significant
    validation_frequency = min(n_train_batches, patience // 2)
                                  # go through this many
                                  # minibatches before checking the network
                                  # on the validation set; in this case we
                                  # check every epoch

    best_validation_loss = numpy.inf
    best_iter = 0
    test_score = 0.
    start_time = timeit.default_timer()

    epoch = 0
    done_looping = False

    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in range(n_train_batches):

            iter = (epoch - 1) * n_train_batches + minibatch_index

            if iter % 100 == 0:
                print('training @ iter = ', iter)
            cost_ij = train_model(minibatch_index)

            if (iter + 1) % validation_frequency == 0:

                # compute zero-one loss on validation set
                validation_losses = [validate_model(i) for i
                                     in range(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)
                print('epoch %i, minibatch %i/%i, validation error %f %%' %
                      (epoch, minibatch_index + 1, n_train_batches,
                       this_validation_loss * 100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:

                    # improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss *  \
                       improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    # save best validation score and iteration number
                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = [
                        test_model(i)
                        for i in range(n_test_batches)
                    ]
                    test_score = numpy.mean(test_losses)
                    print(('     epoch %i, minibatch %i/%i, test error of '
                           'best model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:
                done_looping = True
                break

    end_time = timeit.default_timer()
    print('Optimization complete.')
    print('Best validation score of %f %% obtained at iteration %i, '
          'with test performance %f %%' %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print(('The code for file ' +
           os.path.dirname(os.path.realpath('__file__'))[1] +
           ' ran for %.2fm' % ((end_time - start_time) / 60.)), file=sys.stderr)

    '''print(('The code for file ' +
           os.path.split(__file__)[1] +
           ' ran for %.2fm' % ((end_time - start_time) / 60.)), file=sys.stderr)'''


if __name__ == '__main__':
    evaluate_lenet5()


def experiment(state, channel):
    evaluate_lenet5(state.learning_rate, dataset=state.dataset)


ERROR:

ValueError Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
902 outputs =
--> 903 self.fn() if output_subset is None else
904 self.fn(output_subset=output_subset)

ValueError: CorrMM images and kernel must have the same stack size

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in <module>()
357
358 if __name__ == '__main__':
--> 359 evaluate_lenet5()
360
361

in evaluate_lenet5(learning_rate, n_epochs, dataset, nkerns, batch_size)
304 if iter % 100 == 0:
305 print('training @ iter = ', iter)
--> 306 cost_ij = train_model(minibatch_index)
307
308 if (iter + 1) % validation_frequency == 0:

~/anaconda3/lib/python3.6/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
915 node=self.fn.nodes[self.fn.position_of_error],
916 thunk=thunk,
--> 917 storage_map=getattr(self.fn, 'storage_map', None))
918 else:
919 # old-style linkers raise their own exceptions

~/anaconda3/lib/python3.6/site-packages/theano/gof/link.py in raise_with_op(node, thunk, exc_info, storage_map)
323 # extra long error message in that case.
324 pass
--> 325 reraise(exc_type, exc_value, exc_trace)
326
327

~/anaconda3/lib/python3.6/site-packages/six.py in reraise(tp, value, tb)
690 value = tp()
691 if value.__traceback__ is not tb:
--> 692 raise value.with_traceback(tb)
693 raise value
694 finally:

~/anaconda3/lib/python3.6/site-packages/theano/compile/function_module.py in __call__(self, *args, **kwargs)
901 try:
902 outputs =
--> 903 self.fn() if output_subset is None else
904 self.fn(output_subset=output_subset)
905 except Exception:

ValueError: CorrMM images and kernel must have the same stack size

Apply node that caused the error: CorrMM{valid, (1, 1), (1, 1), 1 False}(Elemwise{Composite{tanh((i0 + i1))}}.0, Subtensor{::, ::, ::int64, ::int64}.0)
Toposort index: 48
Inputs types: [TensorType(float64, (True, False, False, False)), TensorType(float64, 4D)]
Inputs shapes: [(1, 50, 4, 4), (50, 20, 3, 3)]
Inputs strides: [(6400, 128, 32, 8), (1440, 72, -24, -8)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Pool{ignore_border=True, mode='max', ndim=2}(CorrMM{valid, (1, 1), (1, 1), 1 False}.0, TensorConstant{(2,) of 2}, TensorConstant{(2,) of 2}, TensorConstant{(2,) of 0}), MaxPoolGrad{ignore_border=True, mode='max', ndim=2}(CorrMM{valid, (1, 1), (1, 1), 1 False}.0, Pool{ignore_border=True, mode='max', ndim=2}.0, Elemwise{Composite{(i0 * (i1 - sqr(tanh(i2))))}}[(0, 0)].0, TensorConstant{(2,) of 2}, TensorConstant{(2,) of 2}, TensorConstant{(2,) of 0})]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "/home/eisti/anaconda3/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 208, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/home/eisti/anaconda3/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 537, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/home/eisti/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2728, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/home/eisti/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2850, in run_ast_nodes
if self.run_code(code, result):
File "/home/eisti/anaconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 359, in
evaluate_lenet5()
File "", line 200, in evaluate_lenet5
filter_shape=(nkerns[1], nkerns[0], 3, 3)
File "", line 92, in init
input_shape=image_shape

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

How do I solve this?
CorrMM images and kernel must have the same stack size
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
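For what it's worth, the traceback points at the cause: the failing CorrMM node gets input shape (1, 50, 4, 4) but kernel shape (50, 20, 3, 3), i.e. layer2 receives nkerns[1] = 50 feature maps from layer1 while its filter_shape=(nkerns[1], nkerns[0], 3, 3) expects nkerns[0] = 20 input channels. A sketch of a likely fix (not tested against the rest of the script):

# layer2's input comes from layer1, which outputs nkerns[1] feature maps of
# size 4x4, so both image_shape and the kernel's input-channel count must
# use nkerns[1], not nkerns[0]:
layer2 = LeNetConvPoolLayer(
    rng,
    input=layer1.output,
    image_shape=(batch_size, nkerns[1], 4, 4),
    filter_shape=(nkerns[1], nkerns[1], 3, 3)
)

# Note: a 4x4 map convolved 'valid' with a 3x3 kernel gives 2x2, and the
# default poolsize=(2, 2) reduces that to 1x1, so layer3 must then be built
# with n_in=nkerns[1] * 1 * 1 instead of nkerns[1] * 4 * 4.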


@marionleborgne nice to meet you ;). This issue helped me a lot with my Theano application; the same problem had troubled me for a long time.