yajiemiao / pdnn

PDNN: A Python Toolkit for Deep Learning. http://www.cs.cmu.edu/~ymiao/pdnntk.html


Finetuning the model

hemmingstein opened this issue · comments

Hello again,
after fixing the learning rate problem, I'm struggling with the next one: I get to the "finetuning the model" step, and then this error appears:

"Traceback (most recent call last):
File "pdnn/cmds/run_CNN.py", line 93, in
train_error = train_sgd(train_fn, cfg)
File "pdnn/learning/sgd.py", line 72, in train_sgd
train_error.append(train_fn(index=batch_index, learning_rate = learning_rate, momentum = momentum))
File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/compile/function_module.py", line 606, in call
storage_map=self.fn.storage_map)
File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/compile/function_module.py", line 595, in call
outputs = self.fn()
ValueError: total size of new array must be unchanged
Apply node that caused the error: Reshape{4}(Subtensor{int64:int64:}.0, TensorConstant{[256 1 28 28]})
Inputs types: [TensorType(float64, matrix), TensorType(int64, vector)]
Inputs shapes: [(256, 40), (4,)]
Inputs strides: [(320, 8), (8,)]
Inputs values: ['not shown', array([256, 1, 28, 28])]"

I'm a bit puzzled by this; can you please help me?
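
The shapes in the traceback already hint at the problem: a (256, 40) batch contains 256 * 40 = 10,240 values, while the reshape target [256, 1, 28, 28] needs 256 * 784 = 200,704. A minimal NumPy sketch (an illustration of the same failure, not PDNN code):

import numpy as np

# A mini-batch shaped like the one in the traceback: 256 samples, 40 features each.
batch = np.zeros((256, 40))

# PDNN's CNN input layer reshapes batches to (batch, channels, height, width).
# The target needs 256 * 1 * 28 * 28 = 200704 elements, but the batch only
# holds 256 * 40 = 10240, so the reshape raises the same "total size" complaint.
batch.reshape((256, 1, 28, 28))  # ValueError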

Could you paste the full command line you ran?

Yeah, here it is (I added newlines for readability):

python pdnn/cmds/run_CNN.py \
    --train-data "train.pfile" \
    --valid-data "dev.pfile" \
    --conv-nnet-spec "1x28x28:20,5x5,p2x2:50,5x5,p2x2,f" \
    --nnet-spec "512:10" \
    --wdir ./ \
    --l2-reg 0.0001 \
    --lrate "C:0.125:20" \
    --model-save-step 20 \
    --param-output-file cnn.param \
    --cfg-output-file cnn.cfg
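
For reference, the input layer "1x28x28" in --conv-nnet-spec implies that every sample in train.pfile must carry 1 * 28 * 28 = 784 features. A hedged sanity check (the feats array and its loading are hypothetical, not PDNN's API):

import numpy as np

# Hypothetical feature matrix as it would come out of train.pfile:
# one row per sample; the shape matches the traceback above.
feats = np.zeros((256, 40))

channels, height, width = 1, 28, 28   # from --conv-nnet-spec "1x28x28"
expected = channels * height * width  # 784 features per sample

if feats.shape[1] != expected:
    raise ValueError("pfile provides %d features per sample, conv spec expects %d"
                     % (feats.shape[1], expected))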

I just tested the latest version on both GPUs and CPUs, and didn't see any similar problems.

For CNNs, PDNN requires that the batch size not change after the fine-tuning function is compiled. My reading of the error message is that the mini-batch size defaults to 256, but during execution the batch is interpreted as having a different size. Why that happens is beyond me, though. My guess is it's due to your compiler, for the same reason as in your last post.
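
To illustrate the fixed-after-compilation point (a minimal Theano sketch, not PDNN's actual code): once the reshape target is baked into the compiled graph as a constant, any batch whose total size differs triggers exactly this ValueError.

import numpy as np
import theano
import theano.tensor as T

x = T.matrix('x')
# The batch size and image dimensions are frozen as a constant when the
# function is compiled, just like the TensorConstant{[256 1 28 28]} above.
f = theano.function([x], x.reshape((256, 1, 28, 28)))

f(np.zeros((256, 784), dtype=theano.config.floatX))  # fine: 256*784 elements
f(np.zeros((256, 40), dtype=theano.config.floatX))   # ValueError, as in the traceback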

Thanks anyway!

I had an error like yours. I discovered that there were old nnet.tmp and training_state.tmp files sitting in the same directory, with different net dimensions; that was causing the error. Simply deleting those files did the trick!
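
To guard against this in future runs, a small hedged snippet (the filenames come from the comment above; deleting them forces PDNN to start fine-tuning from scratch rather than resuming):

import os

# Stale checkpoints from a previous run with different net dimensions.
# Removing them before retraining avoids the shape mismatch on resume.
for fname in ("nnet.tmp", "training_state.tmp"):
    if os.path.exists(fname):
        os.remove(fname)
        print("removed stale %s" % fname)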

Thanks, I'll try it.