yoonkim / CNN_sentence

CNNs for sentence classification

NotImplementedError: The image and the kernel must have the same type.inputs

andyyuan78 opened this issue · comments

ubgpu@ubgpu:~/github/CNN_sentence$ sudo python conv_net_sentence.py -nonstatic -word2vec
Using gpu device 0: GeForce GTX 970
loading data... data loaded!
model architecture: CNN-non-static
using: word2vec vectors
[('image shape', 64, 300), ('filter shape', [(100, 1, 3, 300), (100, 1, 4, 300), (100, 1, 5, 300)]), ('hidden_units', [100, 2]), ('dropout', [0.5]), ('batch_size', 50), ('non_static', True), ('learn_decay', 0.95), ('conv_non_linear', 'relu'), ('non_static', True), ('sqr_norm_lim', 9), ('shuffle_batch', True)]
Traceback (most recent call last):
File "conv_net_sentence.py", line 317, in <module>
dropout_rate=[0.5])
File "conv_net_sentence.py", line 88, in train_conv_net
filter_shape=filter_shape, poolsize=pool_size, non_linear=conv_non_linear)
File "conv_net_classes.py", line 390, in __init__
conv_out = conv.conv2d(input=input, filters=self.W,filter_shape=self.filter_shape, image_shape=self.image_shape)
File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 151, in conv2d
return op(input, filters)
File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 507, in __call__
node = self.make_node(*inputs, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 628, in make_node
"inputs(%s), kerns(%s)" % (_inputs.dtype, _kerns.dtype))
NotImplementedError: The image and the kernel must have the same type.inputs(float64), kerns(float32)
ubgpu@ubgpu:~/github/CNN_sentence$

Hi andyyuan78, I hit the same problem. Have you fixed it?

No response yet; I have not tried to fix it.

Try forcing the type of the word matrices created in process_data.py to be float32 ...

from

  • W = np.zeros(shape=(vocab_size+1, k))
  • W[0] = np.zeros(k)

to

  • W = np.zeros(shape=(vocab_size+1, k), dtype='float32')
  • W[0] = np.zeros(k, dtype='float32')
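
The reason the dtype matters is that np.zeros creates float64 arrays unless told otherwise, and values assigned into an array are cast to that array's dtype. A minimal sketch of the idea, assuming a hypothetical helper build_word_matrix that only mirrors the shape of get_W (the toy vectors are made up for illustration):

```python
import numpy as np

def build_word_matrix(word_vecs, k, dtype="float32"):
    """Hypothetical stand-in for get_W: row 0 is the all-zero padding vector."""
    W = np.zeros(shape=(len(word_vecs) + 1, k), dtype=dtype)
    word_idx_map = {}
    for i, word in enumerate(word_vecs, start=1):
        W[i] = word_vecs[word]  # values are cast to W's dtype on assignment
        word_idx_map[word] = i
    return W, word_idx_map

# Toy float64 vectors, invented for this example.
toy_vecs = {"good": np.ones(4), "bad": -np.ones(4)}

W_default = np.zeros(shape=(3, 4))           # no dtype given: NumPy default
W32, idx = build_word_matrix(toy_vecs, k=4)  # explicit float32

print(W_default.dtype, W32.dtype)
```

Whether this alone is enough depends on where else float64 values enter the Theano graph, which the replies below explore.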

Still not working:

ubgpu@ubgpu:~/github/CNN_sentence$ sudo python conv_net_sentence.py -nonstatic -word2vec
Using gpu device 0: GeForce GTX 970
loading data... data loaded!
model architecture: CNN-non-static
using: word2vec vectors
[('image shape', 64, 300), ('filter shape', [(100, 1, 3, 300), (100, 1, 4, 300), (100, 1, 5, 300)]), ('hidden_units', [100, 2]), ('dropout', [0.5]), ('batch_size', 50), ('non_static', True), ('learn_decay', 0.95), ('conv_non_linear', 'relu'), ('non_static', True), ('sqr_norm_lim', 9), ('shuffle_batch', True)]
Traceback (most recent call last):
File "conv_net_sentence.py", line 317, in <module>
dropout_rate=[0.5])
File "conv_net_sentence.py", line 88, in train_conv_net
filter_shape=filter_shape, poolsize=pool_size, non_linear=conv_non_linear)
File "conv_net_classes.py", line 390, in __init__
conv_out = conv.conv2d(input=input, filters=self.W,filter_shape=self.filter_shape, image_shape=self.image_shape)
File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 151, in conv2d
return op(input, filters)
File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 507, in __call__
node = self.make_node(*inputs, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 628, in make_node
"inputs(%s), kerns(%s)" % (_inputs.dtype, _kerns.dtype))
NotImplementedError: The image and the kernel must have the same type.inputs(float64), kerns(float32)
ubgpu@ubgpu:~/github/CNN_sentence$ git diff
diff --git a/process_data.py b/process_data.py
index b7b7851..fb5f31e 100644
--- a/process_data.py
+++ b/process_data.py
@@ -52,8 +52,9 @@ def get_W(word_vecs, k=300):
"""
vocab_size = len(word_vecs)
word_idx_map = dict()

-    W = np.zeros(shape=(vocab_size+1, k))
-    W[0] = np.zeros(k)
+    W = np.zeros(shape=(vocab_size+1, k), dtype='float32')
+    W[0] = np.zeros(k, dtype='float32')

i = 1
for word in word_vecs:
W[i] = word_vecs[word]
ubgpu@ubgpu:~/github/CNN_sentence$

Even when I change it to float64, it still does not work:

ubgpu@ubgpu:~/github/CNN_sentence$ sudo python conv_net_sentence.py -nonstatic -word2vec
Using gpu device 0: GeForce GTX 970
loading data... data loaded!
model architecture: CNN-non-static
using: word2vec vectors
[('image shape', 64, 300), ('filter shape', [(100, 1, 3, 300), (100, 1, 4, 300), (100, 1, 5, 300)]), ('hidden_units', [100, 2]), ('dropout', [0.5]), ('batch_size', 50), ('non_static', True), ('learn_decay', 0.95), ('conv_non_linear', 'relu'), ('non_static', True), ('sqr_norm_lim', 9), ('shuffle_batch', True)]
Traceback (most recent call last):
File "conv_net_sentence.py", line 317, in <module>
dropout_rate=[0.5])
File "conv_net_sentence.py", line 88, in train_conv_net
filter_shape=filter_shape, poolsize=pool_size, non_linear=conv_non_linear)
File "conv_net_classes.py", line 390, in __init__
conv_out = conv.conv2d(input=input, filters=self.W,filter_shape=self.filter_shape, image_shape=self.image_shape)
File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 151, in conv2d
return op(input, filters)
File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 507, in __call__
node = self.make_node(*inputs, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 628, in make_node
"inputs(%s), kerns(%s)" % (_inputs.dtype, _kerns.dtype))
NotImplementedError: The image and the kernel must have the same type.inputs(float64), kerns(float32)
ubgpu@ubgpu:~/github/CNN_sentence$ git diff
diff --git a/process_data.py b/process_data.py
index b7b7851..4bb6491 100644
--- a/process_data.py
+++ b/process_data.py
@@ -52,8 +52,9 @@ def get_W(word_vecs, k=300):
"""
vocab_size = len(word_vecs)
word_idx_map = dict()

-    W = np.zeros(shape=(vocab_size+1, k))
-    W[0] = np.zeros(k)
+    W = np.zeros(shape=(vocab_size+1, k), dtype='float64')
+    W[0] = np.zeros(k, dtype='float64')

i = 1
for word in word_vecs:
W[i] = word_vecs[word]
ubgpu@ubgpu:~/github/CNN_sentence$

When I include floatX=float32 in THEANO_FLAGS, I hit the same issue. Without specifying floatX, the code works on my MacBook Pro. However, Yoon's code doesn't print running durations. The CNN on GPU with cuDNN 3 does not seem faster than on CPU; I need to benchmark the difference later.
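
Since the script does not report durations, one way to compare CPU and GPU runs is to time the training loop by hand. A minimal sketch, assuming a hypothetical timed helper and a dummy workload standing in for the real epoch loop (neither is part of the repository):

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.time()
    result = fn(*args, **kwargs)
    elapsed = time.time() - start
    return result, elapsed

def dummy_epochs(n_epochs):
    """Placeholder workload; the real code would run the training epochs here."""
    total = 0
    for _ in range(n_epochs):
        total += sum(i * i for i in range(10000))
    return total

result, seconds = timed(dummy_epochs, 5)
print("Looping 5 times took %f seconds" % seconds)
```

Running the same measurement once with device=cpu and once with device=gpu gives directly comparable numbers.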

I'm having the same problem on many system configurations, and I have yet to get the code working on a GPU at all. Can someone who has a working setup post their configuration?

Changing floatX to float32 doesn't do anything. @leocnj, I believe the reason you did not see a speed improvement is that when you switch to float64 the error goes away, but the GPU is still not used. You can check GPU utilization with the nvidia-smi command.

I'm trying to run this on an Amazon g2 instance (GRID GPU).

To run this code on the GPU (float32), you need to modify the following.

process_data.py
line 55: W = np.zeros(shape=(vocab_size+1, k), dtype='float32')
line 56: W[0] = np.zeros(k, dtype='float32')

conv_net_sentence.py
line 82: set_zero = theano.function([zero_vec_tensor], updates=[(Words, T.set_subtensor(Words[0,:], zero_vec_tensor))], allow_input_downcast=True)
line 131: val_model = theano.function([index], classifier.errors(y),
    givens={
        x: val_set_x[index * batch_size: (index + 1) * batch_size],
        y: val_set_y[index * batch_size: (index + 1) * batch_size]}, allow_input_downcast=True)
line 137: test_model = theano.function([index], classifier.errors(y),
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size]}, allow_input_downcast=True)
line 141: train_model = theano.function([index], cost, updates=grad_updates,
    givens={
        x: train_set_x[index*batch_size:(index+1)*batch_size],
        y: train_set_y[index*batch_size:(index+1)*batch_size]}, allow_input_downcast=True)
line 155: test_model_all = theano.function([x,y], test_error, allow_input_downcast=True)
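
Every change above adds allow_input_downcast=True, which lets a compiled Theano function accept inputs of a wider type than its variable declares and cast them down silently. The sketch below is NumPy only, not Theano code; it mimics that cast with a hypothetical downcast_to_floatX helper to show what happens to the values:

```python
import numpy as np

FLOATX = "float32"  # assumed to match floatX=float32 in THEANO_FLAGS

def downcast_to_floatX(value, floatX=FLOATX):
    """Hypothetical helper: cast an array to floatX, as a silent downcast would."""
    return np.asarray(value, dtype=floatX)

zero_vec = np.zeros(300)               # float64 by default
casted = downcast_to_floatX(zero_vec)
print(zero_vec.dtype, "->", casted.dtype)

# Values that float32 cannot represent exactly lose precision in the cast.
x64 = np.array([0.1, 1.0 + 1e-10])
x32 = downcast_to_floatX(x64)
print(x32.astype("float64") - x64)     # per-element rounding error
```

The trade-off is convenience against a small loss of precision, which is usually acceptable for this kind of training.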

My results on GPU (THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python conv_net_sentence.py -static -rand):
epoch 1, train perf 60.150289 %, val perf 58.105263
epoch 2, train perf 72.936416 %, val perf 67.368421
epoch 3, train perf 75.213873 %, val perf 63.473684
epoch 4, train perf 87.803468 %, val perf 70.947368
epoch 5, train perf 93.248555 %, val perf 70.421053
Looping 5 times took 133.631452 seconds
For CPU (THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python conv_net_sentence.py -static -rand):
epoch 1, train perf 60.774566 %, val perf 58.842105
epoch 2, train perf 72.994220 %, val perf 67.263158
epoch 3, train perf 74.809249 %, val perf 62.947368
epoch 4, train perf 88.080925 %, val perf 69.473684
epoch 5, train perf 92.751445 %, val perf 69.894737
Looping 5 times took 690.696883 seconds
cv: 0, perf: 0.716417910448
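
Taken at face value, the two timings above put the GPU run at roughly five times faster over five epochs. The arithmetic, using only the numbers quoted in the logs:

```python
gpu_seconds = 133.631452  # "Looping 5 times took ..." from the GPU run
cpu_seconds = 690.696883  # the same line from the CPU run
n_epochs = 5

speedup = cpu_seconds / gpu_seconds
print("GPU speedup over CPU: %.2fx" % speedup)
print("Seconds per epoch: GPU %.1f, CPU %.1f"
      % (gpu_seconds / n_epochs, cpu_seconds / n_epochs))
```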

I modified the files the way @manuelvargas760 described, and it works!
Note: after modifying process_data.py, you should re-run process_data.py to regenerate the model parameters and word vectors.
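
To confirm that the regenerated data really holds float32 arrays, the dtypes can be inspected after unpickling. A minimal sketch, assuming the preprocessing output is a pickled list that contains NumPy arrays; the file name, the list layout, and the array_dtypes helper are all hypothetical:

```python
import os
import pickle
import tempfile

import numpy as np

def array_dtypes(path):
    """Return the dtype name of every NumPy array in a pickled list or tuple."""
    with open(path, "rb") as f:
        items = pickle.load(f)
    return [str(item.dtype) for item in items if isinstance(item, np.ndarray)]

# Demo on a throwaway file; the layout [matrix, dict] is an assumption.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.p")
with open(demo_path, "wb") as f:
    pickle.dump([np.zeros((3, 4), dtype="float32"), {"good": 1}], f)

print(array_dtypes(demo_path))
```

If any entry still reports float64, the pickle was produced before the dtype change and needs to be rebuilt.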

Feel free to submit a pull request if you've modified the code to get the GPU working, and I'll make sure to merge it :)