Incompatible arrays dimension when using ELMo and input is of length 1 (only 1 word)
oterrier opened this issue · comments
In a sequence labelling scenario, when the input of Tagger.tag() is of length 1, for example the single-word input string "test", we end up with the following error:
File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/delft/sequenceLabelling/wrapper.py", line 245, in tag
annotations = tagger.tag(texts, output_format)
File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/delft/sequenceLabelling/tagger.py", line 60, in tag
preds = self.model.predict_on_batch(generator_output[0])
File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/keras/engine/training.py", line 1274, in predict_on_batch
outputs = self.predict_function(ins)
File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
run_metadata_ptr)
File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,1,1324] vs. shape[1] = [1,2,50]
[[{{node concatenate_1/concat}}]]
The word input shape [1,1,1324] looks fine:
1 because there is 1 word
1324 because 300 for the GloVe embeddings plus 1024 for ELMo
But the character input shape [1,2,50] is not aligned: its sequence dimension is 2 instead of 1.
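The mismatch can be reproduced outside TensorFlow with a minimal NumPy sketch (toy arrays, not DeLFT code): concatenation along the last axis requires every other axis to agree, and here the sequence axes are 1 vs 2.

```python
import numpy as np

# Toy reproduction of the shape mismatch: the word-level input has
# sequence length 1, but the character-level input was padded to length 2.
word_input = np.zeros((1, 1, 1324))  # batch=1, seq=1, 300 GloVe + 1024 ELMo
char_input = np.zeros((1, 2, 50))    # batch=1, seq=2, 50 char features

try:
    np.concatenate([word_input, char_input], axis=-1)
except ValueError as e:
    # all axes except the concatenation axis must match; axis 1 is 1 vs 2
    print(e)
```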
The reason is in the DataGenerator.__data_generation() method, near line 93:
# prevent sequence of length 1 alone in a batch (this causes an error in tf)
extend = False
if max_length_x == 1:
    max_length_x += 1
    extend = True
An input of length 1 is artificially extended to 2. But at line 108:
if self.embeddings.use_ELMo:
    #batch_x = to_vector_elmo(x_tokenized, self.embeddings, max_length_x)
    batch_x = to_vector_simple_with_elmo(x_tokenized, self.embeddings, max_length_x)
batch_x is initialized with the correct shape [1,2,1324], but after the call to to_vector_simple_with_elmo() it is back to [1,1,1324].
So maybe the to_vector_simple_with_elmo() method should also extend the vector from 1 to 2?
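To summarize, here is a toy model of how the two inputs drift apart (pad_chars and embed_words are illustrative names, not the real DeLFT functions): the character path honors max_length_x, while the ELMo path pads only to the longest actual sequence.

```python
import numpy as np

def pad_chars(sequences, max_length, char_dim=50):
    # character features are padded along the sequence axis up to max_length
    return np.zeros((len(sequences), max_length, char_dim))

def embed_words(sequences, max_length, dim=1324):
    # models the buggy behaviour: max_length is effectively ignored and
    # the longest actual sequence in the batch wins
    longest = max(len(seq) for seq in sequences)
    return np.zeros((len(sequences), longest, dim))

sequences = [["test"]]  # one batch entry containing one word
max_length_x = 1
if max_length_x == 1:   # the workaround quoted above
    max_length_x += 1

print(pad_chars(sequences, max_length_x).shape)    # (1, 2, 50)
print(embed_words(sequences, max_length_x).shape)  # (1, 1, 1324) -> mismatch
```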
I don't know what the best fix is:
- pass an additional extend=True parameter to the to_vector_simple_with_elmo() method?
- take the maxlength into account in to_vector_simple_with_elmo()?
I would be happy to contribute the fix in a PR if you tell me what the best way to fix it would be.
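The first option could look something like this (an illustrative sketch, not the real to_vector_simple_with_elmo signature or body): an explicit extend flag pads one extra zero timestep, mirroring the artificial bump of max_length_x from 1 to 2.

```python
import numpy as np

def to_vector_simple_with_elmo_sketch(sequences, dim=1324, extend=False):
    max_len = max(len(seq) for seq in sequences)
    if extend:
        max_len += 1  # keep the sequence axis aligned with the char input
    out = np.zeros((len(sequences), max_len, dim))
    # ... fill out[i, :len(seq)] with the concatenated GloVe + ELMo vectors ...
    return out

print(to_vector_simple_with_elmo_sketch([["test"]], extend=True).shape)  # (1, 2, 1324)
```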
Thank you @oterrier ! Indeed, when using ELMo, the batch with an input of length 1 is not extended as it should be as a workaround for the TF error.
I think that to fix this we would need the additional extend=True parameter on the to_vector_simple_with_elmo() method (as done with transform()) - so your first solution. The reason is that to_vector_simple_with_elmo() uses the min of the actual max length of the batch (in this case 1) and maxlength (in this case 2), so taking maxlength into account would not work in the general case, and we need an explicit parameter for the "artificial" extension to 2.
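In other words (a toy illustration of the clamping, not the actual code):

```python
# Why the second option fails if the function clamps with min(): the batch's
# actual max length (1) always wins over the extended maxlength (2).
actual_max_length = 1  # longest sequence in the batch
maxlength = 2          # max_length_x after the artificial extension
used_length = min(actual_max_length, maxlength)
print(used_length)  # 1, so the ELMo batch stays at shape [1, 1, 1324]
```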
The PR would be really great !! Thank you again.
Ok thanks for accepting this fix, I think you can close the issue now