kermitt2 / delft

a Deep Learning Framework for Text

Incompatible arrays dimension when using ELMo and input is of length 1 (only 1 word)

oterrier opened this issue · comments

In a sequence labelling scenario, when the input of Tagger.tag() is of length 1, for example the single-word input string "test", we end up with an error:

  File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/delft/sequenceLabelling/wrapper.py", line 245, in tag
    annotations = tagger.tag(texts, output_format)
  File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/delft/sequenceLabelling/tagger.py", line 60, in tag
    preds = self.model.predict_on_batch(generator_output[0])
  File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/keras/engine/training.py", line 1274, in predict_on_batch
    outputs = self.predict_function(ins)
  File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
    run_metadata_ptr)
  File "/home/olivier/.pyenv/versions/spacy/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [1,1,1324] vs. shape[1] = [1,2,50]
	 [[{{node concatenate_1/concat}}]]

The word input shape [1,1,1324] looks fine:
1 because there is 1 word
1324 because the GloVe embeddings contribute 300 dimensions and ELMo contributes 1024

But the character input shape [1,2,50] is not aligned with it.
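The mismatch can be illustrated outside of TensorFlow with a minimal NumPy sketch (the arrays here are placeholders, not the actual model tensors): the word branch and the character branch disagree on the sequence-length axis (1 vs 2), so they cannot be concatenated along the feature axis.

```python
import numpy as np

# batch=1, seq_len=1, 1324 = 300 GloVe + 1024 ELMo dims
word_features = np.zeros((1, 1, 1324))
# batch=1, seq_len=2 (artificially extended), 50 char dims
char_features = np.zeros((1, 2, 50))

# Concatenating on the last axis requires all other axes to match,
# which is exactly the ConcatOp error raised in the traceback above:
try:
    np.concatenate([word_features, char_features], axis=-1)
except ValueError as e:
    print("mismatch:", e)

# Once both branches share seq_len=2, the concatenation succeeds:
word_padded = np.zeros((1, 2, 1324))
merged = np.concatenate([word_padded, char_features], axis=-1)
print(merged.shape)  # (1, 2, 1374)
```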

The reason lies in the DataGenerator.__data_generation() method, near line 93:

        # prevent sequence of length 1 alone in a batch (this causes an error in tf)
        extend = False
        if max_length_x == 1:
            max_length_x += 1
            extend = True

An input of length 1 is thus artificially extended to 2.

But at line 108:

        if self.embeddings.use_ELMo:     
            #batch_x = to_vector_elmo(x_tokenized, self.embeddings, max_length_x)
            batch_x = to_vector_simple_with_elmo(x_tokenized, self.embeddings, max_length_x)

the batch_x is initialized with the correct shape [1,2,1324], but after the call to to_vector_simple_with_elmo() it is back to [1,1,1324].

So maybe the to_vector_simple_with_elmo() method should also extend the vector from 1 to 2?

I don't know what the best fix would be:

  • pass an additional extend=True parameter to the to_vector_simple_with_elmo() method ?
  • take maxlength into account in to_vector_simple_with_elmo() ?

I would be happy to contribute the fix in a PR if you tell me what the best way to fix it would be.

Thank you @oterrier! Indeed, when using ELMo, a batch with input of length 1 is not extended as it should be to work around the TF error.

I think that to fix this we would need the additional extend=True parameter to the to_vector_simple_with_elmo() method (as done with transform), so your first solution. The reason is that to_vector_simple_with_elmo uses the minimum of the actual max length of the batch (in this case 1) and maxlength (in this case 2), so relying on maxlength alone would not work in the general case, and we need an explicit parameter for the "artificial" extension to 2.
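For illustration, a rough sketch of what the extend parameter might look like (the function body, the embed_fn callback, and the padding approach here are all illustrative assumptions, not the actual DeLFT code):

```python
import numpy as np

def to_vector_simple_with_elmo_sketch(tokens_batch, embed_fn, maxlen, extend=False):
    """Illustrative sketch: embed each token sequence, pad to maxlen, and when
    extend=True append one extra zero timestep, mirroring the artificial
    extension already applied to the character input."""
    dim = 1324  # 300 GloVe + 1024 ELMo dims, as in the issue above
    out_len = maxlen + 1 if extend else maxlen
    batch = np.zeros((len(tokens_batch), out_len, dim))
    for i, tokens in enumerate(tokens_batch):
        vectors = embed_fn(tokens)            # assumed shape: (len(tokens), dim)
        batch[i, :len(tokens), :] = vectors   # zero-padding fills the rest
    return batch

# With a single-token input and extend=True, the word branch now has
# sequence length 2, matching the extended character branch:
fake_embed = lambda toks: np.ones((len(toks), 1324))
x = to_vector_simple_with_elmo_sketch([["test"]], fake_embed, maxlen=1, extend=True)
print(x.shape)  # (1, 2, 1324)
```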

The PR would be really great!! Thank you again.

OK, thanks for accepting this fix. I think you can close the issue now.