IndexError: too many indices for tensor of dimension 1
mfelice opened this issue
Mariano Felice commented
Hi there,
I'm using the PyTorch implementation with bert-base-uncased
and I get the following error when the sentence contains only one token:
Traceback (most recent call last):
  File "bert.py", line 28, in <module>
    print(scorer.score_sentences(["Hello"]))
  File ".../mlm-scoring/src/mlm/scorers.py", line 167, in score_sentences
    return self.score(corpus, **kwargs)[0]
  File ".../mlm-scoring/src/mlm/scorers.py", line 757, in score
    out = out[list(range(split_size)), token_masked_ids]
IndexError: too many indices for tensor of dimension 1
It works fine with MXNet MLMs, but I need to use a community model from HuggingFace.
Thanks!
Mariano Felice commented
OK, I think I found the problem.
Line 727 of mlm-scoring/src/mlm/scorers.py (at commit 6727297) should be changed to:
out = torch.reshape(out[0], (out[0].shape[0], -1))
squeeze() was removing a dimension that should be preserved.
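For anyone hitting the same error, here is a minimal sketch of why squeeze() can break the indexing while the reshape does not. The tensor shape (split_size, seq_len, 1) and the variable names scores, rows, squeezed, reshaped and picked are assumptions made for illustration; they are not taken from scorers.py, only split_size and token_masked_ids appear in the traceback.

```python
import torch

# Hypothetical shapes (an assumption, not read from scorers.py):
# one masked copy of the sentence, three positions, a trailing size-1 dim.
split_size, seq_len = 1, 3
scores = torch.arange(split_size * seq_len, dtype=torch.float32)
scores = scores.reshape(split_size, seq_len, 1)

token_masked_ids = [1]           # position to pick in each row
rows = list(range(split_size))   # [0]

# squeeze() drops every size-1 dimension, including the batch dimension,
# so the result is 1-D and can no longer take two indices.
squeezed = scores.squeeze()
try:
    squeezed[rows, token_masked_ids]
    squeeze_error = None
except IndexError as exc:
    squeeze_error = str(exc)

# Reshaping to (batch, -1) keeps the batch dimension even when it is 1.
reshaped = torch.reshape(scores, (scores.shape[0], -1))
picked = reshaped[rows, token_masked_ids]

print(tuple(squeezed.shape), squeeze_error)
print(tuple(reshaped.shape), picked)
```

With a batch of two or more rows, squeeze() would leave the batch dimension alone, which would explain why the error only shows up for single-token input.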
Darren Abramson commented
Hurray for publicly licensed software and donation of labour to the public good!