LiyuanLucasLiu / Transformer-Clinic

Understanding the Difficulty of Training Transformers

Home Page: https://arxiv.org/abs/2004.08249


`RuntimeError: expected scalar type Float but found Half` during the eval step

ruiningh opened this issue

I was running the given ADMIN script on the en-de dataset. It throws an error at the last step, which evaluates the model using the averaged checkpoint.

Traceback (most recent call last):
File "/home/.../bin/fairseq-generate", line 11, in <module>
load_entry_point('fairseq', 'console_scripts', 'fairseq-generate')()
File "/home/.../fairseq/fairseq_cli/generate.py", line 197, in cli_main
main(args)
File "/home/.../fairseq/fairseq_cli/generate.py", line 111, in main
hypos = task.inference_step(generator, models, sample, prefix_tokens)
File "/home/.../fairseq/fairseq/tasks/fairseq_task.py", line 277, in inference_step
return generator.generate(models, sample, prefix_tokens=prefix_tokens)
File "/home/.../lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "/home/.../fairseq/fairseq/sequence_generator.py", line 113, in generate
return self._generate(model, sample, **kwargs)
File "/home/.../lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "/home/.../fairseq/fairseq/sequence_generator.py", line 152, in _generate
encoder_outs = model.forward_encoder(encoder_input)
File "/home/.../lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "/home/.../fairseq/fairseq/sequence_generator.py", line 540, in forward_encoder
return [model.encoder(**encoder_input) for model in self.models]
File "/home/.../fairseq/fairseq/sequence_generator.py", line 540, in <listcomp>
return [model.encoder(**encoder_input) for model in self.models]
File "/home/.../lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/.../fairseq/fairseq/models/transformer.py", line 369, in forward
x = layer(x, encoder_padding_mask)
File "/home/.../lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/.../fairseq/fairseq/modules/transformer_layer.py", line 163, in forward
x, _ = self.self_attn(query=x, key=x, value=x, key_padding_mask=encoder_padding_mask)
File "/home/.../lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/.../fairseq/fairseq/modules/multihead_attention.py", line 141, in forward
q, k, v = self.in_proj_qkv(query)
File "/home/.../fairseq/fairseq/modules/multihead_attention.py", line 269, in in_proj_qkv
return self._in_proj(query).chunk(3, dim=-1)
File "/home/.../fairseq/fairseq/modules/multihead_attention.py", line 306, in _in_proj
return F.linear(input, weight, bias)
File "/home/.../lib/python3.7/site-packages/torch/nn/functional.py", line 1676, in linear
output = input.matmul(weight.t())
RuntimeError: expected scalar type Float but found Half

(Parts of the paths are replaced with ... for privacy; they are not useful for debugging anyway.)
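A common workaround for this kind of dtype mismatch (a sketch, not the thread's confirmed fix, under the assumption that the averaged checkpoint stores fp16 weights while generation feeds fp32 inputs) is to cast the model back to full precision before decoding. The toy `nn.Linear` below is only a stand-in for the actual fairseq encoder:

```python
import torch

# Assumption: the averaged ADMIN checkpoint holds fp16 weights while
# fairseq-generate (run without --fp16) feeds fp32 inputs, so F.linear
# sees mismatched dtypes. Casting the module to fp32 resolves that.
# A toy nn.Linear stands in for the fairseq encoder here.
model = torch.nn.Linear(4, 4).half()   # parameters are torch.float16
model = model.float()                  # cast parameters/buffers to torch.float32

x = torch.randn(2, 4)                  # fp32 input, as at eval time
y = model(x)                           # no "Float but found Half" error now
print(y.dtype)                         # torch.float32
```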

Thanks for the quick reply! I just tried removing `--fp16`; however, a new error came up. Any idea why this happened?

I'm using Python 3.7.6 and PyTorch 1.6.0+cu101. Thanks a lot!

Traceback (most recent call last):
File "/home/.../bin/fairseq-generate", line 11, in <module>
load_entry_point('fairseq', 'console_scripts', 'fairseq-generate')()
File "/home/.../fairseq/fairseq_cli/generate.py", line 197, in cli_main
main(args)
File "/home/.../fairseq/fairseq_cli/generate.py", line 111, in main
hypos = task.inference_step(generator, models, sample, prefix_tokens)
File "/home/.../fairseq/fairseq/tasks/fairseq_task.py", line 277, in inference_step
return generator.generate(models, sample, prefix_tokens=prefix_tokens)
File "/home/.../lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "/home/.../fairseq/fairseq/sequence_generator.py", line 113, in generate
return self._generate(model, sample, **kwargs)
File "/home/.../lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "/home/.../fairseq/fairseq/sequence_generator.py", line 378, in _generate
scores.view(bsz, beam_size, -1)[:, :, :step],
File "/home/.../fairseq/fairseq/search.py", line 81, in step
torch.div(self.indices_buf, vocab_size, out=self.beams_buf)
RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.

Are we supposed to do something like this to get around it? :-)
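For readers hitting the same error: the change being suggested is presumably to switch the flagged `torch.div` call in `fairseq/fairseq/search.py` to floor division, which PyTorch 1.6 requires for integer tensors. A minimal sketch of that change (the tensors below are made up for illustration):

```python
import torch

# PyTorch >= 1.6 rejects torch.div on integer tensors.
# In search.py the call recovers the beam index from a flattened index:
#     torch.div(self.indices_buf, vocab_size, out=self.beams_buf)
# The floor-division equivalent:
indices_buf = torch.tensor([0, 5, 11])  # made-up flattened indices
vocab_size = 4

beams_buf = torch.floor_divide(indices_buf, vocab_size)
print(beams_buf.tolist())  # [0, 1, 2]
```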

Emmm, I don't think I've met this error before, but the solution seems reasonable. Wondering whether it has fixed the issue for you.

That solved the problem. Thanks Liyuan!