hotpotqa / hotpot

Size mismatch from rnn

michael20at opened this issue · comments

Hi, after finally getting it to train, I got the following error when running python main.py --mode test --data_split dev --para_limit 2250 --batch_size 24 --init_lr 0.1 --keep_prob 1.0 --sp_lambda 1.0 --save HOTPOT-20190113-103231 --prediction_file dev_distractor_pred.json:

RuntimeError: Error(s) in loading state_dict for SPModel:
        size mismatch for rnn_start.rnns.0.weight_ih_l0: copying a param with shape torch.Size([240, 81]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_start.rnns.0.weight_ih_l0_reverse: copying a param with shape torch.Size([240, 81]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_end.rnns.0.weight_ih_l0: copying a param with shape torch.Size([240, 241]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_end.rnns.0.weight_ih_l0_reverse: copying a param with shape torch.Size([240, 241]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_type.rnns.0.weight_ih_l0: copying a param with shape torch.Size([240, 241]) from checkpoint, the shape in current model is torch.Size([240, 240]).
        size mismatch for rnn_type.rnns.0.weight_ih_l0_reverse: copying a param with shape torch.Size([240, 241]) from checkpoint, the shape in current model is torch.Size([240, 240]).
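For anyone debugging the same error: you can pinpoint which parameters disagree by comparing shapes between the checkpoint's state_dict and a freshly built model's state_dict before calling load_state_dict. A minimal sketch (the helper find_shape_mismatches and the dummy entries below are illustrative, not part of this repo; shapes are written as plain tuples so the example runs without a specific PyTorch version):

```python
def find_shape_mismatches(ckpt_shapes, model_shapes):
    """Return {param_name: (checkpoint_shape, model_shape)} for every
    parameter present in both dicts whose shapes disagree."""
    return {
        name: (ckpt_shapes[name], model_shapes[name])
        for name in ckpt_shapes
        if name in model_shapes and ckpt_shapes[name] != model_shapes[name]
    }

# Dummy entry mirroring the first mismatch in the traceback above.
# With real models you would build these dicts via
#   {k: tuple(v.shape) for k, v in model.state_dict().items()}
ckpt = {"rnn_start.rnns.0.weight_ih_l0": (240, 81)}
model = {"rnn_start.rnns.0.weight_ih_l0": (240, 240)}

print(find_shape_mismatches(ckpt, model))
# {'rnn_start.rnns.0.weight_ih_l0': ((240, 81), (240, 240))}
```

Here every mismatch is on a weight_ih (input-to-hidden) matrix, which suggests the input dimension fed to the RNNs differs between training and test, e.g. because of a different embedding setup or library version.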

Edit: Alright, training finished, but it still says episode 0. F1 is at 46.

Any idea why the shapes are different? All help is appreciated, thank you!

One possibility is that you're using a PyTorch version that we didn't test on. Are you using PyTorch 0.3.0?