jiasenlu / NeuralBabyTalk

PyTorch code for our CVPR 2018 paper "Neural Baby Talk"

Home Page: https://arxiv.org/abs/1803.09845

Expected torch.cuda.FloatTensor but found torch.cuda.LongTensor

yuhuan-wu opened this issue

Hi!
I have a strange problem.
Evaluation on Flickr30k, using the command from the README, works well:

python main.py --path_opt cfgs/normal_flickr_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30 --inference_only True --beam_size 3 --start_from save/flickr30k_nbt_1024

However, when I run the training command:

python main.py --path_opt cfgs/normal_flickr_res101.yml --batch_size 20 --cuda True --num_workers 20 --max_epoch 30

I get the following output:

Tensorflow not installed; No tensorboard logging.
Namespace(att_feat_size=2048, att_hid_size=512, att_model='topdown', batch_size=20, beam_size=1, cached_tokens='flickr30k-train-idxs', cbs=False, cbs_mode='all', cbs_tag_size=3, checkpoint_path='save/normal_flickr_1024_adam', cider_df='corpus', cnn_backend='res101', cnn_learning_rate=1e-05, cnn_optim='adam', cnn_optim_alpha=0.8, cnn_optim_beta=0.999, cnn_weight_decay=0, cuda=True, data_path='data', dataset='flickr30k', decode_noc=False, det_oracle=False, disp_interval=100, drop_prob_lm=0.5, fc_feat_size=2048, finetune_cnn=False, fixed_block=1, grad_clip=0.1, id='', image_crop_size=512, image_path='data/flickr30k/flickr30k_images', image_size=576, inference_only=False, input_dic='data/flickr30k/dic_flickr30k.json', input_encoding_size=512, input_json='data/flickr30k/cap_flickr30k.json', language_eval=1, learning_rate=0.0005, learning_rate_decay_every=3, learning_rate_decay_rate=0.8, learning_rate_decay_start=1, load_best_score=1, losses_log_every=10, mGPUs=False, max_epochs=30, num_layers=1, num_workers=20, optim='adam', optim_alpha=0.9, optim_beta=0.999, optim_epsilon=1e-08, path_opt='cfgs/normal_flickr_res101.yml', proposal_h5='data/flickr30k/flickr30k_detection.h5', rnn_size=1024, rnn_type='lstm', scheduled_sampling_increase_every=5, scheduled_sampling_increase_prob=0.05, scheduled_sampling_max_prob=0.25, scheduled_sampling_start=-1, self_critical=False, seq_length=20, seq_per_img=5, start_from=None, val_every_epoch=3, val_images_use=-1, val_split='train', weight_decay=0)
DataLoader loading json file:  data/flickr30k/dic_flickr30k.json
vocab size is  8639
DataLoader loading json file:  data/flickr30k/cap_flickr30k.json
DataLoader loading proposal file:  data/flickr30k/flickr30k_detection.h5
assigned 29000 images to split train
DataLoader loading json file:  data/flickr30k/dic_flickr30k.json
vocab size is  8639
DataLoader loading json file:  data/flickr30k/cap_flickr30k.json
DataLoader loading proposal file:  data/flickr30k/flickr30k_detection.h5
assigned 29000 images to split train
Loading pretrained weights from data/imagenet_weights/resnet101.pth
Use adam as optmization method
/media/wyh/yuhuan/wyh/NeuralBabyTalk/misc/model.py:174: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  conv_feats, fc_feats = self.cnn(Variable(img.data, volatile=True))
Traceback (most recent call last):
  File "main.py", line 354, in <module>
    train(epoch, opt)
  File "main.py", line 73, in train
    lm_loss, bn_loss, fg_loss = model(input_imgs, input_seqs, gt_seqs, input_num, input_ppls, gt_bboxs, mask_bboxs, 'MLE')
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/wyh/yuhuan/wyh/NeuralBabyTalk/misc/model.py", line 126, in forward
    return self._forward(img, seq, ppls, gt_boxes, mask_boxes, num)
  File "/media/wyh/yuhuan/wyh/NeuralBabyTalk/misc/model.py", line 276, in _forward
    lm_loss = self.critLM(decoded, vis_prob, seq_update[:,1:seq_cnt+1, 0].clone())
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/wyh/yuhuan/wyh/NeuralBabyTalk/misc/utils.py", line 231, in forward
    loss = (torch.sum(txt_out)+torch.sum(vis_out)) / (torch.sum(txt_mask.data) + torch.sum(vis_mask.data))
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #2 'other'
Exception NameError: "global name 'FileNotFoundError' is not defined" in <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7f3f58b39d50>> ignored

The error is:

RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #2 'other'

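The summed masks in the denominator appear to be a LongTensor, while the numerator is a FloatTensor. You can try casting the denominator to float to solve the issue:

loss = (torch.sum(txt_out)+torch.sum(vis_out)) / (torch.sum(txt_mask.data) + torch.sum(vis_mask.data)).float()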
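To illustrate the cast, here is a minimal standalone sketch rather than the repository's actual code. The function name masked_mean_loss and the toy tensor values are made up for the example; only the expression inside it mirrors the line from misc/utils.py shown in the traceback. It assumes the masks hold integer values, so their sum is a LongTensor that should be converted before the division.

```python
import torch


def masked_mean_loss(txt_out, vis_out, txt_mask, vis_mask):
    """Hypothetical helper: average the summed losses over unmasked positions."""
    # Numerator: sum of the per-token losses (float tensors).
    total = torch.sum(txt_out) + torch.sum(vis_out)
    # Denominator: number of unmasked positions. Summing integer masks
    # gives a LongTensor, so cast it to float before dividing.
    count = (torch.sum(txt_mask) + torch.sum(vis_mask)).float()
    return total / count


# Toy inputs (invented for illustration only).
txt_out = torch.tensor([0.5, 1.5, 0.0])
vis_out = torch.tensor([1.0, 0.0, 0.0])
txt_mask = torch.tensor([1, 1, 0])  # integer mask -> int64 (Long)
vis_mask = torch.tensor([1, 0, 0])

loss = masked_mean_loss(txt_out, vis_out, txt_mask, vis_mask)
print(loss.dtype, loss.item())
```

Newer PyTorch releases may promote mixed integer and float operands automatically, in which case the explicit .float() is harmless but no longer strictly required; on the version in the traceback above it seems to be needed.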

Where did you find the file normal_flickr_res101.yml?

I can't find the file normal_flickr_res101.yml. Where is it?

Where did you find the file normal_flickr_res101.yml?

How did you solve it?