[Testing Transfer Learning] I cannot reproduce results on Novel classes only
egmaminta opened this issue · comments
Good day! First, I'd like to say great work on this!
As I was trying to reproduce the results found here, I'd like to focus on COCO (Novel, 31.4) and LVIS (Novel, 22.0).
Shown below is the bash script I'm using to test your fine-tuned open-vocabulary detector on COCO.
python3 ./tools/train_net.py \
--eval-only \
--num-gpus 4 \
--config-file ./configs/COCO-InstanceSegmentation/CLIP_fast_rcnn_R_50_C4_ovd_testt.yaml \
MODEL.WEIGHTS ./pretrained_ckpt/regionclip/regionclip_finetuned-coco_rn50.pth \
MODEL.CLIP.OFFLINE_RPN_CONFIG ./configs/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_1x_ovd_FSD.yaml \
MODEL.CLIP.BB_RPN_WEIGHTS ./pretrained_ckpt/rpn/rpn_coco_48.pth \
MODEL.CLIP.TEXT_EMB_PATH ./pretrained_ckpt/concept_emb/coco_48_base_cls_emb.pth \
MODEL.CLIP.OPENSET_TEST_TEXT_EMB_PATH ./pretrained_ckpt/concept_emb/coco_17_target_cls_emb.pth \
MODEL.ROI_HEADS.SOFT_NMS_ENABLED True
After doing the inference, I get really, really low scores like ~0.0019 AP. May I respectfully ask if I missed anything?
Hoping for your kind response. Thank you.
All the best!
same problem
The following warning is printed while the weights are loading:
WARNING [08/31 16:16:06 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
backbone.attnpool.c_proj.{bias, weight}
backbone.attnpool.k_proj.{bias, weight}
backbone.attnpool.positional_embedding
backbone.attnpool.q_proj.{bias, weight}
backbone.attnpool.v_proj.{bias, weight}
backbone.bn1.{bias, weight}
backbone.bn2.{bias, weight}
backbone.bn3.{bias, weight}
backbone.conv1.weight
...
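The warning above lists parameters that the model expects but the checkpoint does not contain. A minimal sketch of how one might compare the two sets of parameter names is shown below. The function name `diff_state_dict_keys` and the toy key names are hypothetical; in practice the names would presumably come from `model.state_dict().keys()` and from the dictionary returned by `torch.load`.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Compare parameter names expected by a model with those in a checkpoint.

    Both arguments are iterables of strings. This is an illustrative helper,
    not part of RegionCLIP, Detectron2, or fvcore.
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_in_checkpoint": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_in_checkpoint": sorted(checkpoint_keys - model_keys),
    }


# Toy example with made-up key names standing in for real state dicts.
report = diff_state_dict_keys(
    model_keys=["backbone.conv1.weight", "backbone.bn1.weight", "head.fc.weight"],
    checkpoint_keys=["head.fc.weight", "extra.buffer"],
)
for kind, names in report.items():
    print(kind, names)
```

A long list under `missing_in_checkpoint` suggests the checkpoint and the config do not describe the same model, or that the key names carry a different prefix.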
This problem is caused by the PyTorch version. With PyTorch 2.0 I got near-zero metrics. I didn't look closely at what went wrong, but after going back to PyTorch 1.9 I got normal metrics.
Did you downgrade your PyTorch from 2.0 to 1.9? Wouldn't this conflict with Detectron2 (since the latest version requires CUDA 11.8)?
The CUDA version follows the PyTorch version.
Would it be possible to show here the steps on how you did it specifically? I tried to downgrade my PyTorch version but I would then encounter mismatch with Detectron2. Would gladly appreciate any help!
As mentioned by the author in install.md, after reinstalling PyTorch you need to remove the build directory under the project and reinstall detectron2.
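The cleanup step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `clean_build_artifacts` is hypothetical, and the sketch removes every `.so` file at any depth under the given directory, so it should only be pointed at the project root and not at a directory that also contains a virtual environment.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root):
    """Delete the build/ directory and all compiled *.so files under project_root.

    Returns the list of paths that were removed. Hypothetical helper, not part
    of RegionCLIP or Detectron2.
    """
    root = Path(project_root)
    removed = []

    build_dir = root / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        removed.append(build_dir)

    # rglob searches recursively; sorted() collects the matches before deleting.
    for so_file in sorted(root.rglob("*.so")):
        if so_file.is_file():
            so_file.unlink()
            removed.append(so_file)

    return removed


if __name__ == "__main__":
    for path in clean_build_artifacts("."):
        print("removed", path)
```

One difference from the shell command is worth noting: in bash without the `globstar` option, `**/*.so` only matches files exactly one directory level down, while `rglob` here descends to every depth. After the cleanup, detectron2 still has to be rebuilt as described in install.md.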
I did rebuild detectron2 after running rm -rf build/ **/*.so
However, I get this error:
RuntimeError:
The detected CUDA version (11.8) mismatches the version that was used to compile
PyTorch (11.1). Please make sure to use the same CUDA versions.
Apologies for any inconvenience. I'd appreciate any help!
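The error above reports two CUDA versions: the one detected on the machine and the one PyTorch was compiled with. A small diagnostic sketch for printing both is given below. It assumes `nvcc` is on the `PATH` (the build may instead use the toolkit pointed to by `CUDA_HOME`, which this sketch does not inspect), and the helper names are hypothetical. `torch.version.cuda` is the attribute PyTorch exposes for its build-time CUDA version.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract the 'major.minor' release from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def toolkit_cuda_version():
    """CUDA toolkit version reported by the nvcc found on PATH, or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """CUDA version PyTorch was built with; None if CPU-only or not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    print("CUDA toolkit (nvcc):", toolkit_cuda_version())
    print("PyTorch built with CUDA:", torch_cuda_version())
```

If the two printed versions differ, as with 11.8 and 11.1 in the error above, compiling detectron2's extensions may fail in the same way.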
@egmaminta @Hiram1026 Sorry for bothering you, but I downgraded PyTorch to 1.9 and rebuilt as instructed, and I still have the problem. Could you share your environment?
@Hiram1026 Thank you very much! I ran the COCO transfer learning example again, and it works for me :)
I also want to let you all know that my previous environment used PyTorch 1.13, which is not suitable. When I ran RegionCLIP in that environment, I got the same problem as @egmaminta.
Have you solved it? I just did: create a new conda environment, install torch as mentioned before, run rm -rf build/ **/*.so, and rebuild.
Thank you for your reply, but I am no longer working on it.