microsoft / RegionCLIP

[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"

[Testing Transfer Learning] I cannot reproduce results on Novel classes only

egmaminta opened this issue

Good day! First, I'd like to say great work on this!

As I was trying to reproduce the results found here, I'd like to focus on COCO (Novel, 31.4) and LVIS (Novel, 22.0).

Shown below is the bash script I'm using to test your fine-tuned open-vocabulary detector on COCO.

python3 ./tools/train_net.py \
--eval-only  \
--num-gpus 4 \
--config-file ./configs/COCO-InstanceSegmentation/CLIP_fast_rcnn_R_50_C4_ovd_testt.yaml \
MODEL.WEIGHTS ./pretrained_ckpt/regionclip/regionclip_finetuned-coco_rn50.pth \
MODEL.CLIP.OFFLINE_RPN_CONFIG ./configs/COCO-InstanceSegmentation/mask_rcnn_R_50_C4_1x_ovd_FSD.yaml \
MODEL.CLIP.BB_RPN_WEIGHTS ./pretrained_ckpt/rpn/rpn_coco_48.pth \
MODEL.CLIP.TEXT_EMB_PATH ./pretrained_ckpt/concept_emb/coco_48_base_cls_emb.pth \
MODEL.CLIP.OPENSET_TEST_TEXT_EMB_PATH ./pretrained_ckpt/concept_emb/coco_17_target_cls_emb.pth \
MODEL.ROI_HEADS.SOFT_NMS_ENABLED True

After doing the inference, I get really, really low scores like ~0.0019 AP. May I respectfully ask if I missed anything?

Hoping for your kind response. Thank you.

All the best!

Here's a screenshot of my results, by the way...

[image: screenshot of evaluation results]

I tried running RN50 (Generalized) from test_transfer_learning.sh. Here are the results:

[image: screenshot of evaluation results]

Still really, really far from the results expected. Hoping you could share some guidance for this? Thank you.

same problem

The following warning is printed when the weights are loaded:

WARNING [08/31 16:16:06 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
backbone.attnpool.c_proj.{bias, weight}
backbone.attnpool.k_proj.{bias, weight}
backbone.attnpool.positional_embedding
backbone.attnpool.q_proj.{bias, weight}
backbone.attnpool.v_proj.{bias, weight}
backbone.bn1.{bias, weight}
backbone.bn2.{bias, weight}
backbone.bn3.{bias, weight}
backbone.conv1.weight
...
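
To see at a glance which parts of the model the checkpoint fails to cover, the missing keys can be grouped by prefix. The following is a minimal sketch, not part of RegionCLIP or fvcore: `missing_by_prefix` is a hypothetical helper, and it works on plain lists of key names so that it makes no assumption about how the checkpoint file is laid out.

```python
from collections import Counter


def missing_by_prefix(model_keys, checkpoint_keys, depth=2):
    """Count model keys that are absent from the checkpoint, grouped by prefix.

    `depth` is the number of dot-separated name components kept in the prefix,
    e.g. depth=2 turns "backbone.bn1.weight" into "backbone.bn1".
    """
    missing = set(model_keys) - set(checkpoint_keys)
    return Counter(".".join(key.split(".")[:depth]) for key in missing)


if __name__ == "__main__":
    # Hypothetical key names, loosely modeled on the warning above.
    model_keys = [
        "backbone.conv1.weight",
        "backbone.bn1.weight",
        "backbone.bn1.bias",
        "roi_heads.box_predictor.weight",
    ]
    checkpoint_keys = ["roi_heads.box_predictor.weight"]
    for prefix, count in sorted(missing_by_prefix(model_keys, checkpoint_keys).items()):
        print(prefix, count)
```

With PyTorch available, `model_keys` would typically come from `model.state_dict().keys()`. Where the checkpoint keeps its weights (for example under a `"model"` entry) is an assumption to verify against the actual file.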

This is caused by the PyTorch version. With PyTorch 2.0 I got near-zero metrics. I didn't look closely at what went wrong, but after going back to PyTorch 1.9 I got normal metrics.

Did you downgrade your PyTorch from 2.0 to 1.9? Wouldn't this conflict with Detectron2 (since the latest version requires CUDA 11.8)?

The CUDA version follows the PyTorch version.

Would it be possible to show here the steps on how you did it specifically? I tried to downgrade my PyTorch version but I would then encounter mismatch with Detectron2. Would gladly appreciate any help!

As mentioned by the author in install.md, after reinstalling PyTorch you need to remove the build directory under the project and reinstall detectron2.
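
The cleanup step described above (removing the `build` directory and any compiled extension files before reinstalling detectron2) can be sketched in Python as follows. This is an illustration under assumptions, not the procedure from install.md: `clean_build_artifacts` is a hypothetical helper, and it searches for `*.so` files recursively, which may remove more than a shell glob would.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_root):
    """Delete build/ and compiled *.so files under project_root.

    Returns the list of paths that were removed.
    """
    root = Path(project_root)
    removed = []

    build_dir = root / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        removed.append(build_dir)

    # Collect matches first so files are not deleted while the tree is being walked.
    for so_file in list(root.rglob("*.so")):
        so_file.unlink()
        removed.append(so_file)

    return removed
```

After the cleanup, detectron2 still has to be reinstalled from the project directory so that its extensions are compiled against the newly installed PyTorch.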

I did rebuild my detectron2 after using rm -rf build/ **/*.so... However, I get this error:

RuntimeError:
    The detected CUDA version (11.8) mismatches the version that was used to compile
    PyTorch (11.1). Please make sure to use the same CUDA versions.

Apologies for any inconvenience. I'd appreciate any help!
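
The error above reports two different CUDA versions: the one of the installed toolkit and the one PyTorch was compiled with. A minimal sketch of checking this before rebuilding is shown below. The two helpers are hypothetical and only compare version strings. The comparison on major.minor is a deliberately strict assumption and may be stricter than what the build actually requires.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`.

    Returns None if no 'release X.Y' fragment is found.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def cuda_versions_match(torch_cuda, toolkit_cuda):
    """True when both versions are known and share the same major.minor."""
    if not torch_cuda or not toolkit_cuda:
        return False
    return torch_cuda.split(".")[:2] == toolkit_cuda.split(".")[:2]


if __name__ == "__main__":
    # Sample text in the style of `nvcc --version`, using the versions from the error.
    sample = "Cuda compilation tools, release 11.8, V11.8.89"
    toolkit = parse_nvcc_release(sample)
    print(toolkit, cuda_versions_match("11.1", toolkit))
```

In a real environment, the first argument would be `torch.version.cuda` and the second would be parsed from the output of `nvcc --version`.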

The CUDA version is wrong. I suggest creating a conda env and installing PyTorch through pip, e.g.
[image]

OK! Will do ^^. Attempting to rebuild the whole project again.

@egmaminta @Hiram1026 Sorry for bothering you, but I downgraded PyTorch to 1.9 and rebuilt as instructed, and I still have the problem. Could you share your environment?

@Hiram1026 Thank you very much! I ran the COCO transfer learning example again, and it works for me :)

I also want to let you know that my previous environment used PyTorch 1.13, which is not suitable. When I ran RegionCLIP in that environment, I got the same problem as @egmaminta.

The following are my training curves now.
[image: training curves]

Have you solved it? I just did: create a new conda environment, install torch as mentioned before, run rm -rf build/ **/*.so, and rebuild.

Thank you for your reply, but I no longer work on it.