clovaai / BESTIE

[CVPR 2022] Beyond Semantic to Instance Segmentation: Weakly-Supervised Instance Segmentation via Semantic Knowledge Transfer and Self-Refinement

Repository from Github https://github.com/clovaai/BESTIE

Questions about using the pretrained model (hrnet48)

peter0617ku opened this issue

Dear Authors,

I am trying to use a pre-trained model to generate instance segmentations, but I have run into some problems.
I downloaded the weights (BESTIE_HRNet48_image_label.pt) and placed them in models/imagenet.
I also modified the parameters and paths in the shell script (scripts/run_image_labels.sh) and the HRNet config (models/hrnet_config/w48_384x288_adam_lr1e-3.yaml) as follows.

scripts/run_image_labels.sh:

# Training BESTIE with image-level labels.

ROOT=<My BESTIE Root>/BESTIE/data_root/VOC2012
SUP=cls
PSEUDO_THRESH=0.7
REFINE_THRESH=0.3
REFINE_WARMUP=0
SIZE=416
BATCH=8
WORKERS=0
TRAIN_ITERS=50000
BACKBONE=hrnet48 # [resnet50, resnet101, hrnet32, hrnet48]
VAL_IGNORE=False

CUDA_VISIBLE_DEVICES=0 torchrun --standalone --nnodes=1 --nproc_per_node=1 main.py \
--root_dir ${ROOT} --sup ${SUP} --batch_size ${BATCH} --num_workers ${WORKERS} --crop_size ${SIZE} --train_iter ${TRAIN_ITERS} \
--refine True --refine_iter ${REFINE_WARMUP} --pseudo_thresh ${PSEUDO_THRESH} --refine_thresh ${REFINE_THRESH} \
--val_freq 1000 --val_ignore ${VAL_IGNORE} --val_clean False --val_flip False \
--seg_weight 1.0 --center_weight 200.0 --offset_weight 0.01 \
--lr 5e-5 --backbone ${BACKBONE} --random_seed 3407

w48_384x288_adam_lr1e-3.yaml:

AUTO_RESUME: true
CUDNN:
  BENCHMARK: true
  DETERMINISTIC: false
  ENABLED: true
DATA_DIR: ''
GPUS: (0,1,2,3)
OUTPUT_DIR: 'output'
LOG_DIR: 'log'
WORKERS: 24
PRINT_FREQ: 100

DATASET:
  COLOR_RGB: true
  DATASET: 'voc'
  DATA_FORMAT: jpg
  FLIP: true
  NUM_JOINTS_HALF_BODY: 8
  PROB_HALF_BODY: 0.3
  ROOT: 'data/coco/'
  ROT_FACTOR: 45
  SCALE_FACTOR: 0.35
  TEST_SET: 'val2017'
  TRAIN_SET: 'train2017'
MODEL:
  INIT_WEIGHTS: true
  NAME: pose_hrnet
  NUM_JOINTS: 1
  PRETRAINED: 'models/imagenet/BESTIE_HRNet48_image_label.pt'
  TARGET_TYPE: gaussian
  IMAGE_SIZE:
  - 288
  - 384
  HEATMAP_SIZE:
  - 72
  - 96
  SIGMA: 3
  EXTRA:
    PRETRAINED_LAYERS:
    - 'conv1'
    - 'bn1'
    - 'conv2'
    - 'bn2'
    - 'layer1'
    - 'transition1'
    - 'stage2'
    - 'transition2'
    - 'stage3'
    - 'transition3'
    - 'stage4'
    FINAL_CONV_KERNEL: 1
    STAGE2:
      NUM_MODULES: 1
      NUM_BRANCHES: 2
      BLOCK: BASIC
      NUM_BLOCKS:
      - 4
      - 4
      NUM_CHANNELS:
      - 48
      - 96
      FUSE_METHOD: SUM
    STAGE3:
      NUM_MODULES: 4
      NUM_BRANCHES: 3
      BLOCK: BASIC
      NUM_BLOCKS:
      - 4
      - 4
      - 4
      NUM_CHANNELS:
      - 48
      - 96
      - 192
      FUSE_METHOD: SUM
    STAGE4:
      NUM_MODULES: 3
      NUM_BRANCHES: 4
      BLOCK: BASIC
      NUM_BLOCKS:
      - 4
      - 4
      - 4
      - 4
      NUM_CHANNELS:
      - 48
      - 96
      - 192
      - 384
      FUSE_METHOD: SUM
LOSS:
  USE_TARGET_WEIGHT: true
TRAIN:
  BATCH_SIZE_PER_GPU: 24
  SHUFFLE: true
  BEGIN_EPOCH: 0
  END_EPOCH: 210
  OPTIMIZER: adam
  LR: 0.001
  LR_FACTOR: 0.1
  LR_STEP:
  - 170
  - 200
  WD: 0.0001
  GAMMA1: 0.99
  GAMMA2: 0.0
  MOMENTUM: 0.9
  NESTEROV: false
TEST:
  BATCH_SIZE_PER_GPU: 24
  COCO_BBOX_FILE: 'data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json'
  BBOX_THRE: 1.0
  IMAGE_THRE: 0.0
  IN_VIS_THRE: 0.2
  MODEL_FILE: ''
  NMS_THRE: 1.0
  OKS_THRE: 0.9
  USE_GT_BBOX: true
  FLIP_TEST: true
  POST_PROCESS: true
  SHIFT_HEATMAP: true
DEBUG:
  DEBUG: true
  SAVE_BATCH_IMAGES_GT: true
  SAVE_BATCH_IMAGES_PRED: true
  SAVE_HEATMAPS_GT: true
  SAVE_HEATMAPS_PRED: true
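As a quick sanity check, the paths referenced above (the data root from the shell script and the checkpoint from MODEL.PRETRAINED) can be verified with a short snippet like the one below. This is only a sketch: it assumes it is run from the BESTIE repository root, and the placeholder root directory has to be replaced with the real path.

import os
import torch

# Paths as configured above; replace the placeholder with the actual BESTIE root.
root_dir = "<My BESTIE Root>/BESTIE/data_root/VOC2012"
ckpt_path = "models/imagenet/BESTIE_HRNet48_image_label.pt"

print("data root exists: ", os.path.isdir(root_dir))
print("checkpoint exists:", os.path.isfile(ckpt_path))

# Loading onto the CPU is enough to confirm the file is a readable torch checkpoint.
checkpoint = torch.load(ckpt_path, map_location="cpu")
print("checkpoint type:  ", type(checkpoint))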

However, the validation result is very poor: the mAP is 0.0, as the log below shows.
I would like to know which step went wrong.
Thanks in advance!

$ ./scripts/run_image_labels.sh 
=> loading pretrained model models/imagenet/BESTIE_HRNet48_image_label.pt
number of train set = 10582 | valid set = 1449
...Preparing GT dataset for evaluation
...Training Start 

Namespace(backbone='hrnet48', batch_size=8, beta=3.0, bn_momentum=0.01, center_weight=200.0, crop_size=416, cur_iter=0, dataset='voc', gamma=0.9, gpu=0, kernel=41, local_rank=0, lr=5e-05, num_classes=20, num_workers=0, offset_weight=0.01, print_freq=200, pseudo_thresh=0.7, random_seed=3407, refine=True, refine_iter=0, refine_thresh=0.3, resume=None, root_dir='/home/peterku2/Documents/cubemap/BESTIE/data_root/VOC2012', save_folder='checkpoints/test1', save_freq=10000, seg_weight=1.0, sigma=6, sup='cls', train_epoch=0, train_iter=50000, val_clean=False, val_flip=False, val_freq=1000, val_ignore=False, val_kernel=41, val_thresh=0.1, warm_iter=2000, weight_decay=0, world_size=1)
100%|████████████████████████████████████████████████████████████████| 1449/1449 [01:01<00:00, 23.47it/s]
{'ap': array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0.]), 'map': 0.0}

My computer configuration:
Ubuntu 18.04
NVIDIA GeForce RTX 3080
CUDA Version 11.6
RAM 64 GB

Hi all,

I found the problem causing the low mAP: the weight-loading code has a bug.

Below is my modification.

models/hrnet.py (Line 524):

        if os.path.isfile(pretrained):
            pretrained_state_dict = torch.load(pretrained)
            # torch.load_state_dict(pretrained_state_dict['model'])
            logger.info('=> loading pretrained model {}'.format(pretrained))
            print('=> loading pretrained model {}'.format(pretrained))

            # Keep every entry of the checkpoint; the weights themselves are
            # nested under the 'model' key and are loaded directly below.
            need_init_state_dict = {}
            for name, m in pretrained_state_dict.items():
                need_init_state_dict[name] = m
                # if name.split('.')[0] in self.pretrained_layers \
                #    or self.pretrained_layers[0] is '*':
                #     print(name)
                #     need_init_state_dict[name] = m
            # strict=True surfaces any key mismatch instead of silently skipping it.
            self.load_state_dict(need_init_state_dict['model'], strict=True)
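
For reference, the mismatch can be confirmed by inspecting the checkpoint directly. Below is a minimal sketch (run from the repository root) that prints the top-level keys and a few parameter names; if the weights are nested under a 'model' key, the per-layer filtering on PRETRAINED_LAYERS in the original code never sees them.

import torch

# Inspect the downloaded BESTIE checkpoint (the path set in MODEL.PRETRAINED).
ckpt = torch.load("models/imagenet/BESTIE_HRNet48_image_label.pt", map_location="cpu")

# Top-level keys of the saved dict; the weights themselves are expected under 'model'.
print(list(ckpt.keys()))

# A few parameter names inside the nested state dict.
if isinstance(ckpt, dict) and "model" in ckpt:
    print(list(ckpt["model"].keys())[:10])

Loading ckpt['model'] with strict=True, as in the modification above, also makes any remaining key mismatch fail loudly instead of being skipped silently.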

Please let me know if this modification causes problems.
Thanks!