VlSomers / bpbreid

A strong baseline for body part-based person re-identification (check out our WACV23 paper)

Question about reproducing results with ResNet backbone

hh23333 opened this issue

Hello, Vladimir. Following your paper, I changed the backbone to ResNet-50 and the input size to 256×128. I repeated the experiment twice and attached the results below:
[screenshots: result tables from the two runs]

The rank-1 is similar, but the mAP is lower. Could you please tell me what other settings I need to change? Thank you for your help.
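
For reference, the relevant keys I changed in the YAML config (the values appear in the full configuration posted below) are:

data:
  height: 256    # input height
  width: 128     # input width
model:
  bpbreid:
    backbone: resnet50   # swapped in for the HRNet backbone used in the paper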

Hi @hh23333, can you share your full configuration?

Hi, Vladimir. Thank you for your prompt reply. My full configuration is as follows:

Diff from default config: {}

Full configuration:
adam:
  beta1: 0.9
  beta2: 0.999
cuhk03:
  classic_split: False
  labeled_images: False
  use_metric_cuhk03: False
data:
  cj:
    always_apply: False
    brightness: 0.2
    contrast: 0.15
    hue: 0.0
    p: 0.5
    saturation: 0.0
  combineall: False
  height: 256
  load_train_targets: False
  norm_mean: [0.485, 0.456, 0.406]
  norm_std: [0.229, 0.224, 0.225]
  ro:
    max_overlap: 0.8
    min_overlap: 0.5
    n: 1
    p: 0.5
    path: 
  root: /media/omnisky/Data/hh_datasets/lifelong
  save_dir: /media/omnisky/Data/hh_experiment/bpbreid_experiment/res50_256_0001_re2/812699217
  sources: ['occluded_duke']
  split_id: 0
  targets: ['occluded_duke']
  transforms: ['rc', 're']
  type: image
  width: 128
  workers: 4
inference:
  enabled: False
  input_folder: 
loss:
  name: part_based
  part_based:
    name: part_averaged_triplet_loss
    ppl: cl
    weights:
      conct:
        id: 1.0
        tr: 0.0
      foreg:
        id: 1.0
        tr: 0.0
      globl:
        id: 1.0
        tr: 0.0
      parts:
        id: 0.0
        tr: 1.0
      pixls:
        ce: 0.35
  softmax:
    label_smooth: True
  triplet:
    margin: 0.3
    weight_t: 1.0
    weight_x: 0.0
market1501:
  use_500k_distractors: False
model:
  bpbreid:
    backbone: resnet50
    dim_reduce: after_pooling
    dim_reduce_output: 512
    hrnet_pretrained_path: /media/omnisky/Data/hh_experiment/bpbreid_experiment/pretrain_weight
    last_stride: 1
    learnable_attention_enabled: True
    mask_filtering_testing: True
    mask_filtering_training: False
    masks:
      background_computation_strategy: threshold
      dir: pifpaf_maskrcnn_filtering
      mask_filtering_threshold: 0.5
      parts_names: ['head_mask', 'left_arm_mask', 'right_arm_mask', 'torso_mask', 'left_leg_mask', 'right_leg_mask', 'left_feet_mask', 'right_feet_mask']
      parts_num: 8
      preprocess: eight
      softmax_weight: 15
      type: disk
    normalization: identity
    pooling: gwap
    shared_parts_id_classifier: False
    test_embeddings: ['bn_foreg', 'parts']
    test_use_target_segmentation: none
    testing_binary_visibility_score: True
    training_binary_visibility_score: True
  load_config: False
  load_weights: 
  name: bpbreid
  pretrained: True
  resume: 
  save_model_flag: True
  vit:
    depth: base
    drop_path_ratio: 0.1
    drop_ratio: 0.0
    pretrain_path: 
    sie_xishu: 3.0
    size_train: (256, 128)
    stride_size: (16, 16)
    with_cam: False
project:
  config_file: bpbreid_occ_duke_train.yaml
  debug_mode: False
  diff_config: {}
  experiment_id: 8c35045d-44be-4143-907b-0e5d8684b52f
  experiment_name: 
  job_id: 812699217
  logger:
    matplotlib_show: False
    save_disk: True
    use_clearml: False
    use_neptune: False
    use_tensorboard: True
    use_wandb: False
  name: BPBreID
  notes: 
  start_time: 2023_04_26_10_13_11_13S
  tags: []
rmsprop:
  alpha: 0.99
sampler:
  num_instances: 4
  train_sampler: RandomIdentitySampler
  train_sampler_t: RandomIdentitySampler
sgd:
  dampening: 0.0
  momentum: 0.9
  nesterov: False
test:
  batch_size: 64
  batch_size_pairwise_dist_matrix: 500
  detailed_ranking: True
  dist_metric: euclidean
  evaluate: False
  normalize_feature: True
  part_based:
    dist_combine_strat: mean
  ranks: [1, 5, 10, 20]
  rerank: False
  save_features: False
  start_eval: 0
  vis_embedding_projection: False
  vis_feature_maps: False
  visrank: True
  visrank_count: 10
  visrank_per_body_part: False
  visrank_q_idx_list: [0, 1, 2, 3, 4, 5]
  visrank_topk: 10
train:
  base_lr_mult: 0.1
  batch_debug_freq: 0
  batch_log_freq: 0
  batch_size: 64
  eval_freq: 20
  fixbase_epoch: 0
  gamma: 0.1
  lr: 0.00035
  lr_scheduler: warmup_multi_step
  max_epoch: 120
  new_layers: ['classifier']
  open_layers: ['classifier']
  optim: adam
  seed: 1
  staged_lr: False
  start_epoch: 0
  stepsize: [40, 70]
  weight_decay: 0.0005
use_gpu: True
video:
  pooling_method: avg
  sample_method: evenly
  seq_len: 15

Collecting env info ...
** System info **
PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.27

Python version: 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.4.0-146-generic-x86_64-with-glibc2.27
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 2080 Ti
GPU 1: NVIDIA GeForce RTX 2080 Ti
GPU 2: NVIDIA GeForce RTX 2080 Ti
GPU 3: NVIDIA GeForce RTX 2080 Ti
GPU 4: NVIDIA GeForce RTX 2080 Ti
GPU 5: NVIDIA GeForce RTX 2080 Ti
GPU 6: NVIDIA GeForce RTX 2080 Ti
GPU 7: NVIDIA GeForce RTX 2080 Ti

Nvidia driver version: 470.182.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.2.1
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.2.1
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudnn.so.8
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudnn_adv_train.so.8
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8
/usr/local/cuda-11.1/targets/x86_64-linux/lib/libcudnn_ops_train.so.8
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.23.5
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[pip3] torchmetrics==0.10.3
[pip3] torchreid==1.2.3
[pip3] torchvision==0.13.1
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               10.2.89              hfd86e86_1  
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-fft                   1.3.1                    pypi_0    pypi
[conda] mkl-random                1.2.2                    pypi_0    pypi
[conda] mkl-service               2.4.0                    pypi_0    pypi
[conda] mkl_fft                   1.3.1           py310hd6ae3a3_0  
[conda] mkl_random                1.2.2           py310h00e6091_0  
[conda] numpy                     1.23.5                   pypi_0    pypi
[conda] numpy-base                1.23.4          py310h8e6c178_0  
[conda] pytorch                   1.12.1          py3.10_cuda10.2_cudnn7.6.5_0    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torch                     1.12.1                   pypi_0    pypi
[conda] torchaudio                0.12.1                   pypi_0    pypi
[conda] torchmetrics              0.10.3                   pypi_0    pypi
[conda] torchreid                 1.2.3                     dev_0    <develop>
[conda] torchvision               0.13.1                   pypi_0    pypi
        Pillow (9.2.0)

Building train transforms ...
+ resize to 256x128
+ random crop
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+ random erase
+ to torch tensor of range [0, 1]
Building test transforms ...
+ resize to 256x128
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+ masks preprocess = eight
+ use add background mask
=> Loading train (source) dataset
Creating new dataset occluded_duke and add it to the datasets cache.
=> Loaded OccludedDuke
  ----------------------------------------
  subset   | # ids | # images | # cameras
  ----------------------------------------
  train    |   702 |    15618 |         8
  query    |   519 |     2210 |         8
  gallery  |  1110 |    17661 |         8
  ----------------------------------------
=> Loading test (target) dataset
Using cached dataset occluded_duke.
Using cached dataset occluded_duke.


  **************** Summary ****************
  source            : ['occluded_duke']
  # source datasets : 1
  # source ids      : 702
  # source images   : 15618
  # source cameras  : 8
  target            : ['occluded_duke']
  *****************************************

Hi @hh23333, sorry for the delayed answer. I was looking into my old Wandb experiments to find which hyperparameters changed; here are some things you can try: change 'bn_foreg' to 'foreg', add random flip ('rf') to the transforms, use a bigger output feature size (dim_reduce_output: 1024), or disable the dim_reduce layer (dim_reduce: 'none'). Let me know if you manage to make this work!
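
In config terms, those suggestions would look roughly like this (a sketch, not an exact diff from my old runs):

data:
  transforms: ['rc', 'rf', 're']            # add random flip
model:
  bpbreid:
    dim_reduce: none                        # or keep 'after_pooling' and enlarge:
    dim_reduce_output: 1024
    test_embeddings: ['foreg', 'parts']     # 'foreg' instead of 'bn_foreg'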

Hi, Vladimir. Thank you for your reply. I will try to implement your suggestions and see if they can improve the situation. If I have any results, I will let you know.

Hi @VlSomers, I have conducted the experiments following your suggestions and obtained the following results (each experiment was run twice):
[screenshot: results table for the suggested variants]
It seems disabling the dim_reduce layer works well, but the mAP is still lower than the results in the paper. Have you rerun the experiments with the uploaded code? Is it possible that the difference in results is caused by different hardware?

Hi @hh23333, this is strange: you get a much better rank-1 than what is reported in the paper while still having 1% less mAP. The ResNet-50 experiments come from an older version of the code, before a big refactoring, so there might be some changes in the implementation. I looked at the config used at that time to see if there's any difference with yours, but it's hard to be sure because some parameters have new names. Have you tried "- dim_reduce + rf + foreg - bn_foreg" all at once? When you say '+ rf', did you use 'rf' alone or add it to the other transforms? You should use ["rc", "rf", "re"]. Can you also try "dist_metric: cosine" together with all the previous params?
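
Combined, the settings I am suggesting would look roughly like this in the config:

data:
  transforms: ['rc', 'rf', 're']
model:
  bpbreid:
    dim_reduce: none
    test_embeddings: ['foreg', 'parts']
test:
  dist_metric: cosine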

Hi @VlSomers, '+ rf' means ["rc", "rf", "re"]. The experiment results with the full config ("- dim_reduce + rf + foreg - bn_foreg + cosine") are:
[screenshot: results table for the combined config]

It seems that adding 'rf' slightly degrades the results, and using 'foreg' instead of 'bn_foreg' greatly reduces the accuracy.
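
For context: 'bn_foreg' is presumably the foreground embedding taken after a batch-norm bottleneck, while 'foreg' is the raw feature before it. Below is a minimal sketch of the generic re-id BNNeck pattern (an illustration under that assumption, not this repo's exact code); the BN-normalized feature is typically the one used for retrieval, which is consistent with 'bn_foreg' working better here:

import torch
import torch.nn as nn

class BNNeck(nn.Module):
    """Generic re-id BNNeck: the triplet loss is computed on the raw
    feature, while the ID loss and test-time retrieval use the
    batch-normalized feature."""
    def __init__(self, dim: int):
        super().__init__()
        self.bn = nn.BatchNorm1d(dim)
        self.bn.bias.requires_grad_(False)  # common trick: freeze the shift term

    def forward(self, foreg: torch.Tensor) -> torch.Tensor:
        bn_foreg = self.bn(foreg)  # feature typically used for retrieval
        return bn_foreg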

@hh23333 Sorry to interrupt, but I ran into an issue:
ValueError: Height and Width of image, mask or masks should be equal. You can disable shapes check by setting a parameter is_check_shapes=False of Compose class (do it only if you are sure about your data consistency).

The pretrained model I used is "bpbreid_occluded_duke_hrnet32_10670.pth". Actually, I don't know the size of the masks this project provides; I didn't generate the masks myself. I wonder if you have encountered this issue before.

@TInaWangxue, setting is_check_shapes=False works for me.
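
For anyone hitting the same error, a minimal sketch of that workaround, assuming the masks are loaded through an Albumentations pipeline (is_check_shapes is available in albumentations >= 1.3.0; the transform list below is only a placeholder example):

import albumentations as A

# Disable the strict image/mask shape check. Only do this if you are sure
# the masks are consistent with the images (e.g. both get resized together).
transform = A.Compose(
    [A.Resize(height=256, width=128)],
    is_check_shapes=False,
)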