accuracy 0
mazatov opened this issue
Hey @VlSomers, it might be too late, but I decided to try my hand at this challenge as well while there is still some time. I'm just testing out your baseline code, and in the outputs the accuracy is always 0, which doesn't seem right haha. Do you have any idea what I might be doing wrong with the benchmark code?
ubuntu@ip-10-0-0-13:~/soccernet/sn-reid$ python benchmarks/baseline/main.py --config-file benchmarks/baseline/configs/baseline_config.yaml
Show configuration
adam:
beta1: 0.9
beta2: 0.999
cuhk03:
classic_split: False
labeled_images: False
data:
combineall: False
eval_metric: soccernetv3
height: 256
k_tfm: 1
load_train_targets: False
norm_mean: [0.485, 0.456, 0.406]
norm_std: [0.229, 0.224, 0.225]
root: datasets
save_dir: log
sources: ['soccernetv3']
split_id: 0
targets: ['soccernetv3', 'soccernetv3_test', 'soccernetv3_challenge']
transforms: ['random_flip']
type: image
width: 128
workers: 4
loss:
name: triplet
softmax:
label_smooth: True
triplet:
margin: 0.3
weight_t: 0.5
weight_x: 0.5
market1501:
use_500k_distractors: False
model:
load_weights:
name: resnet50_fc512
pretrained: True
resume:
rmsprop:
alpha: 0.99
sampler:
num_cams: 1
num_datasets: 1
num_instances: 4
train_sampler: RandomIdentitySampler
train_sampler_t: RandomIdentitySampler
sgd:
dampening: 0.0
momentum: 0.9
nesterov: False
soccernetv3:
training_subset: 0.1
test:
batch_size: 100
dist_metric: euclidean
eval_freq: -1
evaluate: False
export_ranking_results: True
normalize_feature: False
ranks: [1]
rerank: False
start_eval: 0
visrank: False
visrank_topk: 10
train:
base_lr_mult: 0.1
batch_size: 128
fixbase_epoch: 0
gamma: 0.1
lr: 0.0003
lr_scheduler: single_step
max_epoch: 40
new_layers: ['classifier']
open_layers: ['classifier']
optim: adam
print_freq: 1
seed: 1
staged_lr: False
start_epoch: 0
stepsize: [20]
weight_decay: 0.0005
use_gpu: True
video:
pooling_method: avg
sample_method: evenly
seq_len: 15
Collecting env info ...
** System info **
PyTorch version: 1.11.0
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Ubuntu 16.04.7 LTS (x86_64)
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
Clang version: Could not collect
CMake version: version 3.18.2
Libc version: glibc-2.10
Python version: 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:21) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-4.4.0-1128-aws-x86_64-with-debian-stretch-sid
Is CUDA available: True
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: Tesla T4
Nvidia driver version: 450.80.02
cuDNN version: Probably one of the following:
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7.6.5
/usr/local/cuda-10.2/targets/x86_64-linux/lib/libcudnn.so.7.6.5
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn.so.8.0.2
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.0.2
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.0.2
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.0.2
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.0.2
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.0.2
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.0.2
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.21.5
[pip3] torch==1.11.0
[pip3] torchreid==1.4.0
[pip3] torchvision==0.12.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 h8f6ccaa_10 conda-forge
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py37h402132d_0 conda-forge
[conda] mkl_fft 1.3.1 py37h3e078e5_1 conda-forge
[conda] mkl_random 1.2.2 py37h219a48f_0 conda-forge
[conda] numpy 1.21.5 pypi_0 pypi
[conda] numpy-base 1.21.2 py37h79a1101_0
[conda] pytorch 1.11.0 py3.7_cuda10.2_cudnn7.6.5_0 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchreid 1.4.0 dev_0 <develop>
[conda] torchvision 0.12.0 py37_cu102 pytorch
Pillow (9.0.1)
Building train transforms ...
+ resize to 256x128
+ random flip
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
Building test transforms ...
+ resize to 256x128
+ to torch tensor of range [0, 1]
+ normalization (mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
=> Loading train (source) dataset
SoccerNet valid set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/valid.
SoccerNet train set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/train.
=> Loaded Soccernetv3
----------------------------------------
subset | # ids | # images | # cameras
----------------------------------------
train | 15443 | 24872 | 919
query | 11638 | 11638 | 1751
gallery | 29534 | 34355 | 1751
----------------------------------------
=> Loading test (target) dataset
SoccerNet valid set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/valid.
SoccerNet train set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/train.
=> Loaded Soccernetv3
----------------------------------------
subset | # ids | # images | # cameras
----------------------------------------
train | 15443 | 24872 | 919
query | 11638 | 11638 | 1751
gallery | 29534 | 34355 | 1751
----------------------------------------
SoccerNet valid set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/valid.
SoccerNet train set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/train.
SoccerNet test set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/test.
=> Loaded Soccernetv3Test
----------------------------------------
subset | # ids | # images | # cameras
----------------------------------------
train | 0 | 0 | 0
query | 11777 | 11777 | 1715
gallery | 30059 | 34989 | 1715
----------------------------------------
SoccerNet test set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/test.
SoccerNet challenge set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/challenge.
=> Loaded Soccernetv3Challenge
----------------------------------------
subset | # ids | # images | # cameras
----------------------------------------
train | 0 | 0 | 0
query | 9021 | 9021 | 1310
gallery | 26082 | 26082 | 1310
----------------------------------------
SoccerNet challenge set was already downloaded and unzipped at /home/ubuntu/soccernet/sn-reid/datasets/soccernetv3/reid/challenge.
**************** Summary ****************
source : ['soccernetv3']
# source datasets : 1
# source ids : 15443
# source images : 24872
# source cameras : 919
target : ['soccernetv3', 'soccernetv3_test', 'soccernetv3_challenge']
*****************************************
Building model: resnet50_fc512
Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /home/ubuntu/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 97.8M/97.8M [00:00<00:00, 288MB/s]
Model complexity: params=24,558,144 flops=4,054,319,616
Building triplet-engine for image-reid
=> Start training
epoch: [1/40][1/482] time 8.080 (8.080) data 1.883 (1.883) eta 1 day, 19:16:17 loss_t 1.1639 (1.1639) loss_x 9.6623 (9.6623) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][2/482] time 0.970 (4.525) data 0.000 (0.942) eta 1 day, 0:13:55 loss_t 0.8328 (0.9983) loss_x 9.6807 (9.6715) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][3/482] time 0.961 (3.337) data 0.000 (0.628) eta 17:52:12 loss_t 0.8928 (0.9631) loss_x 9.6797 (9.6742) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][4/482] time 0.968 (2.745) data 0.000 (0.471) eta 14:41:49 loss_t 1.7378 (1.1568) loss_x 9.6939 (9.6792) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][5/482] time 0.979 (2.392) data 0.000 (0.377) eta 12:48:20 loss_t 1.1981 (1.1650) loss_x 9.6429 (9.6719) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][6/482] time 0.977 (2.156) data 0.000 (0.314) eta 11:32:35 loss_t 0.6856 (1.0851) loss_x 9.6849 (9.6741) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][7/482] time 0.969 (1.986) data 0.000 (0.269) eta 10:38:05 loss_t 1.2633 (1.1106) loss_x 9.6688 (9.6733) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][8/482] time 0.975 (1.860) data 0.000 (0.236) eta 9:57:27 loss_t 1.0468 (1.1026) loss_x 9.6788 (9.6740) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][9/482] time 0.974 (1.762) data 0.000 (0.210) eta 9:25:48 loss_t 1.2509 (1.1191) loss_x 9.6873 (9.6755) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][10/482] time 0.980 (1.683) data 0.000 (0.189) eta 9:00:39 loss_t 0.4579 (1.0530) loss_x 9.6805 (9.6760) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][11/482] time 0.968 (1.618) data 0.000 (0.172) eta 8:39:45 loss_t 0.9808 (1.0464) loss_x 9.6781 (9.6762) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][12/482] time 0.976 (1.565) data 0.000 (0.157) eta 8:22:32 loss_t 1.1617 (1.0560) loss_x 9.7125 (9.6792) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][13/482] time 0.973 (1.519) data 0.000 (0.145) eta 8:07:54 loss_t 0.9496 (1.0478) loss_x 9.6673 (9.6783) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][14/482] time 0.967 (1.480) data 0.001 (0.135) eta 7:55:12 loss_t 1.1300 (1.0537) loss_x 9.6748 (9.6780) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][15/482] time 0.978 (1.447) data 0.000 (0.126) eta 7:44:27 loss_t 1.3853 (1.0758) loss_x 9.6532 (9.6764) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][16/482] time 0.978 (1.417) data 0.000 (0.118) eta 7:35:02 loss_t 0.8940 (1.0644) loss_x 9.6877 (9.6771) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][17/482] time 0.966 (1.391) data 0.000 (0.111) eta 7:26:29 loss_t 1.0139 (1.0615) loss_x 9.6823 (9.6774) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][18/482] time 0.968 (1.367) data 0.000 (0.105) eta 7:18:55 loss_t 1.1600 (1.0669) loss_x 9.6901 (9.6781) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][19/482] time 0.979 (1.347) data 0.000 (0.099) eta 7:12:21 loss_t 0.9251 (1.0595) loss_x 9.6755 (9.6780) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][20/482] time 0.976 (1.328) data 0.000 (0.095) eta 7:06:22 loss_t 1.2995 (1.0715) loss_x 9.6888 (9.6785) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][21/482] time 0.981 (1.312) data 0.000 (0.090) eta 7:01:02 loss_t 0.9801 (1.0671) loss_x 9.6916 (9.6791) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][22/482] time 0.986 (1.297) data 0.000 (0.086) eta 6:56:15 loss_t 0.5528 (1.0437) loss_x 9.6771 (9.6790) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][23/482] time 0.985 (1.283) data 0.000 (0.082) eta 6:51:52 loss_t 0.9734 (1.0407) loss_x 9.6417 (9.6774) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][24/482] time 0.999 (1.271) data 0.000 (0.079) eta 6:48:03 loss_t 0.7406 (1.0282) loss_x 9.6794 (9.6775) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][25/482] time 0.991 (1.260) data 0.000 (0.076) eta 6:44:25 loss_t 1.0094 (1.0274) loss_x 9.6800 (9.6776) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][26/482] time 0.980 (1.249) data 0.000 (0.073) eta 6:40:57 loss_t 0.6596 (1.0133) loss_x 9.6811 (9.6777) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][27/482] time 0.986 (1.240) data 0.000 (0.070) eta 6:37:48 loss_t 0.6593 (1.0002) loss_x 9.6837 (9.6779) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][28/482] time 0.980 (1.230) data 0.000 (0.068) eta 6:34:48 loss_t 0.3822 (0.9781) loss_x 9.6881 (9.6783) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][29/482] time 0.977 (1.222) data 0.000 (0.065) eta 6:31:59 loss_t 0.8385 (0.9733) loss_x 9.6898 (9.6787) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][30/482] time 0.983 (1.214) data 0.000 (0.063) eta 6:29:24 loss_t 0.4484 (0.9558) loss_x 9.6916 (9.6791) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][31/482] time 0.980 (1.206) data 0.000 (0.061) eta 6:26:58 loss_t 0.5138 (0.9415) loss_x 9.7219 (9.6805) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][32/482] time 0.996 (1.200) data 0.000 (0.059) eta 6:24:50 loss_t 0.5446 (0.9291) loss_x 9.7266 (9.6820) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][33/482] time 0.989 (1.193) data 0.000 (0.057) eta 6:22:46 loss_t 0.5246 (0.9169) loss_x 9.7235 (9.6832) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][34/482] time 0.985 (1.187) data 0.000 (0.056) eta 6:20:47 loss_t 1.7380 (0.9410) loss_x 9.7431 (9.6850) acc 0.0000 (0.0000) lr 0.000300
epoch: [1/40][35/482] time 0.975 (1.181) data 0.000 (0.054) eta 6:18:49 loss_t 1.1075 (0.9458) loss_x 9.7318 (9.6863) acc 0.0000 (0.0000) lr 0.000300
Hi Mike, I see you trained for only a few steps; the accuracy will start increasing much later in the training process. You could try setting the soccernetv3.training_subset config to a smaller value (0.01, for example) to get a much shorter training time and see what happens.
Ah yeah, got it. Thanks. I was assuming it was starting from pretrained Market weights and was surprised to see 0. When training from scratch, it makes sense that it would start at 0.
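For reference, the zeros at the start are what chance level looks like with this many classes. A rough check, assuming `loss_x` is a softmax cross-entropy over the 15443 training identities listed in the summary and `acc` is the percentage of correct top-1 predictions in a batch of 128 (both are assumptions about the baseline, not something stated in this thread):

```python
import math

num_ids = 15443      # "# source ids" from the dataset summary in the log
batch_size = 128     # train.batch_size from the printed configuration

# Cross-entropy of a classifier that gives every identity the same probability.
uniform_loss = math.log(num_ids)

# Chance-level top-1 accuracy (in percent) and expected correct predictions per batch.
chance_acc = 100.0 / num_ids
expected_correct = batch_size / num_ids

print(f"uniform cross-entropy: {uniform_loss:.4f}")
print(f"chance accuracy: {chance_acc:.4f}%")
print(f"expected correct per batch: {expected_correct:.4f}")
```

The uniform cross-entropy comes out at about 9.64, slightly below the `loss_x` of roughly 9.67 logged in the first iterations, and far fewer than one correct prediction per batch is expected, so `acc 0.0000` is consistent with an untrained classifier.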
@VlSomers follow up question :)
On a small subset of 1% I got a high accuracy. However, when I train on 50% of the dataset I still get pretty low accuracy while the mAP score is high. Do you know how this accuracy is calculated? It seems strange that we'd have a decent mAP score with such low accuracy.
Below are the last few lines of the training log.
epoch: [50/50][1276/1286] time 0.972 (0.986) data 0.000 (0.018) eta 0:00:09 loss_t 0.0531 (0.0428) loss_x 11.2660 (11.2688) acc 0.0000 (0.1678) lr 0.000003
epoch: [50/50][1277/1286] time 0.977 (0.986) data 0.000 (0.018) eta 0:00:08 loss_t 0.0805 (0.0428) loss_x 11.2628 (11.2688) acc 3.1250 (0.1701) lr 0.000003
epoch: [50/50][1278/1286] time 0.970 (0.986) data 0.000 (0.018) eta 0:00:07 loss_t 0.0501 (0.0428) loss_x 11.2664 (11.2688) acc 0.7812 (0.1706) lr 0.000003
epoch: [50/50][1279/1286] time 0.964 (0.986) data 0.006 (0.018) eta 0:00:06 loss_t 0.0356 (0.0428) loss_x 11.2624 (11.2688) acc 2.3438 (0.1723) lr 0.000003
epoch: [50/50][1280/1286] time 0.965 (0.986) data 0.000 (0.018) eta 0:00:05 loss_t 0.0842 (0.0429) loss_x 11.2630 (11.2688) acc 2.3438 (0.1740) lr 0.000003
epoch: [50/50][1281/1286] time 0.964 (0.986) data 0.000 (0.018) eta 0:00:04 loss_t 0.0539 (0.0429) loss_x 11.2661 (11.2688) acc 0.0000 (0.1738) lr 0.000003
epoch: [50/50][1282/1286] time 0.962 (0.986) data 0.000 (0.018) eta 0:00:03 loss_t 0.0410 (0.0429) loss_x 11.2639 (11.2688) acc 0.0000 (0.1737) lr 0.000003
epoch: [50/50][1283/1286] time 0.963 (0.986) data 0.000 (0.018) eta 0:00:02 loss_t 0.0328 (0.0429) loss_x 11.2647 (11.2688) acc 0.7812 (0.1742) lr 0.000003
epoch: [50/50][1284/1286] time 0.963 (0.986) data 0.000 (0.018) eta 0:00:01 loss_t 0.0755 (0.0429) loss_x 11.2599 (11.2688) acc 3.1250 (0.1765) lr 0.000003
epoch: [50/50][1285/1286] time 0.965 (0.986) data 0.000 (0.018) eta 0:00:00 loss_t 0.0721 (0.0429) loss_x 11.2589 (11.2688) acc 0.7812 (0.1769) lr 0.000003
epoch: [50/50][1286/1286] time 0.963 (0.986) data 0.000 (0.018) eta 0:00:00 loss_t 0.0410 (0.0429) loss_x 11.2601 (11.2688) acc 0.0000 (0.1768) lr 0.000003
=> Final test
##### Evaluating soccernetv3 (source) #####
Extracting features from query set ...
Done, obtained 11638-by-512 matrix
Extracting features from gallery set ...
Done, obtained 34355-by-512 matrix
Speed: 0.0166 sec/batch
Computing distance matrix with metric=euclidean ...
Exporting ranking results to 'log/ranking_results_soccernetv3_2022-05-19_14_22_54_648.json' for external evaluation...
Computing CMC and mAP ...
** Results **
mAP: 63.6%
CMC curve
Rank-1 : 52.4%
Hi Mazatov, it does seem strange. As a comparison, when using a soccernetv3.training_subset of 0.02, I start getting 100% accuracy after about 20 epochs. The only explanation I see is the "infeasibility" of the classification task when the training dataset becomes too big: if you train on 50% of the dataset, you end up with so many training identities (and multiple identities for the same player, as explained in the README and the video tutorial) that classifying a sample into its unique correct identity becomes infeasible. However, the network still learns something (the triplet loss plays an important role in that), and you still get good final ranking performance.
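One way to sanity-check this explanation against the log above, as a sketch rather than a confirmed diagnosis: a softmax classifier that spreads its probability evenly over C identities has cross-entropy ln(C), with or without label smoothing. Inverting the averaged `loss_x` of 11.2688 from the final epoch gives the number of classes a uniform classifier would need to produce that loss. The 50% identity count below is a hypothetical linear extrapolation from the 15443 identities reported for `training_subset: 0.1`; the real number is not given in this thread.

```python
import math

final_loss_x = 11.2688      # running average of loss_x in the last logged epoch
ids_at_10_percent = 15443   # "# source ids" with training_subset: 0.1

# Number of classes for which a uniform prediction yields this cross-entropy.
implied_classes = math.exp(final_loss_x)

# Hypothetical estimate: assume the identity count scales linearly with subset size.
estimated_ids_at_50_percent = ids_at_10_percent * 5

print(f"classes implied by loss_x: {implied_classes:.0f}")
print(f"extrapolated identities at 50%: {estimated_ids_at_50_percent}")
```

If the two numbers land in the same range, the classification head is barely better than uniform guessing, which would fit a low `acc` alongside a good mAP driven by the triplet loss.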
Hi @VlSomers, yeah, I get similar results on 0.02. That's an interesting thought about the large number of identities. Overall the trained model still performs well on the test dataset, so the triplet loss is working. I was assuming the accuracy just measures whether it's separating classes well within the batch, or within the triplet, so that it never actually compares against all the identities. Do you know how the accuracy is calculated here?
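A minimal sketch of how such a number is commonly computed, assuming `acc` is the top-1 accuracy of the identity classifier: each sample's logits over all training identities are compared with its label, and the percentage of hits in the batch is reported. This is an assumption about the baseline rather than a quote from its code, and `top1_accuracy` is a hypothetical helper. It would be consistent with the logged values (0.7812, 2.3438, 3.1250) being multiples of 1/128 for a batch size of 128.

```python
def top1_accuracy(logits, labels):
    """Percentage of samples whose highest-scoring class equals the label.

    logits: list of per-sample score lists, one score per training identity.
    labels: list of integer identity labels, one per sample.
    """
    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the class with the highest score (argmax over all identities).
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)


# Toy batch: 4 samples, 3 identities. Predictions are 1, 0, 2, 0.
toy_logits = [
    [0.1, 0.9, 0.0],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
toy_labels = [1, 0, 1, 2]
print(top1_accuracy(toy_logits, toy_labels))

# With a batch of 128, four correct samples give 100 * 4 / 128 percent.
print(100.0 * 4 / 128)
```

Under this assumption every sample is scored against all training identities, not only those present in the batch or the triplet, which is why the metric gets harder as the number of identities grows.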