ajabri / videowalk

Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Home Page: http://ajabri.github.io/videowalk

Reproducing with pretrained.pth

xvjiarui opened this issue

Hi @ajabri

Thanks for sharing the code and model.

However, I am having trouble reproducing your results with the provided pretrained.pth.
It only yields J&F-Mean 0.407953.

Could you please check on that?

Thx!

Hi, could you please share the test command you ran? I'll double check on the model.

My DAVIS path is data/davis.

My commands are as follows.

python test.py --filelist eval/davis_vallist.txt \
--model-type scratch --resume pretrained.pth --save-path results \
--topk 10 --videoLen 20 --radius 12  --temperature 0.05  --cropSize -1
python eval/convert_davis.py --in_folder results/ --out_folder converted_results/ --dataset data/davis/

(This differs from README.md, since the README has a typo.)

Can you please provide the output of the first command?

You may download it here.

Sorry, I meant the stdout output.

The last few lines look like this:

computing onehot lbl for //data1/home/v-jiarxu/data/davis/DAVIS/Annotations/480p/soapbox/00096_onehot.npy
computing resized lbl for //data1/home/v-jiarxu/data/davis/DAVIS/Annotations/480p/soapbox/00096_size60x107.npy
computing onehot lbl for //data1/home/v-jiarxu/data/davis/DAVIS/Annotations/480p/soapbox/00097_onehot.npy
computing resized lbl for //data1/home/v-jiarxu/data/davis/DAVIS/Annotations/480p/soapbox/00097_size60x107.npy
computing onehot lbl for //data1/home/v-jiarxu/data/davis/DAVIS/Annotations/480p/soapbox/00098_onehot.npy
computing resized lbl for //data1/home/v-jiarxu/data/davis/DAVIS/Annotations/480p/soapbox/00098_size60x107.npy
116.00571203231812 affinity forward, max mem 1191.77197265625
******* Vid 21 TOOK 124.35178780555725 *******
******* Vid 22 (99 frames) *******
computed features 1.167785882949829
computing affinity
192.5184383392334 affinity forward, max mem 1191.77197265625
******* Vid 22 TOOK 203.12105226516724 *******
******* Vid 23 (60 frames) *******
computed features 1.0680644512176514
computing affinity
95.61341738700867 affinity forward, max mem 1191.77197265625
******* Vid 23 TOOK 101.08067631721497 *******
******* Vid 24 (100 frames) *******
computed features 1.9253287315368652
computing affinity
186.5583779811859 affinity forward, max mem 1191.77197265625
******* Vid 24 TOOK 198.6735315322876 *******
******* Vid 25 (120 frames) *******
computed features 6.317120790481567
computing affinity
229.72672510147095 affinity forward, max mem 1191.77197265625
******* Vid 25 TOOK 246.40491771697998 *******
******* Vid 26 (99 frames) *******
computed features 3.164461612701416
computing affinity
185.4557180404663 affinity forward, max mem 1191.77197265625
******* Vid 26 TOOK 201.31444144248962 *******
******* Vid 27 (63 frames) *******
computed features 0.6925504207611084
computing affinity
100.31386852264404 affinity forward, max mem 1191.77197265625
******* Vid 27 TOOK 106.34284019470215 *******
******* Vid 28 (60 frames) *******
computed features 0.8725974559783936
computing affinity
171.42516231536865 affinity forward, max mem 1232.0078125
******* Vid 28 TOOK 180.76536655426025 *******
******* Vid 29 (119 frames) *******
computed features 1.3712513446807861
computing affinity
221.22214937210083 affinity forward, max mem 1232.0078125
******* Vid 29 TOOK 233.60585975646973 *******

Could you provide the first 100 lines, actually?

Using GPU 0
Context Length: 20 Image Size: -1
Arguments Namespace(batchSize=1, cropSize=-1, device='cuda', filelist='eval/davis_vallist.txt', finetune=0, gpu_id='0', head_depth=-1, imgSize=-1, long_mem=[0], manualSeed=777, model_type='scratch', no_l2=False, norm_mask=False, pca_vis=False, radius=12.0, remove_layers=['layer4'], resume='pretrained.pth', round=False, save_path='results', temperature=0.05, texture=False, topk=10, videoLen=20, visdom=False, visdom_server='localhost', workers=4)
stride Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
stride Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
stride Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
stride Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
stride Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
stride Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
stride Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
stride Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
stride Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
stride Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
padding Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False, padding_mode=reflect)
padding Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
padding Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
stride Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
stride Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
stride Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
stride Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
stride Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
stride Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
stride Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
stride Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
stride Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
stride Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False, padding_mode=reflect)
Total params: 2.78M
******* Vid 0 (89 frames) *******

It looks like the model was not loaded. If --resume is given a path that does not exist, it is silently ignored and no checkpoint is loaded, so the evaluation runs on the randomly initialized weights. Maybe you need --resume ../pretrained.pth?
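
To illustrate that failure mode, here is a minimal sketch (hypothetical code, not the repo's actual loading logic) of how a stricter resume step could fail loudly instead of silently keeping the random initialization. The helper name load_pretrained and the checkpoint-key handling are assumptions.

import os

import torch


def load_pretrained(model, resume_path, device='cuda'):
    # Fail loudly instead of silently falling back to random weights when the
    # checkpoint path is wrong (e.g. pretrained.pth vs ../pretrained.pth).
    if not os.path.isfile(resume_path):
        raise FileNotFoundError(f'--resume checkpoint not found: {resume_path}')
    checkpoint = torch.load(resume_path, map_location=device)
    # Some checkpoints wrap the weights under a 'model' or 'state_dict' key.
    if isinstance(checkpoint, dict):
        checkpoint = checkpoint.get('model', checkpoint.get('state_dict', checkpoint))
    model.load_state_dict(checkpoint, strict=False)
    return model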

Got you. Thx!

@ajabri So is J&F-Mean 0.407953 the result with a randomly initialized model? (a useful baseline as well)
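
(A quick way to check locally, using a hypothetical helper rather than anything in the repo, is to compare the weights before and after loading the checkpoint; if nothing changes, the evaluation ran on the random initialization.)

import copy

import torch


def checkpoint_changes_weights(model, resume_path):
    # Returns True if loading resume_path changes at least one tensor in the
    # state dict, i.e. the model is no longer at its random initialization.
    before = copy.deepcopy(model.state_dict())
    state = torch.load(resume_path, map_location='cpu')
    if isinstance(state, dict) and 'model' in state:
        state = state['model']
    model.load_state_dict(state, strict=False)
    after = model.state_dict()
    return any(not torch.equal(before[k], after[k]) for k in before)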

I also hit this problem. If the python test.py ... command in the README were adjusted to use ../pretrained.pth, there would be slightly less confusion.

Below are the full results with this randomly initialized model:

--------------------------- Global results for val ---------------------------
 J&F-Mean   J-Mean  J-Recall  J-Decay   F-Mean  F-Recall  F-Decay
 0.407962 0.395348  0.415584 0.344236 0.420577  0.411856 0.430973

---------- Per sequence results for val ----------
            Sequence   J-Mean   F-Mean
      bike-packing_1 0.286990 0.459648
      bike-packing_2 0.340866 0.466656
         blackswan_1 0.921515 0.957901
         bmx-trees_1 0.083489 0.308434
         bmx-trees_2 0.490173 0.629036
        breakdance_1 0.320305 0.342849
             camel_1 0.679262 0.655639
    car-roundabout_1 0.722727 0.491933
        car-shadow_1 0.748239 0.675853
              cows_1 0.811198 0.693349
       dance-twirl_1 0.209279 0.245509
               dog_1 0.509048 0.335719
         dogs-jump_1 0.173384 0.256133
         dogs-jump_2 0.159466 0.314179
         dogs-jump_3 0.642678 0.714439
     drift-chicane_1 0.086318 0.106261
    drift-straight_1 0.335125 0.306704
              goat_1 0.496513 0.399590
         gold-fish_1 0.559032 0.524672
         gold-fish_2 0.490060 0.481197
         gold-fish_3 0.696665 0.651392
         gold-fish_4 0.780571 0.810352
         gold-fish_5 0.814907 0.715756
    horsejump-high_1 0.355944 0.402695
    horsejump-high_2 0.452260 0.714426
             india_1 0.512683 0.423960
             india_2 0.459527 0.370960
             india_3 0.477645 0.405829
              judo_1 0.740983 0.744586
              judo_2 0.373953 0.471932
         kite-surf_1 0.260677 0.224472
         kite-surf_2 0.258653 0.337140
         kite-surf_3 0.565658 0.717266
          lab-coat_1 0.000000 0.000000
          lab-coat_2 0.000000 0.000000
          lab-coat_3 0.666406 0.565391
          lab-coat_4 0.645257 0.493571
          lab-coat_5 0.613418 0.545246
             libby_1 0.344237 0.434561
           loading_1 0.662173 0.427835
           loading_2 0.191996 0.314093
           loading_3 0.515426 0.491969
       mbike-trick_1 0.301323 0.474245
       mbike-trick_2 0.318674 0.417186
    motocross-jump_1 0.060510 0.102794
    motocross-jump_2 0.057746 0.091645
paragliding-launch_1 0.714169 0.771939
paragliding-launch_2 0.224110 0.500083
paragliding-launch_3 0.010038 0.060094
           parkour_1 0.078813 0.107582
              pigs_1 0.213912 0.180962
              pigs_2 0.375612 0.507823
              pigs_3 0.716662 0.522685
     scooter-black_1 0.046830 0.136108
     scooter-black_2 0.372132 0.479199
          shooting_1 0.065415 0.265393
          shooting_2 0.419170 0.403543
          shooting_3 0.268001 0.420049
           soapbox_1 0.194220 0.241350
           soapbox_2 0.101134 0.155023
           soapbox_3 0.123026 0.188374

Total time:91.4125120639801

@vadimkantorov Yes, I guess so! Indeed, a useful baseline :)

Sorry for the confusion, will clarify this in the README.