ajabri / videowalk

Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Home Page: http://ajabri.github.io/videowalk

evaluation on the VIP and JHMDB datasets

AndyTang15 opened this issue

Hi Allan,

Happy new year! And many thanks for releasing the code of this great work!

I used the codebase and the pretrained model provided in the repo to evaluate on the VIP and JHMDB datasets. The results are:
VIP: 37.12 (mIoU), JHMDB: 57.62 (PCK@0.1) and 79.59 (PCK@0.2).

They are noticeably lower than the results in your paper:
VIP: 38.6 (mIoU), JHMDB: 59.3 (PCK@0.1) and 84.9 (PCK@0.2).

Could you please help me check whether I ran the evaluation correctly?
For VIP, I used the command:
python test.py --filelist eval/VIP_vallist.txt --model-type scratch --resume ../pretrained.pth --save-path vip_results --topk 10 --videoLen 4 --radius 12 --temperature 0.05 --cropSize 560

For JHMDB, I used the command:
python test.py --filelist eval/jhmdb_vallist.txt --model-type scratch --resume ../pretrained.pth --save-path jhmdb_results --topk 10 --videoLen 7 --radius 12 --temperature 0.05 --cropSize 320

The hyperparameters above were taken from your paper, except for the temperature (I also tried 0.07 but found that 0.05 works better).

BTW, there are two bugs in the JHMDB evaluation:

  1. https://github.com/ajabri/videowalk/blob/master/code/data/jhmdb.py#L231
    "sio" is used here but is never imported in this Python file

  2. https://github.com/ajabri/videowalk/blob/master/code/test.py#L161
    this should be "test_utils" rather than "utils"

Hi @AndyTang15,

Thanks for your interest, and I apologize for the late reply.

I haven't re-run the JHMDB and VIP evaluations since refactoring and retraining models for the code release, so thanks for bringing this to my attention, and I will take a closer look!

One detail that will improve the JHMDB result is that the radius should be decreased commensurately, since the input is about 4x smaller (320x320 vs. 900x480). So, you might consider a radius of 5 instead of 12. I apologize for the confusion (and the typo in the appendix).

python test.py --filelist eval/jhmdb_vallist.txt --model-type scratch \
--resume ../pretrained.pth --save-path jhmdb_results \
--topk 10 --videoLen 7 --radius 5 --temperature 0.05 --cropSize 320
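
As a rough sanity check on the suggested radius, the arithmetic can be sketched as follows. This is only an illustration under one assumption: that the radius should shrink with the square root of the pixel-area ratio between the two input sizes. The thread does not state the exact rule, and `scale_radius` is a hypothetical helper, not part of the repository.

```python
import math


def scale_radius(radius, src_size, dst_size):
    """Rescale a spatial radius from one input size to another.

    Assumption (not from the thread): scale by the square root of the
    pixel-area ratio, so the radius covers a similar fraction of the image.
    Sizes are (width, height) tuples.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    factor = math.sqrt((dst_w * dst_h) / (src_w * src_h))
    return radius * factor


# Input sizes quoted in the thread: 900x480 vs. 320x320.
area_ratio = (900 * 480) / (320 * 320)
scaled = scale_radius(12, (900, 480), (320, 320))
print(f"area ratio: {area_ratio:.2f}, scaled radius: {scaled:.2f}")
# prints "area ratio: 4.22, scaled radius: 5.84"
```

Under this heuristic a radius of 12 maps to a little under 6, which is in the same range as the suggested value of 5.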

@ajabri Hi Allan,
Many thanks for your reply and help. I've tried radius=5 following your command, as well as radius=3. The results on JHMDB are:

radius=5: PCK@0.1 58.64, PCK@0.2 80.54
radius=3: PCK@0.1 58.84, PCK@0.2 80.23

Both settings improve the results, but they are still lower than the numbers reported in your paper (from before the refactoring), especially for PCK@0.2. Would it be possible for you to help me with this again? Many thanks!
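
For reference, the remaining gaps to the paper can be tabulated with a short script. This is just a sketch: the numbers are copied from this thread, and the names `paper`, `reproduced`, and `gaps` are illustrative.

```python
# JHMDB numbers reported in the paper (as quoted in this thread).
paper = {"PCK@0.1": 59.3, "PCK@0.2": 84.9}

# JHMDB numbers reproduced with the released code, per radius setting.
reproduced = {
    "radius=12": {"PCK@0.1": 57.62, "PCK@0.2": 79.59},
    "radius=5": {"PCK@0.1": 58.64, "PCK@0.2": 80.54},
    "radius=3": {"PCK@0.1": 58.84, "PCK@0.2": 80.23},
}


def gaps(paper_scores, run_scores):
    """Return paper minus reproduced for each metric, rounded to 2 decimals."""
    return {m: round(paper_scores[m] - run_scores[m], 2) for m in paper_scores}


for name, run in reproduced.items():
    print(name, gaps(paper, run))
```

With radius=5 the PCK@0.1 gap is under one point, while the PCK@0.2 gap stays above four points.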

Hi @ajabri and @AndyTang15,
I just wanted to check whether you were able to reconcile the performance with the results in the paper. I ran into the same issue with JHMDB and was not able to reproduce the paper's results with various radius settings.

Thanks!

Hi @dmckee5,

I have not yet reconciled this issue (the lower PCK@0.2 with this repository). If you are reporting or comparing to our results, at this point, please go ahead and report the result you've reproduced. I am hoping to get to this soon.

Hi @ajabri @AndyTang15 ,
May I ask where you downloaded the VIP dataset? The official link in the original paper has expired. Is there a copy on a cloud drive?

How did you get VIP_vallist.txt and jhmdb_vallist.txt?

Hi @AndyTang15, I used the same commands as you, but my results are much worse than yours. Did you make any other modifications to the code? Also, which filelist are you using? The one I used is from the original UVC repo and contains 268 lines. Any help would be appreciated. Thanks!

Hi,
Where can I find your VIP_vallist.txt? Also, did you use the VIP_Fine from this repo?
Thanks