kkhoot / PAA

A PyTorch implementation of the ECCV 2020 paper `Probabilistic Anchor Assignment with IoU Prediction for Object Detection` (https://arxiv.org/abs/2007.08103)

Performance of PAA under 1x without Score Voting.

Joker316701882 opened this issue

@kkhoot
Hi, thank you for the great work!

However, I'm curious about the performance of the proposed PAA under the 1x setting without Score Voting. In my opinion, Score Voting is a common trick that can be applied just as easily to ATSS/FCOS. Also, compared to 1.5x, the 1x schedule is more frequently reported in other papers. So could you release the performance (together with a checkpoint would be even better!) of PAA under the 1x setting without Score Voting, so that other followers can clearly see how much gain PAA achieves over ATSS?

Thank you again.

Hi, thanks for your suggestion!
Currently I have no access to enough GPU resources (I left the company where I did this work), so I will ask my former colleague to run the experiment. Regarding the checkpoint, the company is rather strict about releasing any code or checkpoints externally, so it may take some more time (this is actually the reason why none of the checkpoints are available here for now). But I will try to provide more checkpoints.
The score voting in our paper combines the Gaussian-like score and the anchor score, so it is not directly applicable to ATSS. ATSS might be able to use the centerness score instead of the IoU prediction score, but I am not sure about the performance. As noted in our paper, you can also do the voting using only the Gaussian score.
In my experience with score voting, it usually gives 0.1~0.3 AP improvements, so I think it is safe to assume that the performance of PAA without score voting is roughly that much lower in general.
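
For anyone following the thread, here is a minimal sketch of what that voting step looks like, based on a reading of the paper rather than this repo's code; `sigma_t` and `iou_thresh` are illustrative values, not the defaults used here:

```python
import torch
from torchvision.ops import box_iou

def score_voting(kept_boxes, all_boxes, all_scores,
                 sigma_t=0.025, iou_thresh=0.01):
    """Refine each NMS-kept box as a weighted average of nearby
    predictions. The weight combines a Gaussian of (1 - IoU) with the
    neighbor's own score, which is why a plain centerness score is not
    a drop-in replacement for the IoU prediction score."""
    ious = box_iou(kept_boxes, all_boxes)              # (K, N)
    voted = kept_boxes.clone()
    for k in range(kept_boxes.size(0)):
        nearby = ious[k] > iou_thresh                  # boxes allowed to vote
        if not nearby.any():
            continue
        gauss = torch.exp(-((1.0 - ious[k, nearby]) ** 2) / sigma_t)
        w = gauss * all_scores[nearby]                 # Gaussian-like score * anchor score
        voted[k] = (w[:, None] * all_boxes[nearby]).sum(dim=0) / w.sum()
    return voted
```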

I also would like to mention that the 1x setting does not give fully converged training results on COCO. So 1x results can be noisier compared to the 1.5x or 2x settings.

@kkhoot Thank you for your kind reply. I just successfully compiled this project and am ready to train.

I also would like to mention that the 1x setting does not give fully converged training results on COCO. So 1x results can be noisier compared to the 1.5x or 2x settings.

This is a valuable point, and I can now fully understand why 1.5x is used in PAA.

Just one more note: I think the IoU prediction layer proposed in this paper is similar to the soft-IoU layer proposed in https://arxiv.org/pdf/1904.00853.pdf. Although there are some minor differences, the major part, "predicting the IoU between the pred_bbox and its corresponding GT", is the same, I guess. What do you think?

@Joker316701882 Thanks for the reference. I had not noticed this work. It seems that their soft-IoU layer is similar to ours. However, this kind of IoU prediction was proposed earlier in the YOLO papers, as noted in the related work section of our paper. So predicting the IoU between a predicted box and its GT box is not new. But bringing it back, replacing the centerness prediction, and relating it to matching the objectives of the training and testing procedures is what we consider part of our contribution :)
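
To make the shared idea concrete, this is roughly how such an IoU prediction branch gets its training targets; it is a sketch of the common pattern, not code from either paper's implementation:

```python
import torch
from torchvision.ops import box_iou

def iou_prediction_targets(decoded_pred_boxes, matched_gt_boxes):
    """Training target for an IoU prediction head: the actual IoU
    between each decoded predicted box and the GT box it was assigned
    to (both xyxy, aligned row-by-row). The head regresses this value,
    e.g. with a BCE loss, in place of a centerness target."""
    # box_iou gives the full (N, N) matrix; the diagonal holds each
    # prediction's IoU with its own matched GT.
    return box_iou(decoded_pred_boxes, matched_gt_boxes).diagonal()
```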

@kkhoot Thank you for your kind reply again.

@kkhoot
Hello. I just finished training an R50 1x. However, I only got 31.4 mAP with the default training setting you provided. I uploaded the log.txt to this URL: https://github.com/Joker316701882/DL-Paper-Recommendation/blob/master/log.txt
The yaml I used is configs/paa/paa_R_50_FPN_1x.yaml. The command is exactly the same as in the README. I didn't modify any other part of the code. Any idea?

Hi, did you modify any of the code? It seems that you are using torch 1.5, which maskrcnn-benchmark does not support. The code has been confirmed to work well with torch <= 1.4. If you share your code, I can take a look.
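
In case it helps anyone debugging a silent version mismatch, a generic sanity check one can run before compiling (not something in this repo):

```python
import torch

# maskrcnn-benchmark-era code is only confirmed on torch <= 1.4; fail
# fast before building the CUDA extensions against a newer release.
major, minor = (int(v) for v in torch.__version__.split(".")[:2])
assert (major, minor) <= (1, 4), f"torch {torch.__version__} is untested here"
print("torch", torch.__version__, "| cuda", torch.version.cuda)
```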

@kkhoot
I did not modify any code. Just clone, compile, and then run. I'll try torch 1.4 and report the performance here : )

The code should give errors when it compiles its CUDA modules with torch 1.5. If you didn't see any errors with torch 1.5, I am not sure how things went.

Hi, as far as I know, PyTorch 1.5 is not stable enough for most codebases. I tried to reproduce the results with PyTorch 1.3; I conducted the experiments several times and got ~40.4 for 1x and ~41 for 1.5x.

As for score voting, it only boosts ~0.1 mAP, which is consistent with the paper's ablation study, and I believe it is not the major contribution of the paper.

@jshilong Thank you for your reply. However, the code I ran with torch 1.4 only achieved 31.0 mAP. I guess there is something wrong with my pretrained model (my own MSRA/R-50.pkl); maybe there is a minor difference between mine and the required R-50 (normalization parameters or something else). I'll check and report the results here. By the way, to my knowledge, it is quite hard to achieve 40.7 mAP (as reported in the README for R50 1x) with R-50 1x by only changing the label assignment strategy. So I hope the author can train a stable R-50 1x with Score Voting (let's say a 0.1~0.2 mAP improvement) and update the latest performance, as well as the corresponding checkpoint, in this repo.
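
In case it is useful for that check, here is a rough way to peek at what a Detectron-style R-50 pickle contains; the exact layout (a `blobs` dict vs. a flat dict) varies with how the file was exported, so treat this as a sketch:

```python
import pickle

# Caffe2/Detectron-style pickles are typically read with latin1 encoding.
with open("MSRA/R-50.pkl", "rb") as f:
    ckpt = pickle.load(f, encoding="latin1")

# Some exports nest the weights under a "blobs" key, others are flat.
blobs = ckpt.get("blobs", ckpt) if isinstance(ckpt, dict) else ckpt
for name in sorted(blobs):
    # Stem conv and BN-related blobs are where normalization-parameter
    # differences between two R-50 files would show up.
    if "conv1" in name or "bn" in name.lower():
        print(name, getattr(blobs[name], "shape", type(blobs[name])))
```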

@Joker316701882 @jshilong Thanks for sharing your training results here. If it is hard to get near 40.7 with the R50 1x setting, then it is better to change the performance numbers in the README. I ran the R50 1x setting before I left the company to provide a reference number here; maybe I was lucky to get that number in a single try :) My current company is now buying GPU servers, so I think I can run the experiment in a few weeks. I will share the result and checkpoint as soon as it is done.

@kkhoot @jshilong Hi, I've run R50 1x three times this weekend. Here are the results without and with Score Voting:

Exp1. 40.5 → 40.7
Exp2. 40.3 → 40.5
Exp3. 40.4 → 40.5

Together with the comments of @jshilong, I believe the stable performance of PAA 1x without score voting is in the [40.3, 40.4] range.

Best.

@Joker316701882 Thanks for sharing your results!