RQuispeC / pytorch-ACSCP

Unofficial implementation of "Crowd Counting via Adversarial Cross-Scale Consistency Pursuit" with pytorch - CVPR 2018

Result of shanghai_partA

EillotY opened this issue · comments

Here is my result on shanghai_partA with people_thr: 0; the other settings are shown below:
image
The result is worse than the previous one with people_thr: 20, which had an MAE of 91. I suspect that either the original code has some problems or the author published a wrong MAE.
Thanks.

Thanks for sharing your results. Can you try training for more epochs? It would be interesting to see how it behaves.
I remember that the MAE and MSE metrics used to vary a lot during training (that's why your best epoch is 283 after training for 500 epochs). This may be because the network is trained to minimize the difference between density maps rather than the people count directly (MAE and MSE are based on people count). Another thing that may have an influence is that my implementation uses a FIXED KERNEL for ground-truth generation; this may have further effects, as people in Shanghai-Tech appear at larger scales than in UCF-CC-50.
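
To illustrate the point about the training loss versus the count metrics, here is a minimal sketch (not code from this repo). The helper names `pixel_mse` and `count_errors` are hypothetical, and it assumes that "MSE" is reported as the root of the mean squared count error, as is common in crowd-counting papers. It shows that a prediction with a lower pixel-wise loss can still have a larger count error.

```python
import numpy as np

def pixel_mse(pred_map, gt_map):
    """Pixel-wise mean squared error between two density maps (a typical training loss)."""
    return float(((pred_map - gt_map) ** 2).mean())

def count_errors(pred_maps, gt_maps):
    """Return (MAE, MSE) over people counts; each count is the sum of a density map.

    Assumption: "MSE" is the root of the mean squared count error.
    """
    pred_counts = np.array([m.sum() for m in pred_maps])
    gt_counts = np.array([m.sum() for m in gt_maps])
    diff = pred_counts - gt_counts
    mae = float(np.abs(diff).mean())
    mse = float(np.sqrt((diff ** 2).mean()))
    return mae, mse

# Toy ground truth: one person in a 4x4 map.
gt = np.zeros((4, 4))
gt[0, 0] = 1.0

# Prediction A: right count, but the mass is shifted by one pixel.
pred_a = np.zeros((4, 4))
pred_a[0, 1] = 1.0

# Prediction B: right location, plus a small uniform offset on every pixel.
pred_b = gt + 0.25

print(pixel_mse(pred_a, gt), count_errors([pred_a], [gt]))  # pixel loss 0.125, count error 0
print(pixel_mse(pred_b, gt), count_errors([pred_b], [gt]))  # pixel loss 0.0625, count error 4
```

Prediction B would be preferred by the pixel-wise loss, yet it is off by four people, which is one way the count metrics can fluctuate while the loss keeps decreasing.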

OK, I will consider these problems. Thanks.

Hi, I find that increasing the number of epochs does not give a better result. As the picture below shows, the best epoch is 181. I extended the training to 1000 epochs and it is currently at epoch 832, so I don't think I will get a better result in the remaining epochs. Your suggestion is to change the FIXED KERNEL used for ground-truth generation, but I cannot figure out why. Could you please explain the reason? Thanks a lot.
WechatIMG153

If you check https://github.com/RQuispeC/pytorch-ACSCP/blob/master/manage_data/get_density_map.py you can find the line H = gauss_ker(SIGMA, [kernel_size_y, kernel_size_x]). By default I set kernel_size_y = kernel_size_x = 15, which basically means that the ground truth uses Gaussians of size [15x15]. Shanghai-Tech's images have big heads (probably larger than [15x15]), so creating the ground truth with only [15x15] kernels introduces noise into model learning, mainly because parts of the head are not labeled as positive areas.
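
As a rough illustration of fixed-kernel ground-truth generation, here is a minimal NumPy sketch. It is not the repo's `get_density_map.py`: the function names `gaussian_kernel` and `fixed_kernel_density_map` are hypothetical, and `sigma=4.0` is an assumed value, since the repo's SIGMA is not quoted in this thread. Only the 15x15 kernel size comes from the explanation above.

```python
import numpy as np

def gaussian_kernel(size_y, size_x, sigma):
    """Normalized 2D Gaussian window of shape (size_y, size_x); sums to 1."""
    ys = np.arange(size_y) - (size_y - 1) / 2.0
    xs = np.arange(size_x) - (size_x - 1) / 2.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    kernel = np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def fixed_kernel_density_map(shape, points, kernel_size=15, sigma=4.0):
    """Stamp the same kernel at every annotated head position (x, y).

    kernel_size is assumed to be odd. Kernels that cross the image border
    are clipped, so a little mass is lost there.
    """
    h, w = shape
    density = np.zeros((h, w), dtype=np.float64)
    kernel = gaussian_kernel(kernel_size, kernel_size, sigma)
    half = kernel_size // 2
    for x, y in points:
        cx, cy = int(round(x)), int(round(y))
        if not (0 <= cx < w and 0 <= cy < h):
            continue  # ignore annotations outside the image
        # Region of the image covered by the kernel, clipped to the borders.
        y0, y1 = max(cy - half, 0), min(cy + half + 1, h)
        x0, x1 = max(cx - half, 0), min(cx + half + 1, w)
        # Matching region inside the kernel.
        ky0, kx0 = y0 - (cy - half), x0 - (cx - half)
        density[y0:y1, x0:x1] += kernel[ky0:ky0 + (y1 - y0), kx0:kx0 + (x1 - x0)]
    return density

density = fixed_kernel_density_map((64, 64), [(20, 20), (40, 30)])
print(density.sum())    # close to 2.0: each interior head contributes 1
print(density[28, 20])  # 0.0: eight pixels below the first head, nothing is labeled
```

The last line shows the issue described above: any part of a head further than 7 pixels from its annotation gets zero density, regardless of how large the head actually is.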

Hope this was clear

ok, thanks a lot.

I remember I have an implementation of ground-truth generation that uses a VARIABLE kernel size, based on KNN and on a method we proposed. You may want to check it out:

https://github.com/RQuispeC/multi-stream-crowd-counting-extended/blob/master/manage_data/get_density_map.py
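
For readers unfamiliar with KNN-based variable kernels, the sketch below shows the general idea and is not the code from the linked repository. It assumes the common geometry-adaptive rule in which each head's Gaussian width is proportional to the mean distance to its k nearest annotated neighbors. The names `knn_sigmas` and `adaptive_density_map`, and the values `k=3`, `beta=0.3`, `min_sigma=1.0` and `default_sigma=4.0`, are illustrative assumptions.

```python
import numpy as np

def knn_sigmas(points, k=3, beta=0.3, min_sigma=1.0, default_sigma=4.0):
    """Per-head sigma = beta * mean distance to the k nearest other heads."""
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    n = len(pts)
    if n <= 1:
        # No neighbors to measure against: fall back to a fixed width.
        return np.full(n, default_sigma)
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    dists.sort(axis=1)  # column 0 is each point's distance to itself (0)
    k_eff = min(k, n - 1)
    mean_knn = dists[:, 1:k_eff + 1].mean(axis=1)
    return np.maximum(beta * mean_knn, min_sigma)

def adaptive_density_map(shape, points, k=3, beta=0.3):
    """Sum of per-head Gaussians whose width depends on local crowd density.

    Each Gaussian is evaluated over the whole image and normalized to sum
    to 1, which is simple but slow for large images or many heads.
    """
    h, w = shape
    density = np.zeros((h, w), dtype=np.float64)
    yy, xx = np.mgrid[0:h, 0:w]
    for (x, y), sigma in zip(points, knn_sigmas(points, k, beta)):
        g = np.exp(-((xx - x) ** 2 + (yy - y) ** 2) / (2.0 * sigma ** 2))
        total = g.sum()
        if total > 0:
            density += g / total
    return density

# Four tightly packed heads and one isolated head.
points = [(10, 10), (12, 10), (10, 12), (12, 12), (50, 50)]
sigmas = knn_sigmas(points)
print(sigmas)  # the isolated head gets a much larger sigma than the packed ones
print(adaptive_density_map((64, 64), points).sum())  # close to 5.0
```

Under this rule, sparse regions (where heads tend to be large) get wider kernels and dense regions get narrower ones, which is the motivation for trying it on datasets with large heads.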

I noticed get_density_map.py, but I see that the default for gt_mode is 'same', so I want to figure out whether the 'same' mode is unsuitable for Shanghai_part_A.
Oh, I see, the code has been updated; the earlier code I have uses the stable mode, 'same'. So for shanghai_partA I should use the 'knn' mode to create the ground truth? I also noticed another mode, 'face', which I have not figured out yet. Another idea: could I use the 'knn' mode on ucf_cc_50? Although the default size is [15,15], there may be a more suitable size for creating the density maps, on the view that the more accurate the density map, the more accurate the result. Is it worth trying?

Thanks very much!

The code I sent you is from another work we published (https://arxiv.org/ftp/arxiv/papers/2002/2002.09951.pdf). We proposed a new method to create ground truths and compared it to 'same' and 'knn'. ON AVERAGE 'face' is better, but results may not be stable and can change between models; we didn't test with ACSCP.
You can check the full code at https://github.com/RQuispeC/multi-stream-crowd-counting-extended , it is really similar to this repo.